Category Archives: Computing

Detecting the Citadel Trojan with an Application Failure

An application that I do some development on was having trouble starting on a Windows XP machine.  In this application, we hook some of the Windows API functions internally to extend functionality in print preview (long story).  However, on this machine, the hook was failing.

The machine had fully up-to-date antivirus software from a major vendor; they’d run various anti-malware programs, but nothing was coming up.

No weird looking processes, DLLs or drivers were visible using Process Explorer; it looked like a pretty clean machine.  So after doing the usual diagnostics, we decided to take a full dump of the process when it threw up its error message.  A debug message told us that the hook function was failing to hook RegisterClassW, RegisterClassExW, RegisterClassA and RegisterClassExA.

I loaded the dump up in windbg and took a look at the functions.

0:000> u registerclassw
user32!RegisterClassW:
7e41a39a 6849d31300      push    13D349h
7e41a39f c3              ret
7e41a3a0 ec              in      al,dx
7e41a3a1 308b45085657    xor     byte ptr [ebx+57560845h],cl
7e41a3a7 6a09            push    9
7e41a3a9 59              pop     ecx
7e41a3aa 8d7004          lea     esi,[eax+4]
7e41a3ad 8b00            mov     eax,dword ptr [eax]

Whoa.  Push what?  That’s definitely not what we normally see!  Here’s what we’d expect to see:

USER32!RegisterClassW:
7e41a39a 8bff            mov     edi,edi
7e41a39c 55              push    ebp
7e41a39d 8bec            mov     ebp,esp
7e41a39f 83ec30          sub     esp,30h
7e41a3a2 8b4508          mov     eax,dword ptr [ebp+8]
7e41a3a5 56              push    esi
7e41a3a6 57              push    edi
7e41a3a7 6a09            push    9

So that push followed by ret is going to jump to address 13D349.  Let’s look to see which module owns that address space:

0:000> !address 13D349
Usage:
Allocation Base:        00000000
Base Address:           00130000
End Address:            00169000
Region Size:            00039000
Type:                   00000000
State:                  00000000
Protect:                00000000

So, not in the image space of a normally loaded module then.  So who is allocating that memory?  Let’s look for some strings longer than 3 characters in that block:

0:000> s -sa 00130000 00169000

This returned a stack of garbage data , but some strings stood out:

00131fb8  "Coded by BRIAN KREBS for persona"
00131fd8  "l use only. I love my job & wife"
00131ff8  "."

...

00136f08  "http://%02x%02x%02x%02x%02x%02x%"
00136f28  "02x%02x.com/%02x%02x%02x%02x/%02"
00136f48  "x%02x%02x%02x.php"

...

00138eac  "http://www.google.com/webhp"

...

00139424  "facebook.com"
00139438  "%BOTID%"
00139440  "%BOTNET%"

...

That first one was a dead giveaway.  Brian Krebs is a security researcher.  A Google search found that the Citadel Trojan embedded that string.

To help us diagnose similar situations more rapidly in the future, we captured a few other visible details on the file system and in the registry.  Images below for your entertainment!

There was not a lot of point going any further.  We understood why the problem was occurring, and how to resolve it (rebuild!).  Given that no antivirus vendors appear to support removal of this trojan, we advised that the client rebuild the computer as the safest and most cost-effective way forward.

Plotting Pretty Elevation Profiles with the Strava API

Update 13 July 2013: After the deprecation of the Strava API, this tool no longer works. However, an updated version of this segment gradient tool is now available here.

So this tool plots elevation profiles from Strava data. OK, so maybe the profiles aren’t amazingly pretty.  But I had fun making them look somewhat like the elevation profiles from a certain famous cycling event!

Lipscome + Nicholas climb

This little hacked-together piece of JavaScript will plot the elevation from a Strava segment (in metric units only, of course!) and uses the familiar green-blue-red-black styling to represent the severity of the gradient.

I wrote the code to fulfill a specific purpose: generating graphs for the Hobart 10,000 ride.  But I figured I’d make the code and tool available for anyone to use or fiddle with as they see fit.

This tool requires IE9, Firefox, Safari, Chrome or any other canvas-aware browser.  If you plug in bad data, you’ll get bad results.  So don’t.  All the parameters dynamically refresh the profile, except for the segment ID field, after which you’ll need to click Load Segment.

If you want to play with the source yourself, the only thing you need to do server-side is plug in the data from: http://www.strava.com/api/v1/stream/segments/segmentid

Go knock yourself out here: http://hobart10000.com/segment-gradient.php

Updated 15 Aug 2012: The tool now does isometric projection, which I think looks quite a lot nicer, and I’ve tidied up the user interface and added a few more controls.  As noted by Jonathan in the comments, it doesn’t do too well with downhill segments — the tool assumes it is an uphill segment at present. 

Indy, TIdURI.PathEncode, URLEncode and ParamsEncode and more

Frequently in Delphi we come across the need to encode a string to stuff into a URL query string parameter (as per web forms).  One would expect that Indy contains well-tested functions to handle this.  Well, Indy contains some functions to help with this, but they may not work quite as you expect.  In fact, they may not be much use at all.

Indy contains a component called TIdURI.  It contains, among other things, the member functions URLEncode, PathEncode, and ParamsEncode. At first glance, these seem to do what you would need.  But in fact, they don’t.

URLEncode will take a full URL, split it into path, document and query components, encode each of those, and return the full string.  PathEncode is intended to handle the nuances of the path and document components of the URL, and ParamsEncode handles query strings.

Sounds great, right?  Well, it works until you have a query parameter that has an ampersand (&) in it.  Say my beloved end user want to search for big&little.  It seems that you could pass the following in:

s := TIdURI.URLEncode('http://www.google.com/search?q='+SearchText);

But then we get no change in our result:

s = 'http://www.google.com/search?q=big&little';

And you can already see the problem: little is now a separate parameter in the query string.  How can we work around this?  Can we pre-encode ampersand to %26 before you pass in the parameters?

s := TIdURI.URLEncode('http://www.google.com/search?q='+ReplaceStr(SearchText, '&', '%26'));

No:

s = 'http://www.google.com/search?q=big%25%26little';

And obviously we can’t do it ourselves afterwards, because we too won’t know which ampersands are which.  You could do correction of ampersand by encoding each parameter component separately and then post-processing the component for ampersand and other characters before final assembly using ParamsEncode. But you’ll soon find that it’s not enough anyway.  =, / and ? are also not encoded, although they should be.  Finally, URLEncode does not support internationalized domain names (IDN).

Given that these functions are not a complete solution, it’s probably best to avoid them altogether.

The problem is analogous to the Javascript encodeURI vs encodeURIComponent issue.

So to write your own…  I haven’t found a good Delphi solution online (and I searched a bit), so here’s a function I’ve cobbled together (use at your own risk!) to encode parameter names and values. You do need to encode each component of the parameter string separately, of course.

function EncodeURIComponent(const ASrc: string): UTF8String;
const
  HexMap: UTF8String = '0123456789ABCDEF';

  function IsSafeChar(ch: Integer): Boolean;
  begin
    if (ch >= 48) and (ch <= 57) then Result := True // 0-9
    else if (ch >= 65) and (ch <= 90) then Result := True // A-Z
    else if (ch >= 97) and (ch <= 122) then Result := True // a-z
    else if (ch = 33) then Result := True // !
    else if (ch >= 39) and (ch <= 42) then Result := True // '()* 
    else if (ch >= 45) and (ch <= 46) then Result := True // -.
    else if (ch = 95) then Result := True // _
    else if (ch = 126) then Result := True // ~
    else Result := False;
  end;
var
  I, J: Integer;
  ASrcUTF8: UTF8String;
begin
  Result := '';    {Do not Localize}

  ASrcUTF8 := UTF8Encode(ASrc);
  // UTF8Encode call not strictly necessary but
  // prevents implicit conversion warning

  I := 1; J := 1;
  SetLength(Result, Length(ASrcUTF8) * 3); // space to %xx encode every byte
  while I <= Length(ASrcUTF8) do
  begin
    if IsSafeChar(Ord(ASrcUTF8[I])) then
    begin
      Result[J] := ASrcUTF8[I];
      Inc(J);
    end
    else
    begin
      Result[J] := '%';
      Result[J+1] := HexMap[(Ord(ASrcUTF8[I]) shr 4) + 1];
      Result[J+2] := HexMap[(Ord(ASrcUTF8[I]) and 15) + 1];
      Inc(J,3);
    end;
    Inc(I);
  end;

  SetLength(Result, J-1);
end;

To use this, do something like the following:

function GetAURL(const param, value: string): UTF8String;
begin
  Result := 'http://www.example.com/search?'+
    EncodeURIComponent(param)+
    '='+
    EncodeURIComponent(value);
end;

Hope this helps. Sorry, I haven’t got an IDN solution in this post!

Updated 15 Nov 2018: Fixed bug with handling of space (should output %20, not +).

Understanding and correcting interface reference leaks in Delphi’s Vcl.OleCtrls.pas

Update 21 Sep 2015: This bug has been fixed in Delphi 10 Seattle.

I have spent quite some time recently tracing a memory leak in a Delphi application.  It is quite a large application and makes a lot of use of embedded MSHTML (TWebBrowser and TEmbeddedWB) controls for presentation.  Somehow, somewhere, we were leaking memory: users were reporting that after a few hours of use, the application slowed down, and checking Task Manager certainly reflected excessive memory usage.

The normal procedure to reproduce the memory leak was followed, including using tools such as AQtime and other debug logging tools.  However, no leaks were detected using these tools, although we could see the memory usage increasing in Task Manager on our test machines.  This suggested the memory we were leaking was not allocated by Delphi code: i.e. it was Windows or a 3rd party DLL.  This doesn’t mean, of course, that it wasn’t our fault — just that it wasn’t allocated directly from Delphi source!

At this point, I was asked to trace this issue further.  I ran Performance Monitor with a couple of key counters: Handle Count and Working Set.  Running the test in question (involving opening and closing a window with an embedded web browser control) showed a gradual increase in both handle count and working set size.  However, Performance Monitor unfortunately does not include a user handle (i.e. window handles) counter.  It was when I noticed in Process Explorer that user handles were also increasing that I got my first break in tracing the cause.

It turned out that the embedded web browser window was not always being destroyed when its parent window was.  This window had the class name “Internet Explorer_Server”.

With a bit more tracing, I found that the trigger was the Document or Application properties.  If the Document property of the TWebBrowser control was ever referenced, the window was never destroyed (note, there are also some other properties that trigger the same behaviour — look for properties returning type IDispatch).

This set me to researching the Document property.  It looks like this:

property Document: IDispatch index 203 read GetIDispatchProp;

Looking at GetIDispatchProp in Vcl.OleCtrls.pas, we see the following code:

function TOleControl.GetIDispatchProp(Index: Integer): IDispatch;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Result := IDispatch(Temp.VDispatch);
end;

And here some alarm bells go off.  Delphi, rather nicely, manages all the reference counting on interfaces.  This works pretty smoothly, until you trick the compiler by casting other types to interfaces.  Here the code in question is triggering an invisible call to IntfCopy in the line:

Result := IDispatch(Temp.VDispatch);

The IntfCopy function internally calls  _AddRef on the object, but because of the cast to IDispatch from a Pointer, this new reference is never released.  The fix is to change the function (and the similar GetIUnknownProp function) to:

function TOleControl.GetIDispatchProp(Index: Integer): IDispatch;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Pointer(Result) := Temp.VDispatch;
end;

function TOleControl.GetIUnknownProp(Index: Integer): IUnknown;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Pointer(Result) := Temp.VUnknown;
end;

By casting this way, we avoid the call to IntfCopy and hence the call to _AddRef.  Alternatively, you could have called _Release on Result, but this would require another test to ensure that Result wasn’t nil (and also, of course, redundant _AddRef and _Release calls).

It turns out that this problem was identified way back in 1999 and the workaround has been commonly referenced since then.  So I am not taking credit for the fix here!  And yet it is still unresolved in Delphi XE2 — and so still causing trouble for Delphi programmers today!  There are no clear references to the problem in QualityCentral that I could find (that’s about to change!)

But don’t stop reading yet!

“Now my app crashes”

There are frequent complaints online that this fix results in crashes.  This is because other developers have engineered fixes to this reference leak in places where these GetIDispatchProp and/or GetIUnknownProp calls are made, rather than where the problem actually occurs.  I have found this in TEmbeddedWB.  TEmbeddedWB is a web browser hosting component that takes up where TWebBrowser leaves off, and it does fix a lot of the limitations of TWebBrowser.

But here are the places in the TEmbeddedWB source that you’ll need to “unfix” once you fix the root problem in Vcl.OleCtrls.pas:

EmbeddedWB.pas [2603]: (procedure TEmbeddedWB.SetUserAgentInt): Delete _Release call
EwbCore.pas [1367]: (procedure TCustomEmbeddedWB.SetDesignMode): Delete _Release call
EwbCore.pas [1404]: (procedure TCustomEmbeddedWB.SetDownloadOptions): Delete _Release call
EwbCore.pas [1468]: (procedure TCustomEmbeddedWB.SetUserInterfaceOptions): Delete _Release call
EwbTools.pas [1015]: (function GetBmpFromBrowser): Delete _Release call
EwbTools.pas [3164]: (function InvokeCMD): Delete _Release call

Note, you’ll also see an experimental fix buried — and disabled — in the TEmbeddedWB code (but without fixing the lines above), and without a whole lot of documentation as to why!

I have created a Quality Central report, along with a test case and example fix.  I also checked the Delphi VCL source, and about 30 other components that we use, and found no more calls to _Release relating to this issue.

Delphi XE2’s hidden hints and warnings options

There are a number of warnings available in Delphi XE2 that are not very well documented.  While you can control them in the Project options dialog, and you can turn them on using {$WARN} directives or in command line compiler options, the documentation for the warning identifiers is currently pretty piecemeal, and there is no clear link between the project options, warning directive identifiers, and numeric identifiers.

The {$WARN} compiler directive takes effect only in the unit it is in.  It overrides project settings and command line compiler flags.  To turn a warning on or off for a block of code (although you’ll have to work hard to convince me that this is an acceptable solution 99% of the time):

{$WARN SYMBOL_DEPRECATED ON}
{$WARN SYMBOL_DEPRECATED OFF}

To restore the warning to its previous setting:

{$WARN SYMBOL_DEPRECATED DEFAULT}

And finally, you can convert a warning into an error:

{$WARN SYMBOL_DEPRECATED ERROR}

To change the setting in your project, go to Project|Options, Delphi Compiler, Hints and Warnings, and expand the Output warnings setting:


Delphi’s project options dialog – hints and warnings

To change the setting in the command line compiler, use the -W flag, for example:

dcc32 -W+SYMBOL_DEPRECATED myproject.dpr
dcc32 -W-SYMBOL_DEPRECATED myproject.dpr

Or to turn the warning into an error:

dcc32 -W^SYMBOL_DEPRECATED myproject.dpr

Note, if you type this from the Windows command prompt, you will have to repeat the ^ symbol, as ^ is an escape symbol:

dcc32 -W^^SYMBOL_DEPRECATED myproject.dpr

These are the hints and warning identifiers I am aware, along with what I understand are the default settings and numeric identifiers, and links to the relevant help topics.

ID Directive ID Default Description Notes
W1000 SYMBOL_DEPRECATED OFF Symbol ‘%s’ is deprecated (Delphi)
W1001 SYMBOL_LIBRARY OFF Symbol ‘%s’ is specific to a library (Delphi)
W1002 SYMBOL_PLATFORM OFF Symbol ‘%s’ is specific to a platform (Delphi)
W1003 SYMBOL_EXPERIMENTAL ON Symbol ‘%s’ is experimental (Delphi)
W1004 UNIT_LIBRARY OFF Unit ‘%s’ is specific to a library (Delphi)
W1005 UNIT_PLATFORM OFF Unit ‘%s’ is specific to a platform (Delphi)
W1006 UNIT_DEPRECATED OFF Unit ‘%s’ is deprecated (Delphi)
W1007 UNIT_EXPERIMENTAL ON Unit ‘%s’ is experimental (Delphi)
W1008 HRESULT_COMPAT ON Integer and HRESULT interchanged
W1009 HIDING_MEMBER ON Redeclaration of ‘%s’ hides a member in the base class (Delphi)
W1010 HIDDEN_VIRTUAL ON Method ‘%s’ hides virtual method of base type ‘%s’ (Delphi)
W1011 GARBAGE ON Text after final ‘END.’ – ignored by compiler (Delphi)
W1012 BOUNDS_ERROR ON Constant expression violates subrange bounds
W1013 ZERO_NIL_COMPAT ON Constant 0 converted to NIL (Delphi)
W1014 STRING_CONST_TRUNCED ON String constant truncated to fit STRING%ld (Delphi)
W1015 FOR_LOOP_VAR_VARPAR ON FOR-Loop variable ‘%s’ cannot be passed as var parameter (Delphi)
W1016 TYPED_CONST_VARPAR ON Typed constant ‘%s’ passed as var parameter (Delphi)
W1017 ASG_TO_TYPED_CONST ON Assignment to typed constant ‘%s’ (Delphi)
W1018 CASE_LABEL_RANGE ON Case label outside of range of case expression (Delphi)
x1019 FOR_VARIABLE ON For loop control variable must be simple local variable (Delphi)
x1020 CONSTRUCTING_ABSTRACT ON Constructing instance of ‘%s’ containing abstract method ‘%s.%s’ (Delphi)
W1021 COMPARISON_FALSE ON Comparison always evaluates to False (Delphi)
W1022 COMPARISON_TRUE ON Comparison always evaluates to True (Delphi)
W1023 COMPARING_SIGNED_UNSIGNED ON Comparing signed and unsigned types – widened both operands (Delphi)
W1024 COMBINING_SIGNED_UNSIGNED ON Combining signed and unsigned types – widened both operands (Delphi)
x1025 UNSUPPORTED_CONSTRUCT ON Unsupported language feature: ‘%s’ (Delphi)
x1026 FILE_OPEN ON File not found ‘%s’ (Delphi) Despite being listed as a warning, turning this off appears to have no effect.
F1027 FILE_OPEN_UNITSRC ON Unit not found ‘%s’ or binary equivalents (%s) (Delphi)
x1028 BAD_GLOBAL_SYMBOL ON Bad global symbol definition ‘%s’ in object file ‘%s’ (Delphi)
W1029 DUPLICATE_CTOR_DTOR ON Duplicate %s ‘%s’ with identical parameters will be inacessible from C++ (Delphi)
x1030 INVALID_DIRECTIVE ON Invalid compiler directive – ‘%s’ (Delphi)
W1031 PACKAGE_NO_LINK ON Package ‘%s’ will not be written to disk because -J option is enabled (Delphi)
W1032 PACKAGED_THREADVAR ON Exported package threadvar ‘%s.%s’ cannot be used outside of this package (Delphi)
W1033 IMPLICIT_IMPORT ON Unit ‘%s’ implicitly imported into package ‘%s’ (Delphi)
W1034 HPPEMIT_IGNORED ON $HPPEMIT ‘%s’ ignored (Delphi)
W1035 NO_RETVAL ON Return value of function ‘%s’ might be undefined (Delphi)
W1036 USE_BEFORE_DEF ON Variable ‘%s’ might not have been initialized (Delphi)
W1037 FOR_LOOP_VAR_UNDEF ON FOR-Loop variable ‘%s’ may be undefined after loop (Delphi)
E1038 UNIT_NAME_MISMATCH ON Unit identifier ‘%s’ does not match file name (Delphi)
W1039 NO_CFG_FILE_FOUND ON No configuration files found (Delphi)
W1040 IMPLICIT_VARIANTS ON Implicit use of Variants unit (Delphi)
W1041 UNICODE_TO_LOCALE ON Error converting Unicode char to locale charset. String truncated. Is your LANG environment variable set correctly (Delphi)
W1042 LOCALE_TO_UNICODE ON Error converting locale string ‘%s’ to Unicode. String truncated. Is your LANG environment variable set correctly (Delphi)
W1043 IMAGEBASE_MULTIPLE ON Imagebase $%X is not a multiple of 64k. Rounding down to $%X (Delphi)
W1044 SUSPICIOUS_TYPECAST ON Suspicious typecast of %s to %s (Delphi)
W1045 PRIVATE_PROPACCESSOR ON Property declaration references ancestor private ‘%s.%s’ (Delphi)
W1046 UNSAFE_TYPE OFF Unsafe type ‘%s%s%s’ (Delphi)
W1047 UNSAFE_CODE OFF Unsafe code ‘%s’ (Delphi)
W1048 UNSAFE_CAST OFF Unsafe typecast of ‘%s’ to ‘%s’ (Delphi)
W1049 OPTION_TRUNCATED ON value ‘%s’ for option %s was truncated (Delphi)
W1050 WIDECHAR_REDUCED ON WideChar reduced to byte char in set expressions (Delphi)
W1051 DUPLICATES_IGNORED ON Duplicate symbol names in namespace. Using ‘%s.%s’ found in %s. Ignoring duplicate in %s (Delphi)
W1052 UNIT_INIT_SEQ ON Can’t find System.Runtime.CompilerServices.RunClassConstructor. Unit initialization order will not follow uses clause order Does not seem to be documented in XE2
W1053 LOCAL_PINVOKE ON Local PInvoke code has not been made because external routine ‘%s’ in package ‘%s’ is defined using package local types in its custom attributes Does not seem to be documented in XE2
x1054 MESSAGE_DIRECTIVE ON %s (Delphi) User-defined warning messages (see below). Turns off message hints as well but not message errors.
W1055 TYPEINFO_IMPLICITLY_ADDED ON Published caused RTTI ($M+) to be added to type ‘%s’ (Delphi)
x1056 RLINK_WARNING ON Duplicate resource Type %s, ID %s; File %s resource kept; file %s resource discarded (Delphi)
W1057 IMPLICIT_STRING_CAST ON Implicit string cast from ‘%s’ to ‘%s’ (Delphi)
W1058 IMPLICIT_STRING_CAST_LOSS ON Implicit string cast with potential data loss from ‘%s’ to ‘%s’ (Delphi)
W1059 EXPLICIT_STRING_CAST OFF Explicit string cast from ‘%s’ to ‘%s’ (Delphi)
W1060 EXPLICIT_STRING_CAST_LOSS OFF Explicit string cast with potential data loss from ‘%s’ to ‘%s’ (Delphi)
W1061 CVT_WCHAR_TO_ACHAR ON W1061 Narrowing given WideChar constant (‘%s’) to AnsiChar lost information (Delphi)
W1062 CVT_NARROWING_STRING_LOST ON Narrowing given wide string constant lost information (Delphi)
W1063 CVT_ACHAR_TO_WCHAR ON Widening given AnsiChar constant (‘%s’) to WideChar lost information (Delphi)
W1064 CVT_WIDENING_STRING_LOST ON Widening given AnsiString constant lost information (Delphi)
W1066 NON_PORTABLE_TYPECAST ON Lost Extended floating point precision. Reduced to Double (Delphi)
W1201 XML_WHITESPACE_NOT_ALLOWED ON XML comment on ‘%s’ has badly formed XML-‘Whitespace is not allowed at this location.’ (Delphi)
W1202 XML_UNKNOWN_ENTITY ON XML comment on ‘%s’ has badly formed XML- ‘Reference to undefined entity ‘%s (Delphi)
W1203 XML_INVALID_NAME_START ON XML comment on ‘%s’ has badly formed XML-‘A name was started with an invalid character.’ (Delphi)
W1204 XML_INVALID_NAME ON XML comment on ‘%s’ has badly formed XML-‘A name contained an invalid character.’ (Delphi)
W1205 XML_EXPECTED_CHARACTER ON XML comment on ‘%s’ has badly formed XML-‘The character ‘%c’ was expected.’ (Delphi)
W1206 XML_CREF_NO_RESOLVE ON XML comment on ‘%s’ has cref attribute ‘%s’ that could not be resolved (Delphi)
W1207 XML_NO_PARM ON XML comment on ‘%s’ has a param tag for ‘%s’, but there is no parameter by that name (Delphi)
W1208 XML_NO_MATCHING_PARM ON Parameter ‘%s’ has no matching param tag in the XML comment for ‘%s’ (but other parameters do) (Delphi)

The $MESSAGE directive allows generation of H1054, W1054, E1054 and F1054 messages.  User-defined hint and warning messages can be turned off with {$WARN MESSAGE_DIRECTIVE OFF} but logically enough, user-defined error and fatal messages can not be disabled.  For example, add the following line to MyUnit.pas:

{$MESSAGE WARN 'This is a user warning'}

Which will gives a compiler warning similar to the following:

[DCC Warning] MyUnit.pas(25): W1054 This is a user warning

Finally, I’m not going to discuss the $WARNINGS directive, except to say that I don’t think it’s ever necessary or sensible to turn it off.  If you must turn a warning off for a specific section of code, turn just that warning off as discussed above.  Using {$WARNINGS OFF} is just as unhelpful as an empty except block.

Updates to kmwString functions

A couple of weeks ago I published on this blog a set of functions for handling supplementary characters in JavaScript.  We have since fixed a few bugs in the code and have now made the kmwString script available on github at https://github.com/tavultesoft/keyman-tools

This update:

  • Fixes an error which could occur when only 1 parameter was passed to kmwSubstring and that parameter was negative
  • Fixed an incorrect NaN return value for kmwCodePointToCodeUnit when passed an offset at the very end of a string
  • Adds kmwCodeUnitToCodePoint function
  • Adds kmwCharAt function — which is basically just kmwSubstr(index,1)

Again, suggestions, comments, fixes and flames all gratefully received.

WinDBG and Delphi exceptions in x64

I recently was asked whether the Delphi exception event filter for WinDBG that I wrote about would also work with x64 Delphi applications.  The answer was no, it wouldn’t work, but that made me curious to find out what was different with x64.  I knew x64 exception handling was completely different to x86, being table based instead of stack based, but I wasn’t sure how much of this would be reflected in the event filter.

The original post contains the details about how the exception record was accessible at a known location on the stack, and how we could dig in from there.

Before firing up WinDBG, I had a look at System.pas, and found the x64 virtual method table offsets.  I have highlighted the key field we want to pull out:

{ Virtual method table entries }
{$IF defined(CPUX64)}
vmtSelfPtr           = -176;
vmtIntfTable         = -168;
vmtAutoTable         = -160;
vmtInitTable         = -152;
vmtTypeInfo          = -144;
vmtFieldTable        = -136;
vmtMethodTable       = -128;
vmtDynamicTable      = -120;
vmtClassName         = -112;
vmtInstanceSize      = -104;
vmtParent            = -96;
vmtEquals            = -88 deprecated 'Use VMTOFFSET in asm code';
vmtGetHashCode       = -80 deprecated 'Use VMTOFFSET in asm code';
vmtToString          = -72 deprecated 'Use VMTOFFSET in asm code';
vmtSafeCallException = -64 deprecated 'Use VMTOFFSET in asm code';
vmtAfterConstruction = -56 deprecated 'Use VMTOFFSET in asm code';
vmtBeforeDestruction = -48 deprecated 'Use VMTOFFSET in asm code';
vmtDispatch          = -40 deprecated 'Use VMTOFFSET in asm code';
vmtDefaultHandler    = -32 deprecated 'Use VMTOFFSET in asm code';
vmtNewInstance       = -24 deprecated 'Use VMTOFFSET in asm code';
vmtFreeInstance      = -16 deprecated 'Use VMTOFFSET in asm code';
vmtDestroy           =  -8 deprecated 'Use VMTOFFSET in asm code';

vmtQueryInterface    =  0 deprecated 'Use VMTOFFSET in asm code';
vmtAddRef            =  8 deprecated 'Use VMTOFFSET in asm code';
vmtRelease           = 16 deprecated 'Use VMTOFFSET in asm code';
vmtCreateObject      = 24 deprecated 'Use VMTOFFSET in asm code';
{$ELSE !CPUX64}
vmtSelfPtr           = -88;
vmtIntfTable         = -84;
vmtAutoTable         = -80;
vmtInitTable         = -76;
vmtTypeInfo          = -72;
vmtFieldTable        = -68;
vmtMethodTable       = -64;
vmtDynamicTable      = -60;
vmtClassName         = -56;
vmtInstanceSize      = -52;
vmtParent            = -48;
vmtEquals            = -44 deprecated 'Use VMTOFFSET in asm code';
vmtGetHashCode       = -40 deprecated 'Use VMTOFFSET in asm code';
vmtToString          = -36 deprecated 'Use VMTOFFSET in asm code';
vmtSafeCallException = -32 deprecated 'Use VMTOFFSET in asm code';
vmtAfterConstruction = -28 deprecated 'Use VMTOFFSET in asm code';
vmtBeforeDestruction = -24 deprecated 'Use VMTOFFSET in asm code';
vmtDispatch          = -20 deprecated 'Use VMTOFFSET in asm code';
vmtDefaultHandler    = -16 deprecated 'Use VMTOFFSET in asm code';
vmtNewInstance       = -12 deprecated 'Use VMTOFFSET in asm code';
vmtFreeInstance      = -8 deprecated 'Use VMTOFFSET in asm code';
vmtDestroy           = -4 deprecated 'Use VMTOFFSET in asm code';

vmtQueryInterface    = 0 deprecated 'Use VMTOFFSET in asm code';
vmtAddRef            = 4 deprecated 'Use VMTOFFSET in asm code';
vmtRelease           = 8 deprecated 'Use VMTOFFSET in asm code';
vmtCreateObject      = 12 deprecated 'Use VMTOFFSET in asm code';
{$IFEND !CPUX64}

I also noted that the exception code for Delphi x64 was the same as x86:

cDelphiException   = $0EEDFADE;

Given this, I put together a test x64 application in Delphi that would throw an exception, and loaded it into WinDBG.  I enabled the event filter for unknown exceptions, and triggered an exception in the test application.  This broke into WinDBG, where I was able to take a look at the raw stack:

(2ad4.2948): Unknown exception - code 0eedfade (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
KERNELBASE!RaiseException+0x39:
000007fe`fd6ccacd 4881c4c8000000  add     rsp,0C8h
0:000> dd rbp
00000000`0012eab0  00000008 00000000 00000021 00000000
00000000`0012eac0  0059e1f0 00000000 0059e1f0 00000000
00000000`0012ead0  0eedfade 00000001 00000000 00000000
00000000`0012eae0  0059e1dd 00000000 00000007 00000000
00000000`0012eaf0  0059e1dd 00000000 0256cff0 00000000
00000000`0012eb00  00000000 00000000 00000000 00000000
00000000`0012eb10  00000000 00000000 00000000 00000000
00000000`0012eb20  00000000 00000000 0256cff8 00000000

We can see at rbp+20 is the familiar looking 0EEDFADE value.  This is the start of the EXCEPTION_RECORD structure, which I’ve reproduced below from Delphi’s System.pas with a little annotation of my own:

  TExceptionRecord = record
    ExceptionCode: Cardinal;                 // +0
    ExceptionFlags: Cardinal;                // +4
    ExceptionRecord: PExceptionRecord;       // +8
    ExceptionAddress: Pointer;               // +10
    NumberParameters: Cardinal;              // +18
    case {IsOsException:} Boolean of
      True:  (ExceptionInformation : array [0..14] of NativeUInt);
      False: (ExceptAddr: Pointer;           // +20
              ExceptObject: Pointer);        // +28
  end;

We do have to watch out for member alignment with this structure — because it contains both 4 byte DWORDs and 8 byte pointers, there are 4 bytes of hidden padding after the NumberParameters member, as shown below (this is from MSDN, sorry to switch languages on you!):

typedef struct _EXCEPTION_RECORD64 {
  DWORD ExceptionCode;
  DWORD ExceptionFlags;
  DWORD64 ExceptionRecord;
  DWORD64 ExceptionAddress;
  DWORD NumberParameters;
  DWORD __unusedAlignment;
  DWORD64 ExceptionInformation[EXCEPTION_MAXIMUM_PARAMETERS];
} EXCEPTION_RECORD64, *PEXCEPTION_RECORD64;

But what we can see from TExceptionRecord is that at offset 0x28 in the record is a pointer to our ExceptObject.  Great!  That’s everything we need.  We can now put together our x64-modified event filter.

You may remember the x86 event filter:

sxe -c "da poi(poi(poi(ebp+1c))-38)+1 L16;du /c 100 poi(poi(ebp+1c)+4)" 0EEDFADE

So here is the x64 version:

sxe -c "da poi(poi(poi(rbp+48))-70)+1 L16;du /c 100 poi(poi(rbp+48)+8)" 0EEDFADE

And with this filter installed, here is how a Delphi exception is now displayed in WinDBG:

(2ad4.2948): Unknown exception - code 0eedfade (first chance)
00000000`0059e0cf  "MyException"
00000000`02573910  "My very own kind of error message"
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
KERNELBASE!RaiseException+0x39:
000007fe`fd6ccacd 4881c4c8000000  add     rsp,0C8h

I’ll dissect the pointer offsets a little more than I did in the previous blog, because they can be a bit confusing:

  • rbp+48 is a pointer to the exception object (usually a type that inherits from Exception).
  • poi(rbp+48) dereferences that, and at offset 0 right here, we have a pointer to the class type.
  • Before we look at the class type, poi(rbp+48)+8 is the first member of the object (don’t forget ancestor classes), which happens to be FMessage from the Exception class. That gives us our message.
  • Diving deeper, poi(poi(rbp+48)) is now looking at the class type.
  • And we know that the offset of vmtClassName is -112 (-0x70).  So poi(poi(poi(rbp+48))-70) gives us the the ShortString class name, of which the first byte is the length.
  • So we finish with poi(poi(poi(rbp+48))-70)+1, which lets us look at the string itself.

You will see that to access the exception message, I have opted to look directly at the Exception object rather than use the more direct pointer which is on the stack.  I did this to make it easier to see how it might be possible to pull out other members of descendent exception classes, such as ErrorCode from EOSError.

And one final note: looking back on that previous blog, I see that one thing I wrote was a little misleading: the string length of FMessage is indeed available at poi(poi(rbp+48)+8)-4, but the string is null-terminated, so we don’t need to use it — WinDBG understands null-terminated strings.  Where this is more of a problem is with the ShortString type, which is not null-terminated.  This is why sometimes exception class names displayed using this method will show a few garbage characters afterwards, because we don’t bother about accounting for that; the L16 parameter prevents us dumping memory until we reach a null byte.

A more UTF-32 aware JavaScript String library

One of the hassles I regularly experience with JavaScript is that it does not have native support for supplementary characters.  Internally, JavaScript uses UCS-2 encoding (unlike most of the rest of the web, which uses UTF-8…)  While you can use surrogate pairs to represent Unicode characters between U+10000 and U+10FFFF, this makes string handling with these characters a pain.  In particular, functions such as indexOf and substr have to be very carefully used, both to account for the surrogate pairs in their index parameters, and to avoid cutting them in half when manipulating the string. Of course, when interfacing with third party services, you will need to be aware of how they handle text.  For instance, the Twitter 140 character limit counts Unicode code points, not UCS-2 code units.  But many other products, (for example, Microsoft SQL Server), use UTF-16 or UCS-2 internally and treat supplementary plane characters as 2 code units for the purposes of calculating field size.  Developer beware! Anyway, I have put together a small set of functions that treat surrogate pairs as a single code point, abstracting away surrogate pairs at the basic String level.  Adding support for surrogate pairs is not a complete solution — I haven’t done any work on regular expressions, for example, and this code also does not begin to address more complex requirements around grapheme clusters or normalisation, but this is just one less complexity to worry about. The functions do not replace any existing String functions. This code is not complete: I am missing some boundary conditions and edge cases, and I haven’t yet tested with isolated surrogate code units — but for what it’s worth, here it is.  The kmw prefix refers to KeymanWeb, which will shortly be using the functions (replacing the mishmash of code we currently use…) Some simple examples:

var str="Brave "+String.kmwFromCharCode(0x13027, 0x1314C, 0x1309C)+" world";

alert(str.indexOf("w"));    // Displays 13
alert(str.kmwIndexOf("w")); // Displays 10

alert(str.length);       // Displays 18
alert(str.kmwLength());  // Displays 15

alert(str.kmwSubstr(4,3));  // Displays e U+13027
alert(str.substr(4,3));  // Displays e U+D80C (half a supplementary pair!)

The license on this code is Mozilla Public License 1.1.  A couple of functions were lifted from the proposal Supplementary Characters for ECMAScript by Norbert Lindenberg and tweaked (back) to more closely mimic the functions they replace, warts and all.  These two functions are probably better tested than my ones! Version 1.0 of this library plus a rudimentary test script can be downloaded from http://durdin.net/blog-files/kmwString-0.1.zip. Comments, bug fixes, flames, suggestions much appreciated!

/**
  @preserve (C) 2012 Tavultesoft Pty Ltd
  
  Adds functions to treat supplementary plane characters in the same 
  way as basic multilingual plane characters in JavaScript.
  
  Version 0.1
  
  License
  
  The contents of this file are subject to the Mozilla Public License
  Version 1.1 (the "License"); you may not use this file except in
  compliance with the License. You may obtain a copy of the License at
  http://www.mozilla.org/MPL/

  Software distributed under the License is distributed on an "AS IS"
  basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the
  License for the specific language governing rights and limitations
  under the License.

  The Original Code is (C) 2012 Tavultesoft Pty Ltd.

  The Initial Developer of the Original Code is Tavultesoft.
*/

/**
 * Constructs a string from one or more Unicode character codepoint values 
 * passed as integer parameters.
 * 
 * @param  {integer} cp0,...   1 or more Unicode codepoints, e.g. 0x0065, 0x10000
 * @return {String}            The new String object.
 */
String.kmwFromCharCode = function() {
  var chars = [], i;
  for (i = 0; i < arguments.length; i++) {
    var c = Number(arguments[i]);
	if (!isFinite(c) || c < 0 || c > 0x10FFFF || Math.floor(c) !== c) {
	  throw new RangeError("Invalid code point " + c);
	}
	if (c < 0x10000) {
	  chars.push(c);
	} else {
	  c -= 0x10000;
	  chars.push((c >> 10) + 0xD800);
	  chars.push((c % 0x400) + 0xDC00);
	}
  }
  return String.fromCharCode.apply(undefined, chars);
}

/**
 * Returns a number indicating the Unicode value of the character at the given 
 * code point index, with support for supplementary plane characters.
 * 
 * @param  {integer} codePointIndex  The code point index into the string (not 
                                     the code unit index) to return
 * @return {integer}                 The Unicode character value
 */
String.prototype.kmwCharCodeAt = function(codePointIndex) {
  var str = String(this);
  var codeUnitIndex = 0;
  
  if (codePointIndex < 0 || codePointIndex  >= str.length) {
    return NaN;
  }

  for(var i = 0; i < codePointIndex; i++) {
    codeUnitIndex = str.kmwNextChar(codeUnitIndex);
	if(codeUnitIndex == undefined) return NaN;
  }
  
  var first = str.charCodeAt(codeUnitIndex);
  if (first >= 0xD800 && first <= 0xDBFF && str.length > codeUnitIndex + 1) {
    var second = str.charCodeAt(codeUnitIndex + 1);
	if (second >= 0xDC00 && second <= 0xDFFF) {
	  return ((first - 0xD800) << 10) + (second - 0xDC00) + 0x10000;
	}
  }
  return first;  
}

/**
 * Returns the code point index within the calling String object of the first occurrence
 * of the specified value, or -1 if not found.
 * 
 * @param  {string}  searchValue    The value to search for
 * @param  {integer} fromIndex      Optional code point index to start searching from
 * @return {integer}                The code point index of the specified search value
 */
String.prototype.kmwIndexOf = function(searchValue, fromIndex) {
  var str = String(this);
  var codeUnitIndex = str.indexOf(searchValue, fromIndex);
  
  if(codeUnitIndex < 0) {
    return codeUnitIndex;
  }
  
  var codePointIndex = 0;
  for(var i = 0; i < codeUnitIndex; i = str.kmwNextChar(i), codePointIndex++);
  return codePointIndex;
}

/**
 * Returns the code point index within the calling String object of the last occurrence 
 * of the specified value, or -1 if not found.
 * 
 * @param  {string}  searchValue    The value to search for
 * @param  {integer} fromIndex      Optional code point index to start searching from
 * @return {integer}                The code point index of the specified search value
 */
String.prototype.kmwLastIndexOf = function(searchValue, fromIndex)
{
  var str = String(this);
  var codeUnitIndex = str.lastIndexOf(searchValue, fromIndex);
  
  if(codeUnitIndex < 0) {
    return codeUnitIndex;
  }
  
  var codePointIndex = 0;
  for(var i = 0; i < codeUnitIndex; i = str.kmwNextChar(i), codePointIndex++);
  return codePointIndex;
}

/**
 * Returns the length of the string in code points, as opposed to code units.
 * 
 * @return {integer}                The length of the string in code points
 */
String.prototype.kmwLength = function() {
  var str = String(this);
  
  if(str.length == 0) {
    return 0;
  }
  
  for(var i = 0, codeUnitIndex = 0; codeUnitIndex != undefined; i++, 
    codeUnitIndex = str.kmwNextChar(codeUnitIndex));
  return i;
}

/**
 * Extracts a section of a string and returns a new string.
 * 
 * @param  {integer} beginSlice    The start code point index in the string to 
 *                                 extract from
 * @param  {integer} endSlice      Optional end code point index in the string
 *                                 to extract to
 * @return {string}                The substring as selected by beginSlice and
 *                                 endSlice
 */
String.prototype.kmwSlice = function(beginSlice, endSlice) {
  var str = String(this);
  var beginSliceCodeUnit = str.kmwCodePointToCodeUnit(beginSlice);
  var endSliceCodeUnit = str.kmwCodePointToCodeUnit(endSlice);
  return str.slice(beginSliceCodeUnit, endSliceCodeUnit);
}

/**
 * Returns the characters in a string beginning at the specified location through
 * the specified number of characters.
 * 
 * @param  {integer} start         The start code point index in the string to 
 *                                 extract from
 * @param  {integer} length        Optional length to extract
 * @return {string}                The substring as selected by start and length
 */
String.prototype.kmwSubstr = function(start, length)
{
  var str = String(this);
  if(start < 0)
  {
    start = str.kmwLength() + start;
	if(start < 0) {
	  start = 0;
	}
  }
  var startCodeUnit = str.kmwCodePointToCodeUnit(start);
  var endCodeUnit = startCodeUnit;
  
  if(length == undefined) {
    endCodeUnit = str.length;
  } else {
    for(var i = 0; i < length; i++, endCodeUnit = str.kmwNextChar(endCodeUnit));
  }

  return str.substring(startCodeUnit, endCodeUnit);
}

/**
 * Returns the characters in a string between two indexes into the string.
 * 
 * @param  {integer} indexA        The start code point index in the string to 
 *                                 extract from
 * @param  {integer} indexB        The end code point index in the string to 
 *                                 extract to
 * @return {string}                The substring as selected by indexA and indexB
 */
String.prototype.kmwSubstring = function(indexA, indexB)
{
  var str = String(this);
  
  if(indexA > indexB) { var c = indexA; indexA = indexB; indexB = c; }
  
  var indexACodeUnit = str.kmwCodePointToCodeUnit(indexA);
  var indexBCodeUnit = str.kmwCodePointToCodeUnit(indexB);
  if(isNaN(indexBCodeUnit)) indexBCodeUnit = str.length;

  return str.substring(indexACodeUnit, indexBCodeUnit);
}

/*
  Helper functions
*/

/**
 * Returns the code unit index for the next code point in the string, accounting for
 * supplementary pairs 
 *
 * @param  {integer} codeUnitIndex   The code unit position to increment
 * @return {integer}                 The index of the next code point in the string,
 *                                   in code units
*/
String.prototype.kmwNextChar = function(codeUnitIndex) {
  var str = String(this);
  
  if(codeUnitIndex < 0 || codeUnitIndex >= str.length - 1) {
    return undefined;
  }
  
  var first = str.charCodeAt(codeUnitIndex);
  if (first >= 0xD800 && first <= 0xDBFF && str.length > codeUnitIndex + 1) {
    var second = str.charCodeAt(codeUnitIndex + 1);
	if (second >= 0xDC00 && second <= 0xDFFF) {
	  if(codeUnitIndex == str.length - 2) {
	    return undefined;
	  }
	  return codeUnitIndex + 2;
	}
  }
  return codeUnitIndex + 1;
}

/**
 * Returns the code unit index for the previous code point in the string, accounting
 * for supplementary pairs 
 *
 * @param  {integer} codeUnitIndex   The code unit position to decrement
 * @return {integer}                 The index of the previous code point in the
 *                                   string, in code units
*/
String.prototype.kmwPrevChar = function(codeUnitIndex) {
  var str = String(this);

  if(codeUnitIndex <= 0 || codeUnitIndex > str.length) {
    return undefined;
  }
  
  var second = str.charCodeAt(codeUnitIndex - 1);
  if (second >= 0xDC00 && first <= 0xDFFF && codeUnitIndex > 1) {
    var first = str.charCodeAt(codeUnitIndex - 2);
	if (first >= 0xD800 && second <= 0xDBFF) {
	  return codeUnitIndex - 2;
	}
  }
  return codeUnitIndex - 1;
}

/**
 * Returns the corresponding code unit index to the code point index passed
 *
 * @param  {integer} codePointIndex  A code point index in the string
 * @return {integer}                 The corresponding code unit index
*/
String.prototype.kmwCodePointToCodeUnit = function(codePointIndex) {
  var str = String(this);
  
  var codeUnitIndex = 0;

  if(codePointIndex < 0) {
    codeUnitIndex = str.length;
    for(var i = 0; i > codePointIndex; i--, 
	  codeUnitIndex = str.kmwPrevChar(codeUnitIndex));	
    return codeUnitIndex;
  }

  for(var i = 0; i < codePointIndex; i++,
    codeUnitIndex = str.kmwNextChar(codeUnitIndex));
  return codeUnitIndex;
}

Updated 11 May 2012: Removed script formatting code as site hosting it was unreliable. Back to good old <pre> for now.
Updated 29 May 2014: Fixed broken script text, damaged when inserted previously.

More on the pngimage unit in Delphi XE2

After submitting the issue reported in How not to do exception classes in a Delphi library to Embarcadero (QC102796) they asked me for a test case.  Didn’t make much sense to me but I duly wrote one and am about to submit it.  However, in the process I collected a lot more information about work that should have been done on the Vcl.Imaging.pngimage unit when it was integrated into the Delphi source. Here’s the full list of the exception classes and what should be done with them.

  • EPNGError should inherit from EInvalidGraphicOperation or be eliminated as it raises inappropriate exceptions
  • EPngUnexpectedEnd should inherit from EInvalidGraphic
  • EPngInvalidCRC should inherit from EInvalidGraphic
  • EPngInvalidIHDR should inherit from EInvalidGraphic
  • EPNGMissingMultipleIDAT should inherit from EInvalidGraphic
  • EPNGZLIBError should inherit from EInvalidGraphic
  • EPNGInvalidPalette should inherit from EInvalidGraphic
  • EPNGInvalidFileHeader should inherit from EInvalidGraphic
  • EPNGInvalidFileHeader should inherit from EInvalidGraphic
  • EPNGSizeExceeds should inherit from EInvalidGraphic
  • EPNGMissingPalette is unused and should be deleted
  • EPNGUnknownCriticalChunk should inherit from EInvalidGraphic
  • EPNGUnknownCompression should inherit from EInvalidGraphic
  • EPNGUnknownInterlace should inherit from EInvalidGraphic
  • EPNGNoImageData should inherit from EInvalidGraphic
  • EPNGHeaderNotPresent should inherit from EInvalidGraphic

The following three exceptions are really reimplementations of existing exception classes and should be abolished entirely:

  • EPNGOutMemory should just really be replaced with RaiseLastOSError.  Or, if backward compatibility for this exception is really needed then inherit from EOutOfMemory
  • EPNGNotExists is not used unless USEDELPHI is not defined, which it is…
  • EPNGCouldNotLoadResource masks a number of other exceptions and should if possible be removed entirely.  The code in question re-raises an exception and buries the original error with an entirely unhelpful secondary exception (and what is that exit; doing in there?).
  try 
    ResStream := TResourceStream.Create(Instance, Name, RT_RCDATA); 
  except
    RaiseError(EPNGCouldNotLoadResource, EPNGCouldNotLoadResourceText);   
    exit; 
  end;

Finally, these three need to inherit from EInvalidGraphicOperation:

  • EPNGCannotChangeTransparent should inherit from EInvalidGraphicOperation
  • EPNGInvalidNewSize should inherit from EInvalidGraphicOperation
  • EPNGInvalidSpec should inherit from EInvalidGraphicOperation

In a number of locations EPNGError and Exception are also raised (albeit sometimes in code that is $ifdef‘d out of use), often with blank error messages!  For example:

    case BitDepth of 
      {Only supported by COLOR_PALETTE}
      1: DetectPixelFormat := pf1bit; 
      2, 4: DetectPixelFormat := pf4bit; 
      {8 may be palette or r, g, b values} 
      8, 16: 
        case ColorType of 
          COLOR_RGB, COLOR_GRAYSCALE: DetectPixelFormat := pf24bit; 
          COLOR_PALETTE: DetectPixelFormat := pf8bit; 
          else raise Exception.Create('');
        end {case ColorFormat of} 
        else raise Exception.Create('');
    end {case BitDepth of}

Finally, the unit reimplements a bunch of standard Delphi VCL functionality that is inappropriate to keep within the Delphi VCL codebase, including core classes such as TStream.  While these are not compiled in because they surrounded by $ifdef statements, all this code should really be stripped out for maintenance and code cleanliness/readability reasons.  Thus, all code surrounded by $ifndef UseDelphi should be deleted. This is not a systematic review of the issues in the Vcl.Imaging.pngimage unit; it’s more just a rant and giving a heads up about the quality of this unit.  Embarcadero, please clean this unit up!

Further analysis on the TrustedInstaller.exe memory leaks

You may have read my last 2 blog posts about the TrustedInstaller.exe leak that seems to be present in Windows 7 (Does Microsoft’s TrustedInstaller.exe have a leak? and Reproducing the TrustedInstaller.exe leak).  It was not yet clear to me whether this leak that I observed was a bug or if it is by design, so I decided that the next time I had to build a Windows VM and install updates, I would capture some more log information about the update.  I thought about using Process Monitor to capture everything, and this would certainly have captured reams of very interesting data, but this would have slowed the update install process down unacceptably on the day.

So instead, I ran the Handle program, also a part of the SysInternals suite, once a minute, and dumped its output to a text file.  The update process ran as normal, and the handles logging generated a 50mb text file, which was fairly simple to import into a database.  Now I had some data I could play with!

I picked a filename at random to look at in the logs.  I closed my eyes and pointed, and the lucky filename chosen was winlogon!  A simple search in the database came up with 4 different objects matching winlogon in the handle list:

C:\Windows\winsxs\amd64_microsoft-windows-winlogon_31bf3856ad364e35_6.1.7600.16447_none_cbe534e7ee8042ad

C:\Windows\winsxs\amd64_microsoft-windows-winlogon_31bf3856ad364e35_6.1.7600.16447_none_cbe534e7ee8042ad\winlogon.exe

C:\Windows\winsxs\Temp\PendingRenames\0859c097e7b9cc0153330000d0088004.amd64_microsoft-windows-winlogon_31bf3856ad364e35_6.1.7600.16447_none_cbe534e7ee8042ad_winlogon.exe_ac37d0c5

C:\Windows\winsxs\Temp\PendingRenames\aef6bd97e7b9cc0152330000d0088004.amd64_microsoft-windows-winlogon_31bf3856ad364e35_6.1.7600.16447_none_cbe534e7ee8042ad.manifest

The first entry in this list was a directory, in the Windows component store (aka Side-by-Side Store).  The second was a reference to winlogon.exe within this same folder.  The third and fourth entries gave a bit of a clue as to what may be going on, with the PendingRenames temporary folder hinting at a possible reason why the files were being kept open.

OK, so that was a very rough analysis.  But things got a bit more complicated when I started looking a bit deeper.  I picked the second entry, our winlogon.exe in the Windows component store, and had a look at the handles which were opened to that.  Here’s what I found.  Remember that this is based on minute-by-minute snapshots, so this is incomplete, but still paints a curious picture:

  • 41 different handles were logged for this one file.
  • 14 of these handles were still open at the end of the session, when TrustedInstaller.exe had no further CPU or disk activity.  (TrustedInstaller.exe remains in memory until the restart, it seems)

Why does TrustedInstaller.exe need to maintain 14 open handles to a single file until Windows is restarted?  And why does it need to open at least 41 different handles over time to the same file?  This seems horribly inefficient, as the same pattern is manifest for the 2000 or more other objects that were opened by the TrustedInstaller.exe process.

I could understand a requirement to maintain a single open handle for concurrency and replacement (PendingRenames) operations.  I’m sure there’s a reason…  There may be another update to this story yet!