Update 21 Sep 2015: This bug has been fixed in Delphi 10 Seattle.
I have spent quite some time recently tracing a memory leak in a Delphi application. It is quite a large application and makes a lot of use of embedded MSHTML (TWebBrowser and TEmbeddedWB) controls for presentation. Somehow, somewhere, we were leaking memory: users were reporting that after a few hours of use, the application slowed down, and checking Task Manager certainly reflected excessive memory usage.
We followed our normal procedure for reproducing the leak, using AQtime and other debug logging tools. None of them detected a leak, although we could see memory usage increasing in Task Manager on our test machines. This suggested that the leaking memory was not allocated by Delphi code, i.e. it was allocated by Windows or a third-party DLL. That doesn’t mean, of course, that it wasn’t our fault, just that it wasn’t allocated directly from Delphi source!
At this point, I was asked to trace this issue further. I ran Performance Monitor with a couple of key counters: Handle Count and Working Set. Running the test in question (opening and closing a window with an embedded web browser control) showed a gradual increase in both handle count and working set size. Unfortunately, Performance Monitor does not include a counter for user handles (i.e. window handles). It was when I noticed in Process Explorer that user handles were also increasing that I got my first break in tracing the cause.
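The user handle count that Process Explorer displays can also be read from inside the process, which makes it easy to log before and after a test run. The following is a minimal sketch, not what we actually used: it imports the user32 function GetGuiResources under a hypothetical local name (QueryGuiResources), and the constant and function names are my own.

```pascal
program UserHandleCount;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows;

const
  // Same value as GR_USEROBJECTS in the Windows SDK headers.
  UserObjectsFlag = 1;

// Local import of user32's GetGuiResources under a hypothetical name, so the
// sketch does not depend on which declarations a given Delphi version ships.
function QueryGuiResources(hProcess: THandle; uiFlags: LongWord): LongWord;
  stdcall; external 'user32.dll' name 'GetGuiResources';

// Returns the number of USER objects (windows, menus, ...) currently
// owned by this process.
function CurrentUserHandleCount: LongWord;
begin
  Result := QueryGuiResources(GetCurrentProcess, UserObjectsFlag);
end;

begin
  Writeln('USER handles: ', CurrentUserHandleCount);
end.
```

In a real application you would call something like CurrentUserHandleCount before opening the browser window and again after closing it; a count that keeps climbing across repeated runs points to windows that are not being destroyed.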
It turned out that the embedded web browser window was not always being destroyed when its parent window was. This window had the class name “Internet Explorer_Server”.
With a bit more tracing, I found that the trigger was the Document or Application properties. If the Document property of the TWebBrowser control was ever referenced, the window was never destroyed (note, there are also some other properties that trigger the same behaviour — look for properties returning type IDispatch).
This set me to researching the Document property. It looks like this:
property Document: IDispatch index 203 read GetIDispatchProp;
Looking at GetIDispatchProp in Vcl.OleCtrls.pas, we see the following code:
function TOleControl.GetIDispatchProp(Index: Integer): IDispatch;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Result := IDispatch(Temp.VDispatch);
end;
And here some alarm bells go off. Delphi, rather nicely, manages all the reference counting on interfaces. This works pretty smoothly, until you trick the compiler by casting other types to interfaces. Here the code in question is triggering an invisible call to IntfCopy in the line:
Result := IDispatch(Temp.VDispatch);
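The IntfCopy function internally calls _AddRef on the object. But GetProperty has already returned the pointer in Temp with a reference of its own, and because Temp is a plain TVarData record rather than a managed interface variable, that reference is never released. The object therefore ends up with two references, of which only the one held by Result is ever released. The fix is to change the function (and the similar GetIUnknownProp function) to: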
function TOleControl.GetIDispatchProp(Index: Integer): IDispatch;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Pointer(Result) := Temp.VDispatch;
end;

function TOleControl.GetIUnknownProp(Index: Integer): IUnknown;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Pointer(Result) := Temp.VUnknown;
end;
By casting this way, we avoid the call to IntfCopy and hence the call to _AddRef. Alternatively, you could call _Release on Result, but this would require an extra test to ensure that Result isn’t nil, and would of course leave a redundant pair of _AddRef and _Release calls.
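To see the effect of the cast without involving any COM control, here is a small self-contained sketch. It is not the VCL code: TProbe, NewRawReference and GetProbe are hypothetical names, and NewRawReference merely stands in for GetProperty by returning a raw pointer that already owns one reference. The three getter variants correspond to the original code, the Pointer(Result) fix, and the _Release alternative.

```pascal
program RefCountLeakDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical probe object: records when it is finally destroyed.
  TProbe = class(TInterfacedObject)
  public
    destructor Destroy; override;
  end;

  TGetterKind = (gkLeaky, gkPointerCast, gkExplicitRelease);

var
  ProbeDestroyed: Boolean = False;

destructor TProbe.Destroy;
begin
  ProbeDestroyed := True;
  inherited;
end;

// Stand-in for GetProperty: returns a raw pointer that already owns
// exactly one reference, like the VDispatch field of the TVarData.
function NewRawReference: Pointer;
var
  Intf: IInterface;
begin
  Intf := TProbe.Create;       // reference count = 1
  Result := nil;
  IInterface(Result) := Intf;  // reference count = 2, one owned by Result
end;                           // Intf is released here: count = 1

function GetProbe(Kind: TGetterKind): IInterface;
var
  Raw: Pointer;
begin
  Raw := NewRawReference;
  case Kind of
    gkLeaky:
      Result := IInterface(Raw);   // assignment adds a second reference
    gkPointerCast:
      Pointer(Result) := Raw;      // adopts the existing reference
    gkExplicitRelease:
      begin
        Result := IInterface(Raw); // adds a second reference...
        if Result <> nil then
          Result._Release;         // ...and drops it again
      end;
  end;
end;

// Obtains the interface and lets it go out of scope, as a caller of
// the Document property would.
procedure UseOnce(Kind: TGetterKind);
var
  Intf: IInterface;
begin
  Intf := GetProbe(Kind);
end;

procedure Report(const Name: string; Kind: TGetterKind);
begin
  ProbeDestroyed := False;
  UseOnce(Kind);
  // Checked only after UseOnce has returned, so any compiler temporaries
  // holding the interface have been finalized as well.
  if ProbeDestroyed then
    Writeln(Name, ': object destroyed')
  else
    Writeln(Name, ': object still alive (leaked reference)');
end;

begin
  Report('IDispatch-style cast', gkLeaky);
  Report('Pointer(Result) cast', gkPointerCast);
  Report('cast plus _Release', gkExplicitRelease);
end.
```

The first variant should report the object as still alive, and the other two as destroyed. Note that the Pointer(Result) form simply overwrites whatever Result held on entry, so the sketch assumes Result starts out as nil, as it does here.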
It turns out that this problem was identified way back in 1999 and the workaround has been commonly referenced since then. So I am not taking credit for the fix here! And yet it is still unresolved in Delphi XE2, and so is still causing trouble for Delphi programmers today! There are no clear references to the problem in QualityCentral that I could find (that’s about to change!).
But don’t stop reading yet!
“Now my app crashes”
There are frequent complaints online that this fix results in crashes. This is because other developers have engineered fixes to this reference leak in places where these GetIDispatchProp and/or GetIUnknownProp calls are made, rather than where the problem actually occurs. I have found this in TEmbeddedWB. TEmbeddedWB is a web browser hosting component that takes up where TWebBrowser leaves off, and it does fix a lot of the limitations of TWebBrowser.
But here are the places in the TEmbeddedWB source that you’ll need to “unfix” once you fix the root problem in Vcl.OleCtrls.pas:
EmbeddedWB.pas : (procedure TEmbeddedWB.SetUserAgentInt): Delete _Release call
EwbCore.pas : (procedure TCustomEmbeddedWB.SetDesignMode): Delete _Release call
EwbCore.pas : (procedure TCustomEmbeddedWB.SetDownloadOptions): Delete _Release call
EwbCore.pas : (procedure TCustomEmbeddedWB.SetUserInterfaceOptions): Delete _Release call
EwbTools.pas : (function GetBmpFromBrowser): Delete _Release call
EwbTools.pas : (function InvokeCMD): Delete _Release call
Note, you’ll also see an experimental fix buried — and disabled — in the TEmbeddedWB code (but without fixing the lines above), and without a whole lot of documentation as to why!
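Why a caller-side _Release turns into a crash once the root problem is fixed can be shown with another small sketch. This is not TEmbeddedWB code: TCountingProbe, GetDocument and CallerWithWorkaround are hypothetical names, and the probe deliberately never frees itself, so that an over-release shows up as a negative balance rather than an access violation.

```pascal
program OverReleaseDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical object that only counts _AddRef/_Release calls.
  TCountingProbe = class(TObject, IInterface)
  private
    FBalance: Integer;
  public
    function QueryInterface(const IID: TGUID; out Obj): HResult; stdcall;
    function _AddRef: Integer; stdcall;
    function _Release: Integer; stdcall;
    property Balance: Integer read FBalance;
  end;

function TCountingProbe.QueryInterface(const IID: TGUID; out Obj): HResult;
begin
  if GetInterface(IID, Obj) then
    Result := 0                      // S_OK
  else
    Result := HResult($80004002);    // E_NOINTERFACE
end;

function TCountingProbe._AddRef: Integer;
begin
  Inc(FBalance);
  Result := FBalance;
end;

function TCountingProbe._Release: Integer;
begin
  Dec(FBalance);                     // never frees, only counts
  Result := FBalance;
end;

// Stand-in for GetIDispatchProp, with or without the root fix.
function GetDocument(Probe: TCountingProbe; RootFix: Boolean): IInterface;
var
  Owner: IInterface;
  Raw: Pointer;
begin
  Owner := Probe;                    // balance +1
  Raw := nil;
  IInterface(Raw) := Owner;          // balance +1, owned by Raw
  Owner := nil;                      // balance -1: Raw owns one reference
  if RootFix then
    Pointer(Result) := Raw           // adopts the existing reference
  else
    Result := IInterface(Raw);       // original code: adds a reference
end;

// A caller that compensates for the leak itself, as described above.
procedure CallerWithWorkaround(Probe: TCountingProbe; RootFix: Boolean);
var
  Doc: IInterface;
begin
  Doc := GetDocument(Probe, RootFix);
  if Doc <> nil then
    Doc._Release;                    // caller-side workaround
end;

function BalanceAfterCall(RootFix: Boolean): Integer;
var
  Probe: TCountingProbe;
begin
  Probe := TCountingProbe.Create;
  try
    CallerWithWorkaround(Probe, RootFix);
    Result := Probe.Balance;
  finally
    Probe.Free;
  end;
end;

begin
  Writeln('Workaround only:         balance = ', BalanceAfterCall(False));
  Writeln('Workaround plus root fix: balance = ', BalanceAfterCall(True));
end.
```

With the original getter the workaround brings the balance back to zero; with the fixed getter the same _Release call drives it below zero. On a real COM object that extra release frees the object while it is still in use, which is the kind of crash people report.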
I have created a QualityCentral report, along with a test case and example fix. I also checked the Delphi VCL source, and about 30 other components that we use, and found no more calls to _Release relating to this issue.