Category Archives: Bugs

Understanding and correcting interface reference leaks in Delphi’s Vcl.OleCtrls.pas

Update 21 Sep 2015: This bug has been fixed in Delphi 10 Seattle.

I have spent quite some time recently tracing a memory leak in a Delphi application.  It is quite a large application and makes a lot of use of embedded MSHTML (TWebBrowser and TEmbeddedWB) controls for presentation.  Somehow, somewhere, we were leaking memory: users were reporting that after a few hours of use, the application slowed down, and checking Task Manager certainly reflected excessive memory usage.

The normal procedure to reproduce the memory leak was followed, including using tools such as AQtime and other debug logging tools.  However, no leaks were detected using these tools, although we could see the memory usage increasing in Task Manager on our test machines.  This suggested the memory we were leaking was not allocated by Delphi code: i.e. it was Windows or a 3rd party DLL.  This doesn’t mean, of course, that it wasn’t our fault — just that it wasn’t allocated directly from Delphi source!

At this point, I was asked to trace this issue further.  I ran Performance Monitor with a couple of key counters: Handle Count and Working Set.  Running the test in question (involving opening and closing a window with an embedded web browser control) showed a gradual increase in both handle count and working set size.  However, Performance Monitor unfortunately does not include a user handle (i.e. window handles) counter.  It was when I noticed in Process Explorer that user handles were also increasing that I got my first break in tracing the cause.

It turned out that the embedded web browser window was not always being destroyed when its parent window was.  This window had the class name “Internet Explorer_Server”.

With a bit more tracing, I found that the trigger was the Document or Application properties.  If the Document property of the TWebBrowser control was ever referenced, the window was never destroyed (note, there are also some other properties that trigger the same behaviour — look for properties returning type IDispatch).

This set me to researching the Document property.  It looks like this:

property Document: IDispatch index 203 read GetIDispatchProp;

Looking at GetIDispatchProp in Vcl.OleCtrls.pas, we see the following code:

function TOleControl.GetIDispatchProp(Index: Integer): IDispatch;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Result := IDispatch(Temp.VDispatch);
end;

And here some alarm bells go off.  Delphi, rather nicely, manages all the reference counting on interfaces.  This works pretty smoothly, until you trick the compiler by casting other types to interfaces.  Here the code in question is triggering an invisible call to IntfCopy in the line:

Result := IDispatch(Temp.VDispatch);

The IntfCopy function internally calls  _AddRef on the object, but because of the cast to IDispatch from a Pointer, this new reference is never released.  The fix is to change the function (and the similar GetIUnknownProp function) to:

function TOleControl.GetIDispatchProp(Index: Integer): IDispatch;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Pointer(Result) := Temp.VDispatch;
end;

function TOleControl.GetIUnknownProp(Index: Integer): IUnknown;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Pointer(Result) := Temp.VUnknown;
end;

By casting this way, we avoid the call to IntfCopy and hence the call to _AddRef.  Alternatively, you could have called _Release on Result, but this would require another test to ensure that Result wasn’t nil (and also, of course, redundant _AddRef and _Release calls).

It turns out that this problem was identified way back in 1999 and the workaround has been commonly referenced since then.  So I am not taking credit for the fix here!  And yet it is still unresolved in Delphi XE2 — and so still causing trouble for Delphi programmers today!  There are no clear references to the problem in QualityCentral that I could find (that’s about to change!)

But don’t stop reading yet!

“Now my app crashes”

There are frequent complaints online that this fix results in crashes.  This is because other developers have engineered fixes to this reference leak in places where these GetIDispatchProp and/or GetIUnknownProp calls are made, rather than where the problem actually occurs.  I have found this in TEmbeddedWB.  TEmbeddedWB is a web browser hosting component that takes up where TWebBrowser leaves off, and it does fix a lot of the limitations of TWebBrowser.

But here are the places in the TEmbeddedWB source that you’ll need to “unfix” once you fix the root problem in Vcl.OleCtrls.pas:

EmbeddedWB.pas [2603]: (procedure TEmbeddedWB.SetUserAgentInt): Delete _Release call
EwbCore.pas [1367]: (procedure TCustomEmbeddedWB.SetDesignMode): Delete _Release call
EwbCore.pas [1404]: (procedure TCustomEmbeddedWB.SetDownloadOptions): Delete _Release call
EwbCore.pas [1468]: (procedure TCustomEmbeddedWB.SetUserInterfaceOptions): Delete _Release call
EwbTools.pas [1015]: (function GetBmpFromBrowser): Delete _Release call
EwbTools.pas [3164]: (function InvokeCMD): Delete _Release call

Note, you’ll also see an experimental fix buried — and disabled — in the TEmbeddedWB code (but without fixing the lines above), and without a whole lot of documentation as to why!

I have created a Quality Central report, along with a test case and example fix.  I also checked the Delphi VCL source, and about 30 other components that we use, and found no more calls to _Release relating to this issue.

Updates to kmwString functions

A couple of weeks ago I published on this blog a set of functions for handling supplementary characters in JavaScript.  We have since fixed a few bugs in the code and have now made the kmwString script available on github at https://github.com/tavultesoft/keyman-tools

This update:

  • Fixes an error which could occur when only 1 parameter was passed to kmwSubstring and that parameter was negative
  • Fixed an incorrect NaN return value for kmwCodePointToCodeUnit when passed an offset at the very end of a string
  • Adds kmwCodeUnitToCodePoint function
  • Adds kmwCharAt function — which is basically just kmwSubstr(index,1)

Again, suggestions, comments, fixes and flames all gratefully received.

Cruelly Tricked by the Library

So I borrowed a high brow British crime novel from the library. Ruth Rendell, An Unkindness of Ravens. Okay, maybe not very high brow. Here’s a picture.

It is sad that library technology has forced barcodes to be put on top of book titles or authors. 
I’m not very good at reading barcodes

Then I opened the book.

Note how classy that title is: the curlicue on the V is chopped off…

Um, what was that again? The Vampire’s Betrayal: Raven Hart? Trashy vampire romance? Twitch. Yeuch. The writing makes Twilight look sophisticated. I checked the cover again to make sure I hadn’t somehow, insanely, picked up the wrong book. But to no avail.

On page 27, the sloppy supernatural drivel miraculously transforms back into the erudite and innately British crime novel. But I am forever scarred.

I couldn’t bear to read page 27.  Who knows what I have missed so far?

I shall never again be able to open a Ruth Rendell novel without fear and trepidation.

More on the pngimage unit in Delphi XE2

After submitting the issue reported in How not to do exception classes in a Delphi library to Embarcadero (QC102796) they asked me for a test case.  Didn’t make much sense to me but I duly wrote one and am about to submit it.  However, in the process I collected a lot more information about work that should have been done on the Vcl.Imaging.pngimage unit when it was integrated into the Delphi source. Here’s the full list of the exception classes and what should be done with them.

  • EPNGError should inherit from EInvalidGraphicOperation or be eliminated as it raises inappropriate exceptions
  • EPngUnexpectedEnd should inherit from EInvalidGraphic
  • EPngInvalidCRC should inherit from EInvalidGraphic
  • EPngInvalidIHDR should inherit from EInvalidGraphic
  • EPNGMissingMultipleIDAT should inherit from EInvalidGraphic
  • EPNGZLIBError should inherit from EInvalidGraphic
  • EPNGInvalidPalette should inherit from EInvalidGraphic
  • EPNGInvalidFileHeader should inherit from EInvalidGraphic
  • EPNGInvalidFileHeader should inherit from EInvalidGraphic
  • EPNGSizeExceeds should inherit from EInvalidGraphic
  • EPNGMissingPalette is unused and should be deleted
  • EPNGUnknownCriticalChunk should inherit from EInvalidGraphic
  • EPNGUnknownCompression should inherit from EInvalidGraphic
  • EPNGUnknownInterlace should inherit from EInvalidGraphic
  • EPNGNoImageData should inherit from EInvalidGraphic
  • EPNGHeaderNotPresent should inherit from EInvalidGraphic

The following three exceptions are really reimplementations of existing exception classes and should be abolished entirely:

  • EPNGOutMemory should just really be replaced with RaiseLastOSError.  Or, if backward compatibility for this exception is really needed then inherit from EOutOfMemory
  • EPNGNotExists is not used unless USEDELPHI is not defined, which it is…
  • EPNGCouldNotLoadResource masks a number of other exceptions and should if possible be removed entirely.  The code in question re-raises an exception and buries the original error with an entirely unhelpful secondary exception (and what is that exit; doing in there?).
  try 
    ResStream := TResourceStream.Create(Instance, Name, RT_RCDATA); 
  except
    RaiseError(EPNGCouldNotLoadResource, EPNGCouldNotLoadResourceText);   
    exit; 
  end;

Finally, these three need to inherit from EInvalidGraphicOperation:

  • EPNGCannotChangeTransparent should inherit from EInvalidGraphicOperation
  • EPNGInvalidNewSize should inherit from EInvalidGraphicOperation
  • EPNGInvalidSpec should inherit from EInvalidGraphicOperation

In a number of locations EPNGError and Exception are also raised (albeit sometimes in code that is $ifdef‘d out of use), often with blank error messages!  For example:

    case BitDepth of 
      {Only supported by COLOR_PALETTE}
      1: DetectPixelFormat := pf1bit; 
      2, 4: DetectPixelFormat := pf4bit; 
      {8 may be palette or r, g, b values} 
      8, 16: 
        case ColorType of 
          COLOR_RGB, COLOR_GRAYSCALE: DetectPixelFormat := pf24bit; 
          COLOR_PALETTE: DetectPixelFormat := pf8bit; 
          else raise Exception.Create('');
        end {case ColorFormat of} 
        else raise Exception.Create('');
    end {case BitDepth of}

Finally, the unit reimplements a bunch of standard Delphi VCL functionality that is inappropriate to keep within the Delphi VCL codebase, including core classes such as TStream.  While these are not compiled in because they surrounded by $ifdef statements, all this code should really be stripped out for maintenance and code cleanliness/readability reasons.  Thus, all code surrounded by $ifndef UseDelphi should be deleted. This is not a systematic review of the issues in the Vcl.Imaging.pngimage unit; it’s more just a rant and giving a heads up about the quality of this unit.  Embarcadero, please clean this unit up!

Further analysis on the TrustedInstaller.exe memory leaks

You may have read my last 2 blog posts about the TrustedInstaller.exe leak that seems to be present in Windows 7 (Does Microsoft’s TrustedInstaller.exe have a leak? and Reproducing the TrustedInstaller.exe leak).  It was not yet clear to me whether this leak that I observed was a bug or if it is by design, so I decided that the next time I had to build a Windows VM and install updates, I would capture some more log information about the update.  I thought about using Process Monitor to capture everything, and this would certainly have captured reams of very interesting data, but this would have slowed the update install process down unacceptably on the day.

So instead, I ran the Handle program, also a part of the SysInternals suite, once a minute, and dumped its output to a text file.  The update process ran as normal, and the handles logging generated a 50mb text file, which was fairly simple to import into a database.  Now I had some data I could play with!

I picked a filename at random to look at in the logs.  I closed my eyes and pointed, and the lucky filename chosen was winlogon!  A simple search in the database came up with 4 different objects matching winlogon in the handle list:

C:\Windows\winsxs\amd64_microsoft-windows-winlogon_31bf3856ad364e35_6.1.7600.16447_none_cbe534e7ee8042ad

C:\Windows\winsxs\amd64_microsoft-windows-winlogon_31bf3856ad364e35_6.1.7600.16447_none_cbe534e7ee8042ad\winlogon.exe

C:\Windows\winsxs\Temp\PendingRenames\0859c097e7b9cc0153330000d0088004.amd64_microsoft-windows-winlogon_31bf3856ad364e35_6.1.7600.16447_none_cbe534e7ee8042ad_winlogon.exe_ac37d0c5

C:\Windows\winsxs\Temp\PendingRenames\aef6bd97e7b9cc0152330000d0088004.amd64_microsoft-windows-winlogon_31bf3856ad364e35_6.1.7600.16447_none_cbe534e7ee8042ad.manifest

The first entry in this list was a directory, in the Windows component store (aka Side-by-Side Store).  The second was a reference to winlogon.exe within this same folder.  The third and fourth entries gave a bit of a clue as to what may be going on, with the PendingRenames temporary folder hinting at a possible reason why the files were being kept open.

OK, so that was a very rough analysis.  But things got a bit more complicated when I started looking a bit deeper.  I picked the second entry, our winlogon.exe in the Windows component store, and had a look at the handles which were opened to that.  Here’s what I found.  Remember that this is based on minute-by-minute snapshots, so this is incomplete, but still paints a curious picture:

  • 41 different handles were logged for this one file.
  • 14 of these handles were still open at the end of the session, when TrustedInstaller.exe had no further CPU or disk activity.  (TrustedInstaller.exe remains in memory until the restart, it seems)

Why does TrustedInstaller.exe need to maintain 14 open handles to a single file until Windows is restarted?  And why does it need to open at least 41 different handles over time to the same file?  This seems horribly inefficient, as the same pattern is manifest for the 2000 or more other objects that were opened by the TrustedInstaller.exe process.

I could understand a requirement to maintain a single open handle for concurrency and replacement (PendingRenames) operations.  I’m sure there’s a reason…  There may be another update to this story yet!

Using out parameters in Delphi SOAP transactions

So here’s today’s little gotcha: TRemotable out parameters in Delphi SOAP interface function calls must always be initialised before the call.  If you don’t do this, at some point your client application will crash with an access violation in TValue.Make (System.Rtti.pas).

It turns out that when the TRIO component parses the Run Time Type Information (RTTI), it conflates the var and out modifiers on the parameters, and treats out parameters as var parameters (see function TRttiMethod.GetInvokeInfo).  This means that when the function is called, the TValue.Make function which collects RTTI for the passed parameters will assume that a non-nil object reference is always pointing to a valid and initialised object, even for an out parameter, and will dereference it to get additional RTTI.

The workaround is simple: just initialise all out parameters to nil before the function call.  This is good practice anyway.

I think the proper fix in the Delphi library code would be to treat out and var parameters differently in the SOAP calls, and always clear out parameters to nil within the SOAP framework.

Here’s some example code to reproduce the problem:

unit TestOutParams;

interface

uses
  Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, Vcl.Graphics,
  Vcl.Controls, Vcl.Forms, Vcl.Dialogs, Vcl.StdCtrls,
  Soap.SoapHTTPClient, Soap.InvokeRegistry;

type
  TForm1 = class(TForm)
    Button1: TButton;
    procedure Button1Click(Sender: TObject);
  end;

type
  ITestOutParam = interface(IInvokable)
    ['{343E9171-4300-4523-A926-4904EDD652E1}']
    function TestOutParam(out p5: TSOAPAttachment): Boolean; stdcall;
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

function GetITestOutParam: ITestOutParam;
var
  RIO: THTTPRIO;
begin
  Result := nil;
  RIO := THTTPRIO.Create(nil);
  try
    Result := (RIO as ITestOutParam);
  finally
    if (Result = nil) then
      RIO.Free;
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  FSOAP: ITestOutParam;
  FOutputParameter: TSOAPAttachment;
begin
  FOutputParameter := Pointer(8); // simulate uninitialized local variable pointing to protected non-nil address
  FSOAP := GetITestOutParam;
  Assert(Assigned(FSOAP));
  FSOAP.TestOutParam(FOutputParameter); // this will crash
end;

initialization
  InvRegistry.RegisterInterface(TypeInfo(ITestOutParam));
end.

Weird filename globbing in Windows

Can anyone explain this result?  (note: folder name has been obscured)

D:\...\exe\tds>dir *337*
 Volume in drive D has no label.
Volume Serial Number is 4279-DECE

Directory of D:\...\exe\tds

05/12/2011  02:39 PM        12,432,384 tds-8.0.324.0.exe
11/04/2011  02:41 PM        13,268,480 tds-8.0.337.0.exe
2 File(s)     25,700,864 bytes
0 Dir(s)  129,676,939,264 bytes free
D:\...\exe\tds>dir *337.0.exe
Volume in drive D has no label.
Volume Serial Number is 4279-DECE

Directory of D:\...\exe\tds

11/04/2011  02:41 PM        13,268,480 tds-8.0.337.0.exe
1 File(s)     13,268,480 bytes
0 Dir(s)  129,676,324,864 bytes free

I can’t figure out why tds-8.0.324.0.exe is matching *337*, in this folder.  There are lots of other files in the folder with similar names that don’t match.  chkdsk returned no errors.  Any ideas?

Update 4:45pm: I did a little more investigation.  The error can be reproduced on other computers, so it is not related to the state of the filesystem, or the path name.  The following command narrowed down the match a little:

c:\temp\tds>dir *d337*
Volume in drive C is OS
Volume Serial Number is 8036-7CEB

Directory of c:\temp\tds

12/05/2011  02:39 PM        12,432,384 tds-8.0.324.0.exe
1 File(s)     12,432,384 bytes
0 Dir(s)  12,700,200,960 bytes free

But the following did not match:

c:\temp\tds>dir td337*
Volume in drive C is OS
Volume Serial Number is 8036-7CEB

Directory of c:\temp\tds

File Not Found

Curiouser and curiouser.  Procmon did not provide much insight into the issue:

procmon

Where to now?  It’s Sunday afternoon, and I don’t feel like breaking out a kernel debugger to trace that any further right now.

Update 7:30pm: Well, I was puzzled so I did a little more research.  And ran across a mention of 8.3 filenames.  All of a sudden everything clicked into place.

D:\...\exe\tds>dir /x *337*
Volume in drive D has no label.
Volume Serial Number is 4279-DECE

Directory of D:\...\exe\tds

05/12/2011  02:39 PM        12,432,384 TDD337~1.EXE tds-8.0.324.0.exe
11/04/2011  02:41 PM        13,268,480 TDD938~1.EXE tds-8.0.337.0.exe
2 File(s)     25,700,864 bytes
0 Dir(s)  129,670,885,376 bytes free

Yes, even today DOS comes back to bite us.  So just beware when doing wildcard matches — especially with that old favourite del.

Delphi’s ongoing problem with "with"

Delphi has long included a scoping construct called with which can be used to increase the readability and efficiency of code.  From Delphi’s documentation:

A with statement is a shorthand for referencing the fields of a record or the fields, properties, and methods of an object. The syntax of a with statement is:
  with obj do statement
or:
  with obj1, …, objn do statement
where obj is an expression yielding a reference to a record, object instance, class instance, interface or class type (metaclass) instance, and statement is any simple or structured statement. Within the statement, you can refer to fields, properties, and methods of obj using their identifiers alone, that is, without qualifiers.

However, with has a really big gotcha: scope ambiguity.  Scope ambiguity came back to bite us yet again today: I am currently upgrading a major project (over 1 million SLoC) from an older version of Delphi to Delphi XE2. One component of this project is a legacy third party component package which is no longer supported.  We have full source and the cost of replacing it would be too high, so we are patching the source where needed.

While tracing a reported window sizing bug, I found the following line of code (on line 12880, yes it’s a big source file):

with vprGetTranspBorderSize(BorderStyle) do
  R := Rect(Left,Top,Width-Right,Height-Bottom);

vprGetTranspBorderSize returns a TRect.  In the earlier version of Delphi, the Width and Height properties were not members of TRect, so were found in the parent scope, being the component itself in this case.  All good.

But now in Delphi XE2, records can now have functions and properties just like a class.  So TRect has a bunch of additional properties, including Width and Height.  Handy to have, until you throw in some code that was written before these new properties existed.  One fix here is simply to qualify the scope:

with vprGetTranspBorderSize(BorderStyle) do
  R := Rect(Left,Top,Self.Width-Right,Self.Height-Bottom);

Or, you could declare a second variable, and eliminate the with statement entirely:

R2 := vprGetTranspBorderSize(BorderStyle);
R := Rect(R2.Left,R2.Top,Width-R2.Right,Height-R2.Bottom);

The problem with the second approach is it makes the code less readable and the introduction of the second variable R2 widens its scope to the whole function, rather than just the block where its needed.  This tends to lead to reuse of variables, which is a frequent source of bugs.

Many developers (including myself) have argued that adding support for aliases to with would avoid these scope problems.  Some others argue that one simply shouldn’t use with, and instead declare variables.  But with does make code much cleaner and easier to read, especially when used with try/finally blocks.

with R2 := vprGetTranspBorderSize(BorderStyle) do
  R := Rect(R2.Left,R2.Top,Width-R2.Right,Height-R2.Bottom);

Given that this issue was reported to Borland as far back as 2002, and since that time many, many complex language constructs have been added to Delphi, I think it’s a real crying shame that Borland/Inprise/CodeGear/Embarcadero have chosen to ignore this problem for so long.

IXMLDocument.SaveToStream does not always use UTF-16 encoding

Delphi’s documentation on IXMLDocument.SaveToStream has the following important caveat:

Regardless of the encoding system of the original XML document, SaveToStream always saves the stream in UTF-16.

It’s helpful to have notes like this.  Mind you, forcing UTF-16 output is definitely horrible; what if we need our document in UTF-8 or (God forbid) some non-Unicode encoding?

Now Kris and I were looking at a Unicode corruption issue with an XML document in a Delphi application and struggling to understand what was going wrong given this statement in the documentation.  Our results didn’t add up, so we wrote a little test app to test that statement:

procedure TForm1.OutputEncodingIsUTF8;
const
  UTF8XMLDoc: string =
    '<!--?xml version="1.0" encoding="utf-8"?-->'#13#10+
    '';
var
  XMLDocument: IXMLDocument;
  InStream: TStringStream;
  OutStream: TFileStream;
begin
  // stream format is UTF8, input string is converted to UTF8
  // and saved to the stream
  InStream := TStringStream.Create(UTF8XMLDoc, TEncoding.UTF8);</blockquote>
  // we'll write to this output file
  OutStream := TFileStream.Create('file_should_be_utf16_but_is_utf8.xml',
    fmCreate);
  try
    XMLDocument := TXMLDocument.Create(nil);
    XMLDocument.LoadFromStream(InStream);
    XMLDocument.SaveToStream(OutStream);
    // IXMLDocument.SaveToStream docs state will always be UTF-16
  finally
    FreeAndNil(InStream);
    FreeAndNil(OutStream);
  end;

  with TStringList.Create do
  try    // we want to load it as a UTF16 doc given the documentation
    LoadFromFile('file_should_be_utf16_but_is_utf8.xml', TEncoding.Unicode);
    ShowMessage('This should be displayed as an XML document '+
                'but instead is corrupted: '+#13#10+Text);
  finally
    Free;
  end;
end;

When I run this, I’m expecting the following dialog:

But instead I get the following dialog:

Note, this is run on Delphi 2010.  Haven’t run this test on Delphi XE2, but the documentation hasn’t changed.

The moral of the story is, the output encoding is the same as the input encoding, unless you change the output encoding with the Encoding property, for example, adding the highlighted line below fixes the code sample:

    XMLDocument := TXMLDocument.Create(nil);
    XMLDocument.LoadFromStream(InStream);
    XMLDocument.Encoding := 'UTF-16';
    XMLDocument.SaveToStream(OutStream);

The same documentation issue exists for TXMLDocument.SaveToStream.  I’ve reported the issue in QualityCentral.

The case of the terribly slow PayPal emails

Every now and then I receive a payment via PayPal.  That’s not unusual, right?  PayPal would send me an email notifying me of the payment, and I’d open up Outlook to take a look at it.  All well and good.  In the last week, however, something changed.  When I clicked on any PayPal email, Outlook would take the best part of a minute to open the email, and what’s more, would freeze its user interface entirely while doing this.  But this was only happening with emails from PayPal — everything else was fine.

Not good.  At first I suspected an addin was mucking things up, so I disabled all the Outlook addins and restarted Outlook for good measure.  No difference.  Now I was getting worried — what if this was some unintended side-effect of some malware that had somehow got onto my machine, and it was targeting PayPal emails?

So I decided to do some research.  I fired up SysInternals’ Process Monitor, set it up to show only Outlook in its filtering, and turned on the Process Monitor trace.

Process Monitor filter window – filtering for OUTLOOK.EXE

Then I went and clicked on a PayPal email.  Waited the requisite time for the email to display, then went back to Process Monitor and turned off the trace.  I added the Duration column to make it easier to spot an anomalous entry.  This doesn’t always help but given the long delay, I was expecting some file or network activity to be taking a long time to run.

Adding the Duration column

Then scrolling up the log I quickly spotted the following entry.  It had a duration of nearly 3 seconds which stood out like a sore thumb.

The first offending entry

This entry was a Windows Networking connection to connect to a share on the remote host \\102.112.2o7.net.  This came back, nearly 3 seconds later, with ACCESS DENIED.  Then there were a bunch of follow up entries that related to this, all in all taking over 30 seconds to complete.  A quick web search revealed that this domain with its dubious looking name 102.112.2o7.net belongs to a well known web statistics company called Omniture.  That took some of the load off my mind, but now I was wondering how on earth opening a PayPal email could result in Internet access when I didn’t have automatic downloads of pictures switched on.

One of the emails that caused the problem, redacted of course 🙂

I opened the source of the PayPal HTML email and searched for “102.112“.  And there it was.

The HTML email in notepad

That’s a classic web bug.  Retrieving that image of a 1×1 pixel size, no doubt transparent, with some details encoded in the URL to record my visit to the web page (or in this case, opening of the email):

What was curious about this web bug was the use of the “//” shorthand to imply the same protocol (e.g. HTTP or HTTPS) as the base page.  That’s all well and good in a web browser, but in Outlook, the email is not being retrieved over HTTP.  So Outlook interprets this as a Windows Networking address, and attempts a Windows Networking connection to the host instead, \\102.112.2o7.net\b….

At this point I realised this could be viewed as a security flaw in Outlook.  So I wrote to Microsoft instead of publishing this post immediately.  Microsoft responded that they did not view this as a vulnerability (as it does not result in system compromise), but that they’d pass it on to the Outlook team as a bug.

Nevertheless, this definitely has the potential of being exploited for tracking purposes.  One important reason that external images are not loaded by default is to prevent third parties from being notified that you are reading a particular email.  While this little issue does not actually cause the image to load (it is still classified as an external image), it does cause a network connection to the third party server which could easily be logged for tracking purposes.  This network connection should not be happening.

So what was my fix?  Well, I don’t really care about Omniture or whether PayPal get their statistics, so I added an entry in my hosts file to block this domain.  Using an invalid IP address made it fail faster than the traditional use of 127.0.0.1:

0.0.0.0    102.112.2o7.net

And now my PayPal emails open quickly again.