Category Archives: Development

Why does switching languages hang my app?

This is a very short follow-up to yesterday’s blog about WM_INPUTLANGCHANGEREQUEST, WaitForSingleObject and applications hanging.

While I thought, yesterday, that this bug was no longer such a problem on Windows Vista and Windows 7, it turns out that if you assign a hotkey to a language (e.g. Alt+Shift+2), the KLF_SETFORPROCESS flag will be set when that hotkey is pressed.  This means that this issue is still a problem today.

This is not just a problem for Delphi.  I’ve found references to Outlook hanging, .NET 4.5 WPF applications hanging, and VC++ applications hanging.

Moral of the story (again): thread-safe code is hard.  Don’t believe that your code suddenly becomes thread-safe because you synchronize it to the main thread (no matter what the Delphi documentation tells you!)

WaitForSingleObject. Why you should never use it.

Updated 30/1/2013: Fixed a typo and formatting issue or two.  Added a sentence for clarity.  See also the short follow-up post.

I’ve really struggled with how to frame this post.  It could be about the dangers of WaitForSingleObject and WaitForMultipleObjects.  Or about how Delphi’s TThread.Synchronize seems so handy, and yet because it must use WaitForSingleObject, is so fraught with complications.  Or yet, about how pressing Alt+Left Shift to switch languages could hang an application.  In the end, it’s about all three of these things.

Some people, when confronted with a problem, think, “I know, I’ll use threads,” and then two they hav erpoblesms.

Let’s start with TThread.Synchronize.  It seems a handy little function: it will, in a thread-safe manner, call a procedure in the context of the main VCL thread (the thread that owns the Delphi VCL GUI windows), and wait, using the Windows API WaitForSingleObject(INFINITE), for the procedure to return.  Simpler than fiddling with synchronisation primitives, right?  Everyone knows threading is hard, so use the well-tested thread utilities where you can?

Except that WaitForSingleObject and its big brother WaitForMultipleObjects are dangerous.  The basic problem is that these calls can cause deadlocks, if you ever call them from a thread that has its own message loop and windows.  That’s okay, you say, I don’t have any UI except in my main thread.  But any thread that uses COM can have hidden COM helper windows when doing RPC (and see below for more on this).  And other libraries can create their own windows as well, such as the ADO libraries.

So what causes the deadlock? Well, I’ll illustrate with the scenario we ran into.  Some old code (that I wrote, okay, okay) created a thread (we’ll call it BackgroundThread).  BackgroundThread used TThread.Synchronize to periodically update the UI status about a background database process it was running.  It doesn’t really matter what it was doing, but the use of Microsoft’s ADO database library meant that this thread was creating a hidden window, with the class ADODB.AsyncEventMessenger.  Behind the scenes, a second window was automatically created by Windows once we had the first window in the thread, and this one had the class name IME.

Every now and then, our BackgroundThread would call Synchronize(RefreshStatus).  This would signal an event which the main thread would check periodically from its message loop.  Eventually it would call the RefreshStatus procedure.  BackgroundThread would in the meantime have called WaitForSingleObject(INFINITE) to wait for an event to be signaled by the main thread indicating that the RefreshStatus procedure had finished.

Where are we going?  Well, if the main thread receives a message that it then decides to send on to other windows in the process, while BackgroundThread is getting ready to synchronize, we can end up in a deadlock.  And, it turns out that in Windows XP, this can happen when WM_INPUTLANGCHANGEREQUEST (0x50) is received, e.g. when the user presses Alt+Left Shift.  (Note, for this scenario to play, the Input Method Manager must be enabled — install Far Eastern language support in Windows XP).  Remember, this is just one possible scenario which can trigger a deadlock.

Let’s pull this apart.  I’ve loaded the stalled process into WinDbg, and am now looking at two call stacks which have deadlocks, in user mode.  First the main VCL GUI thread:

   0  Id: bf8.cb8 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr
0012fadc 7e4194be ntdll!KiFastSystemCallRet
0012fb30 7e43652f USER32!NtUserMessageCall+0xc
0012fb50 7e418734 USER32!EditWndProcW+0x5d
0012fb7c 7e418816 USER32!InternalCallWinProc+0x28
0012fbe4 7e42a013 USER32!UserCallWinProcCheckWow+0x150
0012fc14 7e42a039 USER32!CallWindowProcAorW+0x98
0012fc34 004c0e7d USER32!CallWindowProcW+0x1b
0012fda4 004c0d80 audit4_home!Vcl.Controls.TWinControl.DefaultHandler+0xdd
0012fdf0 004c03d3 audit4_home!Vcl.Controls.TWinControl.WndProc+0x5b8
0012fe20 00467b3e audit4_home!Vcl.Controls.TWinControl.MainWndProc+0x2f
0012fe38 7e418734 audit4_home!System.ClassesStdWndProc+0x16
0012fe64 7e418816 USER32!InternalCallWinProc+0x28
0012fecc 7e4189cd USER32!UserCallWinProcCheckWow+0x150
0012ff2c 7e418a10 USER32!DispatchMessageWorker+0x306
0012ff3c 005a6980 USER32!DispatchMessageW+0xf
0012ff58 005a69c3 audit4_home!Vcl.Forms.TApplication.ProcessMessage+0xf8
0012ff7c 00d26d60 audit4_home!Vcl.Forms.TApplication.HandleMessage+0xf
0012ff9c 016e6903 audit4_home!S4s.Ui.Session.Appsession_main.TAppSession_Main.Run+0xcc
0012ffc0 7c817077 audit4_home!Audit4_home.initialization+0xc3
0012fff0 00000000 kernel32!BaseProcessStart+0x23

And then our BackgroundThread:

   9  Id: bf8.f30 Suspend: 1 Teb: 7ffd5000 Unfrozen
ChildEBP RetAddr
0494fdb8 7c90df5a ntdll!KiFastSystemCallRet
0494fdbc 7c8025db ntdll!NtWaitForSingleObject+0xc
0494fe20 7c802542 kernel32!WaitForSingleObjectEx+0xa8
0494fe34 0042d4b3 kernel32!WaitForSingleObject+0x12
0494fe80 00408a19 audit4_home!System.Sysutils.WaitForSyncWaitObj+0x7
0494fed0 004654f2 audit4_home!System.TMonitor.Wait+0x25
0494fedc 00c9f3a2 audit4_home!System.Classes.TThread.Synchronize+0x2e
0494fef8 00c9ef29 audit4_home!S4s.Br.Backgroundclasses.Backgroundclass.TBackgroundClass.SetStatus+0x8a
0494ff70 00464b11 audit4_home!S4s.Br.Backgroundclasses.Backgroundclass.TBackgroundClass.Execute+0x18d
0494ffa0 00409752 audit4_home!System.Classes.ThreadProc+0x45
0494ffb4 7c80b729 audit4_home!SystemThreadWrapper+0x2a
0494ffec 00000000 kernel32!BaseThreadStart+0x37

We can see that the main thread has sent a message somewhere.  It turns out it has sent a message to a window in the same thread (The window handle 0006040c is just the Edit window):

0012fb30 7e43652f 0006040c 00000050 00000001 USER32!NtUserMessageCall+0xc
0012fb50 7e418734 0006040c 00000050 00000001 USER32!EditWndProcW+0x5d

So why is it stalling?  It’s hard to determine here, because everything bad is happening in kernel mode, behind that KiFastSystemCallRet call. That must mean it’s time to step into kernel mode!  In WinDbg, press Ctrl+K, select Local.  I’m learning here, so this is as much for my own documentation as to explain to anyone else (in other words, if it doesn’t make sense to you, it won’t make sense to me, either, in 6 weeks time).  First, we find the process details:

lkd> !process 0 0 audit4_home.exe
PROCESS 894c3710  SessionId: 0  Cid: 0bf8    Peb: 7ffdf000  ParentCid: 0e0c
DirBase: 0a5c0700  ObjectTable: 00000000  HandleCount:   0.
Image: audit4_home.exe

PROCESS 893277e0  SessionId: 0  Cid: 0d94    Peb: 7ffd8000  ParentCid: 0e0c
DirBase: 0a5c0720  ObjectTable: 00000000  HandleCount:   0.
Image: audit4_home.exe

PROCESS 8950d9a8  SessionId: 0  Cid: 0f10    Peb: 7ffdb000  ParentCid: 0e0c
DirBase: 0a5c0740  ObjectTable: 00000000  HandleCount:   0.
Image: audit4_home.exe

PROCESS 89566758  SessionId: 0  Cid: 0520    Peb: 7ffde000  ParentCid: 0e0c
DirBase: 0a5c0840  ObjectTable: e127a330  HandleCount: 467.
Image: audit4_home.exe

We’ll look at the last process listed, being the one with the problem (the other three are defunct):

lkd> !process 89566758 2
PROCESS 89566758  SessionId: 0  Cid: 0520    Peb: 7ffde000  ParentCid: 0e0c
DirBase: 0a5c0840  ObjectTable: e127a330  HandleCount: 467.
Image: audit4_home.exe

THREAD 8953a920  Cid 0520.009c  Teb: 7ffdd000 Win32Thread: e1210eb0 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
8953aabc  Semaphore Limit 0x2

THREAD 8963bda0  Cid 0520.0a98  Teb: 7ffd9000 Win32Thread: 00000000 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
8963bf3c  Semaphore Limit 0x2

THREAD 8953b9a0  Cid 0520.03c8  Teb: 7ffd8000 Win32Thread: 00000000 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
8953bb3c  Semaphore Limit 0x2

THREAD 895a2c08  Cid 0520.021c  Teb: 7ffd7000 Win32Thread: 00000000 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
895a2da4  Semaphore Limit 0x2

THREAD 898f8cc8  Cid 0520.0bd8  Teb: 7ffd4000 Win32Thread: e2d75870 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
898f8e64  Semaphore Limit 0x2

THREAD 893287f8  Cid 0520.0eec  Teb: 7ff4f000 Win32Thread: 00000000 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
89328994  Semaphore Limit 0x2

THREAD 89308c30  Cid 0520.00f0  Teb: 7ff4e000 Win32Thread: 00000000 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
89308dcc  Semaphore Limit 0x2

THREAD 89ab4020  Cid 0520.01b0  Teb: 7ff4d000 Win32Thread: e2ed4260 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
89ab41bc  Semaphore Limit 0x2

THREAD 893269b0  Cid 0520.0830  Teb: 7ff4c000 Win32Thread: e23bf2c0 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
89326b4c  Semaphore Limit 0x2

THREAD 89895778  Cid 0520.0ce0  Teb: 7ff4b000 Win32Thread: e2fa06e8 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
89895914  Semaphore Limit 0x2

THREAD 8933ba80  Cid 0520.0e08  Teb: 7ff4a000 Win32Thread: 00000000 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
8933bc1c  Semaphore Limit 0x2

THREAD 89322728  Cid 0520.0f28  Teb: 7ff49000 Win32Thread: 00000000 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
893228c4  Semaphore Limit 0x2

THREAD 89449960  Cid 0520.0ab0  Teb: 7ff48000 Win32Thread: e12d5590 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
89449afc  Semaphore Limit 0x2

THREAD 8952d020  Cid 0520.09dc  Teb: 7ffdc000 Win32Thread: 00000000 WAIT: (Executive) KernelMode Non-Alertable
SuspendCount 1
b8c197d4  SynchronizationEvent

And we can see the first thread has the address 8953a920.  So let’s look at that (flag 16 means show the full stack in the process context, with parameters).

lkd> !thread 8953a920 16
THREAD 8953a920  Cid 0520.009c  Teb: 7ffdd000 Win32Thread: e1210eb0 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
8953aabc  Semaphore Limit 0x2
Not impersonating
DeviceMap                 e1050c08
Owning Process            0       Image:
Attached Process          89566758       Image:         audit4_home.exe
Wait Start TickCount      213083         Ticks: 67800 (0:00:17:39.375)
Context Switch Count      20200  NoStackSwap    LargeStack
UserTime                  00:00:00.781
KernelTime                00:00:01.484
Win32 Start Address audit4_home!Audit4_home.initialization (0x016e6840)
Start Address kernel32!BaseProcessStartThunk (0x7c810705)
Stack Init b7c1f000 Current b7c1e840 Base b7c1f000 Limit b7c19000 Call 0
Priority 10 BasePriority 8 PriorityDecrement 0 DecrementCount 16
ChildEBP RetAddr  Args to Child
b7c1e858 80503864 8953a990 8953a920 804fb094 nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
b7c1e864 804fb094 8953aa8c 8953a920 8953a954 nt!KiSwapThread+0x8a (FPO: [0,0,0])
b7c1e88c 80502fa0 00000000 00000005 00000000 nt!KeWaitForSingleObject+0x1c2 (FPO: [5,5,4])
b7c1e8a4 804ff8e0 00000000 00000000 00000000 nt!KiSuspendThread+0x18 (FPO: [3,0,0])
b7c1e8ec 80503882 00000000 00000000 00000000 nt!KiDeliverApc+0x124 (FPO: [3,10,0])
b7c1e904 804fb094 00000240 e1210eb0 00000000 nt!KiSwapThread+0xa8 (FPO: [0,0,0])
b7c1e92c bf802f15 00000000 0000000d 00000001 nt!KeWaitForSingleObject+0x1c2 (FPO: [5,5,4])
b7c1e968 bf835eb7 00000200 00000000 00000000 win32k!xxxSleepThread+0x192 (FPO: [3,5,4])
b7c1ea04 bf8141d2 bbe83720 00000287 00000019 win32k!xxxInterSendMsgEx+0x7f6 (FPO: [Non-Fpo])
b7c1ea50 bf80ecd9 bbe83720 00000287 00000019 win32k!xxxSendMessageTimeout+0x11f (FPO: [7,7,0])
b7c1ea74 bf92b42e bbe83720 00000287 00000019 win32k!xxxSendMessage+0x1b (FPO: [4,0,0])
b7c1eaa4 bf92c675 e2d75870 e13f10e0 e13f10e0 win32k!xxxImmActivateLayout+0x5b (FPO: [2,3,4])
b7c1ec08 bf8696d9 00000004 00000000 e13f10e0 win32k!xxxImmActivateThreadsLayout+0x10c (FPO: [3,82,4])
b7c1ec48 bf86862f e13f10e0 00000100 bbe88228 win32k!xxxInternalActivateKeyboardLayout+0xb7 (FPO: [3,8,4])
b7c1ec70 bf80b5a7 8989a2d0 04090c09 00000100 win32k!xxxActivateKeyboardLayout+0x4c (FPO: [4,3,0])
b7c1ecd4 bf80ec9f bbe88228 00000050 00000001 win32k!xxxRealDefWindowProc+0x56d (FPO: [4,16,0])
b7c1ecec bf81c176 bbe88228 00000050 00000001 win32k!xxxWrapRealDefWindowProc+0x16 (FPO: [5,0,0])
b7c1ed08 bf80eee6 bbe88228 00000050 00000001 win32k!NtUserfnNCDESTROY+0x27 (FPO: [7,0,0])
b7c1ed40 8054168c 0006040c 00000050 00000001 win32k!NtUserMessageCall+0xae (FPO: [7,3,0])
b7c1ed40 7c90e514 0006040c 00000050 00000001 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b7c1ed64)
0012fadc 7e4194be 7e428e0d 0006040c 00000050 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
0012fb30 7e43652f 0006040c 00000050 00000001 USER32!NtUserMessageCall+0xc
0012fb50 7e418734 0006040c 00000050 00000001 USER32!EditWndProcW+0x5d (FPO: [4,0,4])
0012fb7c 7e418816 7e4364cf 0006040c 00000050 USER32!InternalCallWinProc+0x28
0012fbe4 7e42a013 00000000 7e4364cf 0006040c USER32!UserCallWinProcCheckWow+0x150 (FPO: [Non-Fpo])
0012fc14 7e42a039 7e4364cf 0006040c 00000050 USER32!CallWindowProcAorW+0x98 (FPO: [6,0,0])
0012fc34 004c0e7d 7e4364cf 0006040c 00000050 USER32!CallWindowProcW+0x1b (FPO: [5,0,0])
0012fda4 004c0d80 02940d99 02bb9430 fffffffe audit4_home!Vcl.Controls.TWinControl.DefaultHandler+0xdd
0012fdf0 004c03d3 0012fe04 004c03eb 0012fe20 audit4_home!Vcl.Controls.TWinControl.WndProc+0x5b8
0012fe20 00467b3e 00000050 00000001 04090c09 audit4_home!Vcl.Controls.TWinControl.MainWndProc+0x2f
0012fe38 7e418734 0006040c 00000050 00000001 audit4_home!System.ClassesStdWndProc+0x16
0012fe64 7e418816 02940d99 0006040c 00000050 USER32!InternalCallWinProc+0x28
0012fecc 7e4189cd 00000000 02940d99 0006040c USER32!UserCallWinProcCheckWow+0x150 (FPO: [Non-Fpo])
0012ff2c 7e418a10 0012ff60 00000000 0006040c USER32!DispatchMessageWorker+0x306 (FPO: [Non-Fpo])
0012ff3c 005a6980 0012ff60 00120100 0012ff9c USER32!DispatchMessageW+0xf (FPO: [1,0,0])
0012ff4c 7c910222 0000000f 028458e0 005a69c3 audit4_home!Vcl.Forms.TApplication.ProcessMessage+0xf8
0012ff9c 016e6903 0012ffb0 016e6916 0012ffc0 ntdll!RtlpAllocateFromHeapLookaside+0x42 (FPO: [Non-Fpo])
0012ffc0 7c817077 7c910222 0000000f 7ffde000 audit4_home!Audit4_home.initialization+0xc3
0012fff0 00000000 016e6840 00000000 00000018 kernel32!BaseProcessStart+0x23 (FPO: [Non-Fpo])

Highlighted above there is a call to win32k!xxxSendMessage.  This is passed the address of a window at bbe83720.  And when we look at that, we see the following:

lkd> dd bbe83720
bbe83720  000d01fe 00000006 e2d75870 8998ae50

And window 000d01fe is the IME window for the other, deadlocked thread!  Why is the kernel sending a message to another thread here and now?  And without a timeout?  That’s our deadlock.  We know how it happens, but I at least still don’t know why!

Still, that’s a step forward.  Next was to figure out why we were getting that.  After a fair bit of exploration, I looked at little harder at the call stack, and decided to investigate the parameters for xxxActivateKeyboardLayout:

8989a2d0 04090c09 00000100 win32k!xxxActivateKeyboardLayout+0x4c 

That third parameter in xxxActivateKeyboardLayout corresponds to the Flags parameter for ActivateKeyboardLayout.  Perusing the documentation for the Windows ActivateKeyboardLayout function, we can see that 0x100 is KLF_SETFORPROCESS.  Bingo!  That sounds pretty suspicious!  I worked through the assembly code for xxxActivateKeyboardLayout and xxxRealDefWindowProc, and sure enough, that was it: when xxxRealDefWindowProc processes WM_INPUTLANGCHANGEREQUEST, it sets the Flags parameter to 0x0100:

win32k!xxxRealDefWindowProc+0x44d:
8211f1dd 56              push    esi
8211f1de 6800010000      push    100h
8211f1e3 ff7514          push    dword ptr [ebp+14h]
8211f1e6 6a00            push    0
8211f1e8 e8dfe40100      call    win32k!_GetProcessWindowStation (8213d6cc)
8211f1ed 50              push    eax
8211f1ee e838bdf9ff      call    win32k!xxxActivateKeyboardLayout (820baf2b)

In Windows Vista and later versions, this problem does not arise as frequently, because the Text Services Framework takes over the Alt+Left Shift command (before DefWindowProc gets a look-in) and calls ActivateKeyboardLayout without the KLF_SETFORPROCESS flag set.  However, the issue can still arise if you ever use that flag in your own code when calling ActivateKeyboardLayout.

To summarize, it’s helpful to remember that technically, we were using the functions as designed.  The primary issue is combination of design flaws, first, with how that KLF_SETFORPROCESS flag is handled: either the kernel code should be using a timeout when it sends the message to the other thread, or it should be queuing an event for the other thread to handle when it gets around to it; and second, with TThread.Synchronize, which unfortunately by design cannot be robust against deadlocks.

In any case, to make your own code more robust:

  1. Determine, when using synchronisation calls, if there is any chance that windows could be created by your thread, and if so, use MsgWaitForMultipleObjects and a message loop instead of WaitForSingleObject or WaitForMultipleObjects.  Check the libraries you are using, and if you are using COM, it’s safest to assume that windows will be created.
  2. Don’t use the TThread.Synchronize procedure.  Recent versions of Delphi include TThread.Queue, which is asynchronous, and so avoids this deadlock.
  3. Think carefully about whether a thread is the right solution to the problem.
  4. Don’t use the KLF_SETFORPROCESS flag!

Locating Delphi exceptions in a live session or dump using WinDbg

The offsets used in this blog are correct for Delphi XE2, and this information is only valid for x86.  You will have to plug in other values for other versions of Delphi.  You can find more details in my earlier Delphi WinDbg blog articles:

The following WinDbg command will return a list of all Delphi exception records located within the stacks of each thread in the process.  Delphi uses the exception code 0EEDFADE:

~*e s -d poi(@$teb+8) poi(@$teb+4) 0EEDFADE

If you just wanted to do the current thread, you would run:

s -d poi(@$teb+8) poi(@$teb+4) 0EEDFADE

What is teb?   It’s the Thread Environment Block.  The data at teb+8 and teb+4 are the current bottom of the stack and the top of the stack, respectively.

For example, when looking at a crash dump we received, we were able to spot exceptions in two different threads:

0:000&> ~*e s -d poi(@$teb+8) poi(@$teb+4) 0EEDFADE
0012e9ec  0eedfade 00000000 00000001 00000000  ................
0012ef6c  0eedfade 00000003 00000000 7586d36f  ............o..u
0012f3d4  0eedfade 00000001 00000000 7586d36f  ............o..u
0012f42c  0eedfade 00000001 00000007 0012f43c  ............<...
0012f6ec  0eedfade 00000003 00000000 7586d36f  ............o..u
0012fb70  0eedfade 00000001 00000000 7586d36f  ............o..u
0012fbc8  0eedfade 00000001 00000007 0012fbd8  ................
04a6f06c  0eedfade 00000003 00000000 7586d36f  ............o..u
04a6f4e8  0eedfade 00000001 00000000 7586d36f  ............o..u
04a6f540  0eedfade 00000001 00000007 04a6f550  ............P...
04a6f72c  0eedfade 00000003 00000000 7586d36f  ............o..u
04a6fb94  0eedfade 00000001 00000000 7586d36f  ............o..u
04a6fbec  0eedfade 00000001 00000007 04a6fbfc  ................

This has returned exception records in two different thread stacks (0012* and 04a6*).  We can see a number of potential exception records; some of these are not really records (because the 0EEDFADE value is not only used in the EXCEPTION_RECORD structure; it is also passed as a parameter to the RaiseException function among others).  However, if the 3rd DWORD shown is 0, then this is probably a real exception record, and not part of a function call.  Why this?  Because EXCEPTION_RECORD's third member is a pointed to a nested exception record, which it seems is always set to NULL in Delphi.

To examine the exception record run the following command:

0:000>.exr 0012ef6c
ExceptionAddress: 7586d36f (KERNELBASE!RaiseException+0x00000058)
   ExceptionCode: 0eedfade
  ExceptionFlags: 00000003
NumberParameters: 7
   Parameter[0]: 005f2d4c
   Parameter[1]: 08c92148
   Parameter[2]: 800a0e7f
   Parameter[3]: 005f2d4c
   Parameter[4]: 0299ef64
   Parameter[5]: 0012f48c
   Parameter[6]: 0012f458

To confirm that this is a real Delphi exception check two things:

  1. The ExceptionAddress should point to an address within the RaiseException function (that actual address may vary between versions of Windows).
  2. It should have 7 parameters:
0: code address where the exception was raised
1: address of the Exception object
2-6: additional data relating to the exception type and stored registers

Let's examine the Exception object:

0:000> dd 08c92148
08c92148  004c2134 08cbb20c 0012ee54 800a0e7f
08c92158  08cf3f94 080cb690 0000002a 0054e1a4
08c92168  08c99fec 0000005e 08134330 00000000
08c92178  00000000 00000024 00000032 00000100
08c92188  0000001a 00000003 0000000b 4473624f
08c92198  54657461 00656d69 0000002a 0054e1a4
08c921a8  08c99fec 0000005d 08134314 00000000
08c921b8  00000000 00000023 00000032 00000100

From here, we can use the same spelunking techniques as in my previous WinDbg articles:

0:000> da poi(poi(8c92148)-38)+1
004c214f  "EOleException.!L"
0:000> du poi(8c92148+4)
08cbb20c  "Operation cannot be performed wh"
08cbb22c  "ile executing asynchronously"

You can also skip examining the exception record if you want, with shortcuts such as:

da poi(poi(poi(0012ef6c+18))-38)+1; du poi(poi(0012ef6c+18)+4)

How about working with nested exceptions?  Take the following scratch program:

unit NestedExceptions;

interface

uses
  Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, Vcl.Graphics,
  Vcl.Controls, Vcl.Forms, Vcl.Dialogs, Vcl.StdCtrls;

type
  TForm1 = class(TForm)
    Button1: TButton;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

type
  EWhatAMess = class(Exception);

  EAnotherError = class(Exception);

  ESomeError = class(Exception)
  private
    FExtraData: string;
  public
    constructor Create(const Message, ExtraData: string);
  end;

procedure HandleThisOneToo;
begin
  raise EWhatAMess.Create('What a mess');
end;

procedure HandleIt;
begin
  try
    raise EAnotherError.Create('Another Error Message');
  except
    on E:EAnotherError do
      HandleThisOneToo;
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  try
    raise ESomeError.Create('Some Error Message', 'Here''s some extra data');
  except
    on E:ESomeError do
      HandleIt;
  end;
end;

{ ESomeError }

constructor ESomeError.Create(const Message, ExtraData: string);
begin
  FExtraData := ExtraData;
  inherited Create(Message);
end;

end.

Build this program, then load it up in WinDbg.  You'll need to enable the event filter for 0EEDFADE as per my previous blog.  Click the bad, bad button and watch as the exceptions are thrown.  For the first two exceptions, just g.  On the third exception, we'll spelunk with the search technique.

0:000> ~*e s -d poi(@$teb+8) poi(@$teb+4) 0EEDFADE
0018e814  0eedfade 00000001 00000000 754ab9bc  ..............Ju
0018e86c  0eedfade 00000001 00000007 0018e87c  ............|...
0018e9ac  0eedfade 00000003 00000000 754ab9bc  ..............Ju
0018ee60  0eedfade 00000001 00000000 754ab9bc  ..............Ju
0018eeb8  0eedfade 00000001 00000007 0018eec8  ................
0018f014  0eedfade 00000003 00000000 754ab9bc  ..............Ju
0018f4c8  0eedfade 00000001 00000000 754ab9bc  ..............Ju
0018f520  0eedfade 00000001 00000007 0018f530  ............0...

The first instance of an exception in the stack will have the Exception Flag 00000001.  This is the one we are interested in, in each case.  Let's look at them:

0:000> da poi(poi(poi(0018e814+18))-38)+1; du poi(poi(0018e814+18)+4)
0051138f  "EWhatAMess"
02548654  "What a mess"
0:000> da poi(poi(poi(0018ee60+18))-38)+1; du poi(poi(0018ee60+18)+4)
00511437  "EAnotherErrorH.Q"
0256c36c  "Another Error Message"
0:000> da poi(poi(poi(0018f4c8+18))-38)+1; du poi(poi(0018f4c8+18)+4)
00511523  "ESomeErrorJ"
02581d3c  "Some Error Message"

You may find that some exception records are no longer valid as they can be overwritten over time.  This happens for nested exceptions, unfortunately, if you don't actually break on the exception in WinDbg before it is handled in the application (which will typically be the case if you attach a debugger to a process with an exception dialog visible).  In this situation, only the final exception record will point to a live Exception object. You may notice that innermost exception had a little bit of extra data.  How do we pull that out?  Let's look at the ESomeError object in memory:

0:000> dd poi(18f4c8+18)
02548698  005114d4 02581d3c 00000000 00000000
025486a8  00000000 00000000 0256c32c 00000000
025486b8  00000073 00000000 00000000 00000000
025486c8  00000000 00000000 00000000 00000000
025486d8  00000000 00000000 00000000 00000000
025486e8  00000000 00000000 00000000 00000000
025486f8  00000000 00000000 00000000 00000000
02548708  00000000 00000000 00000000 00000000

What have we got here?  Breaking that data down, we have:

02548698 005114d4 Pointer to class
+0000004 02581d3c Pointer to Exception.FMessage Unicode string
+0000008 00000000 Exception.FHelpContext
+000000C 00000000 Exception.FInnerException
+0000010 00000000 Exception.FStackInfo
+0000014 00000000 Exception.FAcquireInnerException

That InnerException data would be useful — but it is not commonly used, yet. And then we have the data for ESomeError:

+0000018 0256c32c ESomeError.FExtraData

Examining that:

0:000> du 256c32c
0256c32c  "Here's some extra data"

And there you have it. Knock yourself out!

When characters go astray: diagnosing missing characters when printing with IE9

I was recently landed with a real doozy of a support case: on some systems, under some circumstances, print jobs would be missing random characters.  Like the following image:

print-1

As opposed to how it should appear:

print-2

This example is missing the following letters:  Å ê Ö q Q

In this particular example, you may be misled into thinking the problem is just with non-ASCII characters.  But note that both upper case Q and lower case q are missing as well.  Other example documents that we collected had no non-ASCII characters in them at all.  I just used this example because it was one of the few we captured with fake data as opposed to real customer data.

This problem was not particularly printer dependent, and could be replicated (thank goodness!) with a PDF print driver.  But the circumstances behind replicating the problem were a little strange: it only occurred, in the second document printed, when more than one print job was sent to the printer in quick succession.  Some combinations of documents did not reflect the issue.  If we sent the documents in the opposite order, they would often succeed.  We were unable to replicate the problem on the printer in our office.

After many tries, I finally managed to replicate the issue on my development machine.  Now I could break out the debugging tools and start some tracing.

I first used Process Monitor to capture details about temporary files in the print process.  A lot of temporary files were generated.  I examined many of them, but the files appeared intact: the problem was not reflected at this point in the print job.

print-3

Next I attached a debugger to the process to try and figure out if I could spot any issues during the print itself.  I put a breakpoint on GDI32!StartDocW and when that fired, enabled a breakpoint on GDI32!ExtTextOutW.  This showed me all the strings that were being sent to the page during the print job.

Unfortunately, Internet Explorer 9 uses a process of printing to XPS and then converting that output to GDI commands for printing.  The library which does the conversion from XPS to GDI used the flag ETO_GLYPH_INDEX when calling ExtTextOutW.  This meant that examining the data in the debugger was pretty much a non-starter, without any easy to spot strings.  So eventually I decided to log the whole lot to a text file, and pull the TrueType font glyph indices in the hope that the information in the TTF file would be sufficient to reconstruct the original text strings.

As it turned out, I was trivially able to match that data up to the XPS data by generating the report as an XPS.  These are just a ZIP file with fonts, images and XML rendering data embedded.  Easy to spelunk.  It certainly seems that the correct glyph indices were being sent to the print engine.  I matched these up between the data captured in the debugger as follows.

The text string:

Dr Åbrique T Ömbê

The XPS element:

Pulling out just the Glyph indices from the XPS element:

39;85;3;99;69;85;76;84;88;72;3;55;3;103;80;69;114

Converting those glyph indices to hexadecimal:

0027 0055 0003 0063 0045 0055 004c 0054
0058 0048 0003 0037 0003 0067 0050 0045
0072

Matching the data captured from ExtTextOutW:

233a4fe0  00550027 00630003 00550045 0054004c
233a4ff0  00480058 00370003 00670003 00450050
233a5000  abab0072

Note that the DWORDs are reversed because Intel uses little endian byte order (and abab is just garbage past the end of the array).

This suggested that the font data itself is being corrupted.  Next, to check the font data in the .ps file as it is generated!  Helpfully, PrimoPDF stores a .ps file in a temporary folder during the conversion to PDF.  So it was just a matter of grabbing that file and looking at that.  Well, of course PostScript and RTF are near the top of the list of painfully unreadable file formats, but a little trial and error (and GhostScript) found me the following:

779 1636 M [66 31 23 66 46 30 26 46 45 41 23 56 23 66 71 46  0]xS

Of which, 080B03380E0B1B3905090322033A0D0E3B should translate to Dr Åbrique T Ömbê:

08 0B 03 38 0E 0B 1B 39 05 09 03 22 03 3A 0D 0E 3B
D  r     Å  b  r  i  q  u  e     T     Ö  m  b  ê

Interestingly, the missing glyphs are all in the range 38-3B in this example.  This line is identical in a version of the file that printed without errors, but the font data is quite different.

Right, so we know data corruption is occurring in the font data.  But we knew that data corruption was the issue already.  So where are we now?  Have I accomplished anything by going down this track?  Yes!  I know now that it is happening in the conversion of an XPS document to GDI, in the print process.  Given this and that it happens only when 2 documents are printed, it is almost certainly a threading issue in the XPS to GDI library.

Armed with that new-found knowledge, I set out to build a minimal test application.  Here’s the very hacky test case I ended up with:

#include "stdafx.h"

int Print(LPWSTR inFile, LPWSTR outFile, LPWSTR printer);

int _tmain(int argc, _TCHAR* argv[]) {
  if(argc < 2) {
    printf("Usage: XpsThread Apartment|Multi [PathToXPSFiles] [Printer]\n");
    return 1;
  }

  if(_wcsicmp(argv[1], L"Apartment") == 0)
    CoInitializeEx(0, COINIT_APARTMENTTHREADED);
  else
    CoInitializeEx(0, COINIT_MULTITHREADED);

  /* This assumes 2 files named 0.xps and 1.xps.  No need to make more complex at this stage. */

  WCHAR inFile[MAX_PATH], outFile[MAX_PATH];

  PWCHAR printer = argc > 3 ? argv[3] : L"PrimoPDF";

  for(int i = 0; i < 2; i++)
  {
    wsprintf(inFile, L"%s%s%d.xps", argc > 2 ? argv[2] : L"", argc > 2 ? (*(wcschr(argv[2],0)-1) == '\\' ? L"" : L"\\") : L"", i);
    wsprintf(outFile, L"%s%s%d.ps", argc > 2 ? argv[2] : L"", argc > 2 ? (*(wcschr(argv[2],0)-1) == '\\' ? L"" : L"\\") : L"", i);

    Print(inFile, outFile, printer);
  }

  getchar();  // Manually wait for print jobs to finish.

  CoUninitialize();
  // Examine the .ps files in GSview or whatever takes your fancy.  The second file generated will often be corrupt if COINIT_MULTITHREADED is used.
  return 0;
}

int Print(LPWSTR inFile, LPWSTR outFile, LPWSTR printer) {
  IXpsPrintJobStream *docStream = NULL;
  HRESULT hr;

  hr = StartXpsPrintJob(printer, NULL, outFile, 0, 0, NULL, 0, NULL, &docStream, NULL);
  if(SUCCEEDED(hr)) {
    FILE *fp = _wfopen(inFile, L"rb");
    BYTE buf[512];
    int n = fread(buf, 1, 512, fp);
    ULONG sz;
    while(n > 0) {
      docStream->Write(buf, n, &sz);
      n = fread(buf, 1, 512, fp);
    }
    fclose(fp);

    hr = docStream->Close();
    if(SUCCEEDED(hr)) {
      wprintf(L"Succeeded: %s\n", inFile);
      return 0;
    }
  }

  wprintf(L"Failed to print %s: %x\n", inFile, hr);
  return 1;
}

It may have been a little hacky, but it proved the bug.  It turns out, printing multiple documents at once with COM initialised with the flag COINIT_MULTITHREADED will consistently fail, even though that is the mode used in the example in MSDN.

Okay, so let’s take that information back to our application in Windbg, and see if we can find anything in MSHTML’s printing.  I popped in a couple of breakpoints:

bp coinitializeex ".echo ---CoInitializeEx---; ~.; dd esp+8 L1; gc"
bp gdi32!startdocw ".echo ---StartDocW---; ~.;"

The first would show me the thread ID and the COINIT flags issued every time CoInitializeEx was called.  The second would break when StartDoc was called to start a print job, from the XPS printing code (XpsGdiConverter.dll).  This would tell me the threading mode for the thread.  And as I was now expecting, MSHTML uses COINIT_MULTITHREADED (= 00000000).

---CoInitializeEx---
.  2  Id: 1464.f80 Suspend: 1 Teb: 7efd7000 Unfrozen
Start: mshtml!ShowModelessHTMLDialog+0x5460 (6dc0d50e)
Priority: 0  Priority class: 32  Affinity: f
189ef0b4  00000000
---StartDocW---
.  2  Id: 1464.f80 Suspend: 1 Teb: 7efd7000 Unfrozen
Start: mshtml!ShowModelessHTMLDialog+0x5460 (6dc0d50e)
Priority: 0  Priority class: 32  Affinity: f

Interestingly, when I was researching XPS printing in MSDN, I came across a page in the .NET Framework documentation which contains the following detail:

The three-parameter AddJob(String, String, Boolean) overload of AddJob must run in a single thread apartment whenever the Boolean parameter is false, which it must be when a non-XPSDrv printer is being used. However, the default apartment state for Microsoft .NET is multiple thread. This default must be reversed since the example assumes a non-XPSDrv printer

There are two ways to change the default. One way is to simply add the STAThreadAttribute (that is, “[System.STAThreadAttribute()]”) just above the first line of the application’s Main method (usually “static void Main(string[] args)”). However, many applications require that the Main method have a multi-threaded apartment state, so there is a second method: put the call to AddJob(String, String, Boolean) in a separate thread whose apartment state is set to STA with SetApartmentState. The example below uses this second technique.

This little detail is not in the Win32 XPS documentation. I can’t prove that this is the cause of the problem but it certainly seems suspicious that this goes wrong with MSHTML.  I tried a very naughty test whereby I overrode the COINIT flags whenever CoInitializeEx was called (WinDbg: bp ole32!coinitializeex “ed esp+8 2; gc”), and the problem “went away” … it’s a nice finger-in-the-wind test but certainly not conclusive!

The problem does not occur on the printer in our office, because it is an XPSDrv printer.  The problem only occurs with drivers that do not support XPS, because XPSDrv does not require apartment threading.

Anyway, I think it’s time to take this case to Microsoft.  In the meantime, the safest workaround is to wait for the first print job to finish before starting a subsequent one.  We’ll probably put that fix in for now, even though it makes the print tediously slow!

Plotting Pretty Elevation Profiles with the Strava API

Update 13 July 2013: After the deprecation of the Strava API, this tool no longer works. However, an updated version of this segment gradient tool is now available here.

So this tool plots elevation profiles from Strava data. OK, so maybe the profiles aren’t amazingly pretty.  But I had fun making them look somewhat like the elevation profiles from a certain famous cycling event!

Lipscome + Nicholas climb

This little hacked-together piece of JavaScript will plot the elevation from a Strava segment (in metric units only, of course!) and uses the familiar green-blue-red-black styling to represent the severity of the gradient.

I wrote the code to fulfill a specific purpose: generating graphs for the Hobart 10,000 ride.  But I figured I’d make the code and tool available for anyone to use or fiddle with as they see fit.

This tool requires IE9, Firefox, Safari, Chrome or any other canvas-aware browser.  If you plug in bad data, you’ll get bad results.  So don’t.  All the parameters dynamically refresh the profile, except for the segment ID field, after which you’ll need to click Load Segment.

If you want to play with the source yourself, the only thing you need to do server-side is plug in the data from: http://www.strava.com/api/v1/stream/segments/segmentid

Go knock yourself out here: http://hobart10000.com/segment-gradient.php

Updated 15 Aug 2012: The tool now does isometric projection, which I think looks quite a lot nicer, and I’ve tidied up the user interface and added a few more controls.  As noted by Jonathan in the comments, it doesn’t do too well with downhill segments — the tool assumes it is an uphill segment at present. 

Indy, TIdURI.PathEncode, URLEncode and ParamsEncode and more

Frequently in Delphi we come across the need to encode a string to stuff into a URL query string parameter (as per web forms).  One would expect that Indy contains well-tested functions to handle this.  Well, Indy contains some functions to help with this, but they may not work quite as you expect.  In fact, they may not be much use at all.

Indy contains a component called TIdURI.  It contains, among other things, the member functions URLEncode, PathEncode, and ParamsEncode. At first glance, these seem to do what you would need.  But in fact, they don’t.

URLEncode will take a full URL, split it into path, document and query components, encode each of those, and return the full string.  PathEncode is intended to handle the nuances of the path and document components of the URL, and ParamsEncode handles query strings.

Sounds great, right?  Well, it works until you have a query parameter that has an ampersand (&) in it.  Say my beloved end user want to search for big&little.  It seems that you could pass the following in:

s := TIdURI.URLEncode('http://www.google.com/search?q='+SearchText);

But then we get no change in our result:

s = 'http://www.google.com/search?q=big&little';

And you can already see the problem: little is now a separate parameter in the query string.  How can we work around this?  Can we pre-encode ampersand to %26 before you pass in the parameters?

s := TIdURI.URLEncode('http://www.google.com/search?q='+ReplaceStr(SearchText, '&', '%26'));

No:

s = 'http://www.google.com/search?q=big%25%26little';

And obviously we can’t do it ourselves afterwards, because we too won’t know which ampersands are which.  You could do correction of ampersand by encoding each parameter component separately and then post-processing the component for ampersand and other characters before final assembly using ParamsEncode. But you’ll soon find that it’s not enough anyway.  =, / and ? are also not encoded, although they should be.  Finally, URLEncode does not support internationalized domain names (IDN).

Given that these functions are not a complete solution, it’s probably best to avoid them altogether.

The problem is analogous to the Javascript encodeURI vs encodeURIComponent issue.

So to write your own…  I haven’t found a good Delphi solution online (and I searched a bit), so here’s a function I’ve cobbled together (use at your own risk!) to encode parameter names and values. You do need to encode each component of the parameter string separately, of course.

function EncodeURIComponent(const ASrc: string): UTF8String;
const
  HexMap: UTF8String = '0123456789ABCDEF';

  function IsSafeChar(ch: Integer): Boolean;
  begin
    if (ch >= 48) and (ch <= 57) then Result := True // 0-9
    else if (ch >= 65) and (ch <= 90) then Result := True // A-Z
    else if (ch >= 97) and (ch <= 122) then Result := True // a-z
    else if (ch = 33) then Result := True // !
    else if (ch >= 39) and (ch <= 42) then Result := True // '()* 
    else if (ch >= 45) and (ch <= 46) then Result := True // -.
    else if (ch = 95) then Result := True // _
    else if (ch = 126) then Result := True // ~
    else Result := False;
  end;
var
  I, J: Integer;
  ASrcUTF8: UTF8String;
begin
  Result := '';    {Do not Localize}

  ASrcUTF8 := UTF8Encode(ASrc);
  // UTF8Encode call not strictly necessary but
  // prevents implicit conversion warning

  I := 1; J := 1;
  SetLength(Result, Length(ASrcUTF8) * 3); // space to %xx encode every byte
  while I <= Length(ASrcUTF8) do
  begin
    if IsSafeChar(Ord(ASrcUTF8[I])) then
    begin
      Result[J] := ASrcUTF8[I];
      Inc(J);
    end
    else
    begin
      Result[J] := '%';
      Result[J+1] := HexMap[(Ord(ASrcUTF8[I]) shr 4) + 1];
      Result[J+2] := HexMap[(Ord(ASrcUTF8[I]) and 15) + 1];
      Inc(J,3);
    end;
    Inc(I);
  end;

  SetLength(Result, J-1);
end;

To use this, do something like the following:

function GetAURL(const param, value: string): UTF8String;
begin
  Result := 'http://www.example.com/search?'+
    EncodeURIComponent(param)+
    '='+
    EncodeURIComponent(value);
end;

Hope this helps. Sorry, I haven’t got an IDN solution in this post!

Updated 15 Nov 2018: Fixed bug with handling of space (should output %20, not +).

Understanding and correcting interface reference leaks in Delphi’s Vcl.OleCtrls.pas

Update 21 Sep 2015: This bug has been fixed in Delphi 10 Seattle.

I have spent quite some time recently tracing a memory leak in a Delphi application.  It is quite a large application and makes a lot of use of embedded MSHTML (TWebBrowser and TEmbeddedWB) controls for presentation.  Somehow, somewhere, we were leaking memory: users were reporting that after a few hours of use, the application slowed down, and checking Task Manager certainly reflected excessive memory usage.

The normal procedure to reproduce the memory leak was followed, including using tools such as AQtime and other debug logging tools.  However, no leaks were detected using these tools, although we could see the memory usage increasing in Task Manager on our test machines.  This suggested the memory we were leaking was not allocated by Delphi code: i.e. it was Windows or a 3rd party DLL.  This doesn’t mean, of course, that it wasn’t our fault — just that it wasn’t allocated directly from Delphi source!

At this point, I was asked to trace this issue further.  I ran Performance Monitor with a couple of key counters: Handle Count and Working Set.  Running the test in question (involving opening and closing a window with an embedded web browser control) showed a gradual increase in both handle count and working set size.  However, Performance Monitor unfortunately does not include a user handle (i.e. window handles) counter.  It was when I noticed in Process Explorer that user handles were also increasing that I got my first break in tracing the cause.

It turned out that the embedded web browser window was not always being destroyed when its parent window was.  This window had the class name “Internet Explorer_Server”.

With a bit more tracing, I found that the trigger was the Document or Application properties.  If the Document property of the TWebBrowser control was ever referenced, the window was never destroyed (note, there are also some other properties that trigger the same behaviour — look for properties returning type IDispatch).

This set me to researching the Document property.  It looks like this:

property Document: IDispatch index 203 read GetIDispatchProp;

Looking at GetIDispatchProp in Vcl.OleCtrls.pas, we see the following code:

function TOleControl.GetIDispatchProp(Index: Integer): IDispatch;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Result := IDispatch(Temp.VDispatch);
end;

And here some alarm bells go off.  Delphi, rather nicely, manages all the reference counting on interfaces.  This works pretty smoothly, until you trick the compiler by casting other types to interfaces.  Here the code in question is triggering an invisible call to IntfCopy in the line:

Result := IDispatch(Temp.VDispatch);

The IntfCopy function internally calls  _AddRef on the object, but because of the cast to IDispatch from a Pointer, this new reference is never released.  The fix is to change the function (and the similar GetIUnknownProp function) to:

function TOleControl.GetIDispatchProp(Index: Integer): IDispatch;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Pointer(Result) := Temp.VDispatch;
end;

function TOleControl.GetIUnknownProp(Index: Integer): IUnknown;
var
  Temp: TVarData;
begin
  GetProperty(Index, Temp);
  Pointer(Result) := Temp.VUnknown;
end;

By casting this way, we avoid the call to IntfCopy and hence the call to _AddRef.  Alternatively, you could have called _Release on Result, but this would require another test to ensure that Result wasn’t nil (and also, of course, redundant _AddRef and _Release calls).

It turns out that this problem was identified way back in 1999 and the workaround has been commonly referenced since then.  So I am not taking credit for the fix here!  And yet it is still unresolved in Delphi XE2 — and so still causing trouble for Delphi programmers today!  There are no clear references to the problem in QualityCentral that I could find (that’s about to change!)

But don’t stop reading yet!

“Now my app crashes”

There are frequent complaints online that this fix results in crashes.  This is because other developers have engineered fixes to this reference leak in places where these GetIDispatchProp and/or GetIUnknownProp calls are made, rather than where the problem actually occurs.  I have found this in TEmbeddedWB.  TEmbeddedWB is a web browser hosting component that takes up where TWebBrowser leaves off, and it does fix a lot of the limitations of TWebBrowser.

But here are the places in the TEmbeddedWB source that you’ll need to “unfix” once you fix the root problem in Vcl.OleCtrls.pas:

EmbeddedWB.pas [2603]: (procedure TEmbeddedWB.SetUserAgentInt): Delete _Release call
EwbCore.pas [1367]: (procedure TCustomEmbeddedWB.SetDesignMode): Delete _Release call
EwbCore.pas [1404]: (procedure TCustomEmbeddedWB.SetDownloadOptions): Delete _Release call
EwbCore.pas [1468]: (procedure TCustomEmbeddedWB.SetUserInterfaceOptions): Delete _Release call
EwbTools.pas [1015]: (function GetBmpFromBrowser): Delete _Release call
EwbTools.pas [3164]: (function InvokeCMD): Delete _Release call

Note, you’ll also see an experimental fix buried — and disabled — in the TEmbeddedWB code (but without fixing the lines above), and without a whole lot of documentation as to why!

I have created a Quality Central report, along with a test case and example fix.  I also checked the Delphi VCL source, and about 30 other components that we use, and found no more calls to _Release relating to this issue.

Delphi XE2’s hidden hints and warnings options

There are a number of warnings available in Delphi XE2 that are not very well documented.  While you can control them in the Project options dialog, and you can turn them on using {$WARN} directives or in command line compiler options, the documentation for the warning identifiers is currently pretty piecemeal, and there is no clear link between the project options, warning directive identifiers, and numeric identifiers.

The {$WARN} compiler directive takes effect only in the unit it is in.  It overrides project settings and command line compiler flags.  To turn a warning on or off for a block of code (although you’ll have to work hard to convince me that this is an acceptable solution 99% of the time):

{$WARN SYMBOL_DEPRECATED ON}
{$WARN SYMBOL_DEPRECATED OFF}

To restore the warning to its previous setting:

{$WARN SYMBOL_DEPRECATED DEFAULT}

And finally, you can convert a warning into an error:

{$WARN SYMBOL_DEPRECATED ERROR}

To change the setting in your project, go to Project|Options, Delphi Compiler, Hints and Warnings, and expand the Output warnings setting:


Delphi’s project options dialog – hints and warnings

To change the setting in the command line compiler, use the -W flag, for example:

dcc32 -W+SYMBOL_DEPRECATED myproject.dpr
dcc32 -W-SYMBOL_DEPRECATED myproject.dpr

Or to turn the warning into an error:

dcc32 -W^SYMBOL_DEPRECATED myproject.dpr

Note, if you type this from the Windows command prompt, you will have to repeat the ^ symbol, as ^ is an escape symbol:

dcc32 -W^^SYMBOL_DEPRECATED myproject.dpr

These are the hints and warning identifiers I am aware, along with what I understand are the default settings and numeric identifiers, and links to the relevant help topics.

ID Directive ID Default Description Notes
W1000 SYMBOL_DEPRECATED OFF Symbol ‘%s’ is deprecated (Delphi)
W1001 SYMBOL_LIBRARY OFF Symbol ‘%s’ is specific to a library (Delphi)
W1002 SYMBOL_PLATFORM OFF Symbol ‘%s’ is specific to a platform (Delphi)
W1003 SYMBOL_EXPERIMENTAL ON Symbol ‘%s’ is experimental (Delphi)
W1004 UNIT_LIBRARY OFF Unit ‘%s’ is specific to a library (Delphi)
W1005 UNIT_PLATFORM OFF Unit ‘%s’ is specific to a platform (Delphi)
W1006 UNIT_DEPRECATED OFF Unit ‘%s’ is deprecated (Delphi)
W1007 UNIT_EXPERIMENTAL ON Unit ‘%s’ is experimental (Delphi)
W1008 HRESULT_COMPAT ON Integer and HRESULT interchanged
W1009 HIDING_MEMBER ON Redeclaration of ‘%s’ hides a member in the base class (Delphi)
W1010 HIDDEN_VIRTUAL ON Method ‘%s’ hides virtual method of base type ‘%s’ (Delphi)
W1011 GARBAGE ON Text after final ‘END.’ – ignored by compiler (Delphi)
W1012 BOUNDS_ERROR ON Constant expression violates subrange bounds
W1013 ZERO_NIL_COMPAT ON Constant 0 converted to NIL (Delphi)
W1014 STRING_CONST_TRUNCED ON String constant truncated to fit STRING%ld (Delphi)
W1015 FOR_LOOP_VAR_VARPAR ON FOR-Loop variable ‘%s’ cannot be passed as var parameter (Delphi)
W1016 TYPED_CONST_VARPAR ON Typed constant ‘%s’ passed as var parameter (Delphi)
W1017 ASG_TO_TYPED_CONST ON Assignment to typed constant ‘%s’ (Delphi)
W1018 CASE_LABEL_RANGE ON Case label outside of range of case expression (Delphi)
x1019 FOR_VARIABLE ON For loop control variable must be simple local variable (Delphi)
x1020 CONSTRUCTING_ABSTRACT ON Constructing instance of ‘%s’ containing abstract method ‘%s.%s’ (Delphi)
W1021 COMPARISON_FALSE ON Comparison always evaluates to False (Delphi)
W1022 COMPARISON_TRUE ON Comparison always evaluates to True (Delphi)
W1023 COMPARING_SIGNED_UNSIGNED ON Comparing signed and unsigned types – widened both operands (Delphi)
W1024 COMBINING_SIGNED_UNSIGNED ON Combining signed and unsigned types – widened both operands (Delphi)
x1025 UNSUPPORTED_CONSTRUCT ON Unsupported language feature: ‘%s’ (Delphi)
x1026 FILE_OPEN ON File not found ‘%s’ (Delphi) Despite being listed as a warning, turning this off appears to have no effect.
F1027 FILE_OPEN_UNITSRC ON Unit not found ‘%s’ or binary equivalents (%s) (Delphi)
x1028 BAD_GLOBAL_SYMBOL ON Bad global symbol definition ‘%s’ in object file ‘%s’ (Delphi)
W1029 DUPLICATE_CTOR_DTOR ON Duplicate %s ‘%s’ with identical parameters will be inacessible from C++ (Delphi)
x1030 INVALID_DIRECTIVE ON Invalid compiler directive – ‘%s’ (Delphi)
W1031 PACKAGE_NO_LINK ON Package ‘%s’ will not be written to disk because -J option is enabled (Delphi)
W1032 PACKAGED_THREADVAR ON Exported package threadvar ‘%s.%s’ cannot be used outside of this package (Delphi)
W1033 IMPLICIT_IMPORT ON Unit ‘%s’ implicitly imported into package ‘%s’ (Delphi)
W1034 HPPEMIT_IGNORED ON $HPPEMIT ‘%s’ ignored (Delphi)
W1035 NO_RETVAL ON Return value of function ‘%s’ might be undefined (Delphi)
W1036 USE_BEFORE_DEF ON Variable ‘%s’ might not have been initialized (Delphi)
W1037 FOR_LOOP_VAR_UNDEF ON FOR-Loop variable ‘%s’ may be undefined after loop (Delphi)
E1038 UNIT_NAME_MISMATCH ON Unit identifier ‘%s’ does not match file name (Delphi)
W1039 NO_CFG_FILE_FOUND ON No configuration files found (Delphi)
W1040 IMPLICIT_VARIANTS ON Implicit use of Variants unit (Delphi)
W1041 UNICODE_TO_LOCALE ON Error converting Unicode char to locale charset. String truncated. Is your LANG environment variable set correctly (Delphi)
W1042 LOCALE_TO_UNICODE ON Error converting locale string ‘%s’ to Unicode. String truncated. Is your LANG environment variable set correctly (Delphi)
W1043 IMAGEBASE_MULTIPLE ON Imagebase $%X is not a multiple of 64k. Rounding down to $%X (Delphi)
W1044 SUSPICIOUS_TYPECAST ON Suspicious typecast of %s to %s (Delphi)
W1045 PRIVATE_PROPACCESSOR ON Property declaration references ancestor private ‘%s.%s’ (Delphi)
W1046 UNSAFE_TYPE OFF Unsafe type ‘%s%s%s’ (Delphi)
W1047 UNSAFE_CODE OFF Unsafe code ‘%s’ (Delphi)
W1048 UNSAFE_CAST OFF Unsafe typecast of ‘%s’ to ‘%s’ (Delphi)
W1049 OPTION_TRUNCATED ON value ‘%s’ for option %s was truncated (Delphi)
W1050 WIDECHAR_REDUCED ON WideChar reduced to byte char in set expressions (Delphi)
W1051 DUPLICATES_IGNORED ON Duplicate symbol names in namespace. Using ‘%s.%s’ found in %s. Ignoring duplicate in %s (Delphi)
W1052 UNIT_INIT_SEQ ON Can’t find System.Runtime.CompilerServices.RunClassConstructor. Unit initialization order will not follow uses clause order Does not seem to be documented in XE2
W1053 LOCAL_PINVOKE ON Local PInvoke code has not been made because external routine ‘%s’ in package ‘%s’ is defined using package local types in its custom attributes Does not seem to be documented in XE2
x1054 MESSAGE_DIRECTIVE ON %s (Delphi) User-defined warning messages (see below). Turns off message hints as well but not message errors.
W1055 TYPEINFO_IMPLICITLY_ADDED ON Published caused RTTI ($M+) to be added to type ‘%s’ (Delphi)
x1056 RLINK_WARNING ON Duplicate resource Type %s, ID %s; File %s resource kept; file %s resource discarded (Delphi)
W1057 IMPLICIT_STRING_CAST ON Implicit string cast from ‘%s’ to ‘%s’ (Delphi)
W1058 IMPLICIT_STRING_CAST_LOSS ON Implicit string cast with potential data loss from ‘%s’ to ‘%s’ (Delphi)
W1059 EXPLICIT_STRING_CAST OFF Explicit string cast from ‘%s’ to ‘%s’ (Delphi)
W1060 EXPLICIT_STRING_CAST_LOSS OFF Explicit string cast with potential data loss from ‘%s’ to ‘%s’ (Delphi)
W1061 CVT_WCHAR_TO_ACHAR ON W1061 Narrowing given WideChar constant (‘%s’) to AnsiChar lost information (Delphi)
W1062 CVT_NARROWING_STRING_LOST ON Narrowing given wide string constant lost information (Delphi)
W1063 CVT_ACHAR_TO_WCHAR ON Widening given AnsiChar constant (‘%s’) to WideChar lost information (Delphi)
W1064 CVT_WIDENING_STRING_LOST ON Widening given AnsiString constant lost information (Delphi)
W1066 NON_PORTABLE_TYPECAST ON Lost Extended floating point precision. Reduced to Double (Delphi)
W1201 XML_WHITESPACE_NOT_ALLOWED ON XML comment on ‘%s’ has badly formed XML-‘Whitespace is not allowed at this location.’ (Delphi)
W1202 XML_UNKNOWN_ENTITY ON XML comment on ‘%s’ has badly formed XML- ‘Reference to undefined entity ‘%s (Delphi)
W1203 XML_INVALID_NAME_START ON XML comment on ‘%s’ has badly formed XML-‘A name was started with an invalid character.’ (Delphi)
W1204 XML_INVALID_NAME ON XML comment on ‘%s’ has badly formed XML-‘A name contained an invalid character.’ (Delphi)
W1205 XML_EXPECTED_CHARACTER ON XML comment on ‘%s’ has badly formed XML-‘The character ‘%c’ was expected.’ (Delphi)
W1206 XML_CREF_NO_RESOLVE ON XML comment on ‘%s’ has cref attribute ‘%s’ that could not be resolved (Delphi)
W1207 XML_NO_PARM ON XML comment on ‘%s’ has a param tag for ‘%s’, but there is no parameter by that name (Delphi)
W1208 XML_NO_MATCHING_PARM ON Parameter ‘%s’ has no matching param tag in the XML comment for ‘%s’ (but other parameters do) (Delphi)

The $MESSAGE directive allows generation of H1054, W1054, E1054 and F1054 messages.  User-defined hint and warning messages can be turned off with {$WARN MESSAGE_DIRECTIVE OFF} but logically enough, user-defined error and fatal messages can not be disabled.  For example, add the following line to MyUnit.pas:

{$MESSAGE WARN 'This is a user warning'}

Which will gives a compiler warning similar to the following:

[DCC Warning] MyUnit.pas(25): W1054 This is a user warning

Finally, I’m not going to discuss the $WARNINGS directive, except to say that I don’t think it’s ever necessary or sensible to turn it off.  If you must turn a warning off for a specific section of code, turn just that warning off as discussed above.  Using {$WARNINGS OFF} is just as unhelpful as an empty except block.

Updates to kmwString functions

A couple of weeks ago I published on this blog a set of functions for handling supplementary characters in JavaScript.  We have since fixed a few bugs in the code and have now made the kmwString script available on github at https://github.com/tavultesoft/keyman-tools

This update:

  • Fixes an error which could occur when only 1 parameter was passed to kmwSubstring and that parameter was negative
  • Fixed an incorrect NaN return value for kmwCodePointToCodeUnit when passed an offset at the very end of a string
  • Adds kmwCodeUnitToCodePoint function
  • Adds kmwCharAt function — which is basically just kmwSubstr(index,1)

Again, suggestions, comments, fixes and flames all gratefully received.

WinDBG and Delphi exceptions in x64

I recently was asked whether the Delphi exception event filter for WinDBG that I wrote about would also work with x64 Delphi applications.  The answer was no, it wouldn’t work, but that made me curious to find out what was different with x64.  I knew x64 exception handling was completely different to x86, being table based instead of stack based, but I wasn’t sure how much of this would be reflected in the event filter.

The original post contains the details about how the exception record was accessible at a known location on the stack, and how we could dig in from there.

Before firing up WinDBG, I had a look at System.pas, and found the x64 virtual method table offsets.  I have highlighted the key field we want to pull out:

{ Virtual method table entries }
{$IF defined(CPUX64)}
vmtSelfPtr           = -176;
vmtIntfTable         = -168;
vmtAutoTable         = -160;
vmtInitTable         = -152;
vmtTypeInfo          = -144;
vmtFieldTable        = -136;
vmtMethodTable       = -128;
vmtDynamicTable      = -120;
vmtClassName         = -112;
vmtInstanceSize      = -104;
vmtParent            = -96;
vmtEquals            = -88 deprecated 'Use VMTOFFSET in asm code';
vmtGetHashCode       = -80 deprecated 'Use VMTOFFSET in asm code';
vmtToString          = -72 deprecated 'Use VMTOFFSET in asm code';
vmtSafeCallException = -64 deprecated 'Use VMTOFFSET in asm code';
vmtAfterConstruction = -56 deprecated 'Use VMTOFFSET in asm code';
vmtBeforeDestruction = -48 deprecated 'Use VMTOFFSET in asm code';
vmtDispatch          = -40 deprecated 'Use VMTOFFSET in asm code';
vmtDefaultHandler    = -32 deprecated 'Use VMTOFFSET in asm code';
vmtNewInstance       = -24 deprecated 'Use VMTOFFSET in asm code';
vmtFreeInstance      = -16 deprecated 'Use VMTOFFSET in asm code';
vmtDestroy           =  -8 deprecated 'Use VMTOFFSET in asm code';

vmtQueryInterface    =  0 deprecated 'Use VMTOFFSET in asm code';
vmtAddRef            =  8 deprecated 'Use VMTOFFSET in asm code';
vmtRelease           = 16 deprecated 'Use VMTOFFSET in asm code';
vmtCreateObject      = 24 deprecated 'Use VMTOFFSET in asm code';
{$ELSE !CPUX64}
vmtSelfPtr           = -88;
vmtIntfTable         = -84;
vmtAutoTable         = -80;
vmtInitTable         = -76;
vmtTypeInfo          = -72;
vmtFieldTable        = -68;
vmtMethodTable       = -64;
vmtDynamicTable      = -60;
vmtClassName         = -56;
vmtInstanceSize      = -52;
vmtParent            = -48;
vmtEquals            = -44 deprecated 'Use VMTOFFSET in asm code';
vmtGetHashCode       = -40 deprecated 'Use VMTOFFSET in asm code';
vmtToString          = -36 deprecated 'Use VMTOFFSET in asm code';
vmtSafeCallException = -32 deprecated 'Use VMTOFFSET in asm code';
vmtAfterConstruction = -28 deprecated 'Use VMTOFFSET in asm code';
vmtBeforeDestruction = -24 deprecated 'Use VMTOFFSET in asm code';
vmtDispatch          = -20 deprecated 'Use VMTOFFSET in asm code';
vmtDefaultHandler    = -16 deprecated 'Use VMTOFFSET in asm code';
vmtNewInstance       = -12 deprecated 'Use VMTOFFSET in asm code';
vmtFreeInstance      = -8 deprecated 'Use VMTOFFSET in asm code';
vmtDestroy           = -4 deprecated 'Use VMTOFFSET in asm code';

vmtQueryInterface    = 0 deprecated 'Use VMTOFFSET in asm code';
vmtAddRef            = 4 deprecated 'Use VMTOFFSET in asm code';
vmtRelease           = 8 deprecated 'Use VMTOFFSET in asm code';
vmtCreateObject      = 12 deprecated 'Use VMTOFFSET in asm code';
{$IFEND !CPUX64}

I also noted that the exception code for Delphi x64 was the same as x86:

cDelphiException   = $0EEDFADE;

Given this, I put together a test x64 application in Delphi that would throw an exception, and loaded it into WinDBG.  I enabled the event filter for unknown exceptions, and triggered an exception in the test application.  This broke into WinDBG, where I was able to take a look at the raw stack:

(2ad4.2948): Unknown exception - code 0eedfade (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
KERNELBASE!RaiseException+0x39:
000007fe`fd6ccacd 4881c4c8000000  add     rsp,0C8h
0:000> dd rbp
00000000`0012eab0  00000008 00000000 00000021 00000000
00000000`0012eac0  0059e1f0 00000000 0059e1f0 00000000
00000000`0012ead0  0eedfade 00000001 00000000 00000000
00000000`0012eae0  0059e1dd 00000000 00000007 00000000
00000000`0012eaf0  0059e1dd 00000000 0256cff0 00000000
00000000`0012eb00  00000000 00000000 00000000 00000000
00000000`0012eb10  00000000 00000000 00000000 00000000
00000000`0012eb20  00000000 00000000 0256cff8 00000000

We can see at rbp+20 is the familiar looking 0EEDFADE value.  This is the start of the EXCEPTION_RECORD structure, which I’ve reproduced below from Delphi’s System.pas with a little annotation of my own:

  TExceptionRecord = record
    ExceptionCode: Cardinal;                 // +0
    ExceptionFlags: Cardinal;                // +4
    ExceptionRecord: PExceptionRecord;       // +8
    ExceptionAddress: Pointer;               // +10
    NumberParameters: Cardinal;              // +18
    case {IsOsException:} Boolean of
      True:  (ExceptionInformation : array [0..14] of NativeUInt);
      False: (ExceptAddr: Pointer;           // +20
              ExceptObject: Pointer);        // +28
  end;

We do have to watch out for member alignment with this structure — because it contains both 4 byte DWORDs and 8 byte pointers, there are 4 bytes of hidden padding after the NumberParameters member, as shown below (this is from MSDN, sorry to switch languages on you!):

typedef struct _EXCEPTION_RECORD64 {
  DWORD ExceptionCode;
  DWORD ExceptionFlags;
  DWORD64 ExceptionRecord;
  DWORD64 ExceptionAddress;
  DWORD NumberParameters;
  DWORD __unusedAlignment;
  DWORD64 ExceptionInformation[EXCEPTION_MAXIMUM_PARAMETERS];
} EXCEPTION_RECORD64, *PEXCEPTION_RECORD64;

But what we can see from TExceptionRecord is that at offset 0x28 in the record is a pointer to our ExceptObject.  Great!  That’s everything we need.  We can now put together our x64-modified event filter.

You may remember the x86 event filter:

sxe -c "da poi(poi(poi(ebp+1c))-38)+1 L16;du /c 100 poi(poi(ebp+1c)+4)" 0EEDFADE

So here is the x64 version:

sxe -c "da poi(poi(poi(rbp+48))-70)+1 L16;du /c 100 poi(poi(rbp+48)+8)" 0EEDFADE

And with this filter installed, here is how a Delphi exception is now displayed in WinDBG:

(2ad4.2948): Unknown exception - code 0eedfade (first chance)
00000000`0059e0cf  "MyException"
00000000`02573910  "My very own kind of error message"
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
KERNELBASE!RaiseException+0x39:
000007fe`fd6ccacd 4881c4c8000000  add     rsp,0C8h

I’ll dissect the pointer offsets a little more than I did in the previous blog, because they can be a bit confusing:

  • rbp+48 is a pointer to the exception object (usually a type that inherits from Exception).
  • poi(rbp+48) dereferences that, and at offset 0 right here, we have a pointer to the class type.
  • Before we look at the class type, poi(rbp+48)+8 is the first member of the object (don’t forget ancestor classes), which happens to be FMessage from the Exception class. That gives us our message.
  • Diving deeper, poi(poi(rbp+48)) is now looking at the class type.
  • And we know that the offset of vmtClassName is -112 (-0x70).  So poi(poi(poi(rbp+48))-70) gives us the the ShortString class name, of which the first byte is the length.
  • So we finish with poi(poi(poi(rbp+48))-70)+1, which lets us look at the string itself.

You will see that to access the exception message, I have opted to look directly at the Exception object rather than use the more direct pointer which is on the stack.  I did this to make it easier to see how it might be possible to pull out other members of descendent exception classes, such as ErrorCode from EOSError.

And one final note: looking back on that previous blog, I see that one thing I wrote was a little misleading: the string length of FMessage is indeed available at poi(poi(rbp+48)+8)-4, but the string is null-terminated, so we don’t need to use it — WinDBG understands null-terminated strings.  Where this is more of a problem is with the ShortString type, which is not null-terminated.  This is why sometimes exception class names displayed using this method will show a few garbage characters afterwards, because we don’t bother about accounting for that; the L16 parameter prevents us dumping memory until we reach a null byte.