Category Archives: Development

IE11, Windbg, JavaScript = joy

I was trying to debug a web application today, which is running in an Internet Explorer MSHTML window embedded in a thick client application. Weirdly, the app would fail, silently, without any script errors, on the sixth refresh — pressing F5 six times in a row would crash it each and every time.  Ctrl+F5 or F5, either would do it.

I was going slightly mad trying to trace this, with no obvious solutions. Finally I loaded the process up in Windbg, not expecting much, and found that three (handled) C++ exceptions were thrown for each of the first 6 loads (initial load, + 5 refreshes):

(958.17ac): C++ EH exception - code e06d7363 (first chance)
(958.17ac): C++ EH exception - code e06d7363 (first chance)
(958.17ac): C++ EH exception - code e06d7363 (first chance)

However, on the sixth refresh, this exception was happening four times:

(958.17ac): C++ EH exception - code e06d7363 (first chance)
(958.17ac): C++ EH exception - code e06d7363 (first chance)
(958.17ac): C++ EH exception - code e06d7363 (first chance)
(958.17ac): C++ EH exception - code e06d7363 (first chance)

This gave me something to look at! At the very least there was an observable difference between the refreshes within the debugger. A little more spelunking revealed that the very last exception thrown was the relevant one, and it returned the following trace (snipped):

(958.17ac): C++ EH exception - code e06d7363 (first chance)
ChildEBP RetAddr  
001875e8 75d3359c KERNELBASE!RaiseException+0x58
00187620 72ba1df2 msvcrt!_CxxThrowException+0x48
00187664 72d659cb jscript9!Js::JavascriptExceptionOperators::ThrowExceptionObjectInternal+0xb4
0018767c 72ba1cda jscript9!Js::JavascriptExceptionOperators::ThrowExceptionObject+0x15
001876a8 72bb5175 jscript9!Js::JavascriptExceptionOperators::Throw+0x6e
001876f4 6e9acd03 jscript9!CJavascriptOperations::ThrowException+0x9c
0018771c 6f3029f5 MSHTML!CFastDOM::ThrowDOMError+0x70
00187750 72b68e9d MSHTML!CFastDOM::CWebSocket::DefaultEntryPoint+0xe2
001877b8 72cff6a2 jscript9!Js::JavascriptExternalFunction::ExternalFunctionThunk+0x165
00187810 72bb849c jscript9!Js::JavascriptFunction::CallAsConstructor+0xc7
00187830 72bb8537 jscript9!Js::InterpreterStackFrame::NewScObject_Helper+0x39
0018784c 72bb858c jscript9!Js::InterpreterStackFrame::ProfiledNewScObject_Helper+0x53
0018786c 72bb855e jscript9!Js::InterpreterStackFrame::OP_NewScObject_Impl<Js::OpLayoutCallI_OneByte,1>+0x24
00187ba8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x3c44
00187dbc 132f1ae1 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
WARNING: Frame IP not in any known module. Following frames may be wrong.
00187dc8 72c372dd 0x132f1ae1
00187ed0 138be456 jscript9!Js::JavascriptFunction::EntryApply+0x265
00187f38 72b6669e 0x138be456
00188288 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
001883bc 132f0681 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
001883c8 72b6669e 0x132f0681
00188718 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018887c 132f1b41 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00188888 72b6669e 0x132f1b41
00188bd8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
00188d1c 132f1c21 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00188d28 72bb7642 0x132f1c21
00189080 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x2175
001890b8 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
001893f8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x4e84
00189594 132f0081 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00189624 72d0134c 0x132f0081
00189650 72bf3de5 jscript9!Js::JavascriptOperators::GetProperty_Internal<0>+0x48
00189678 72b677a7 jscript9!Js::InterpreterStackFrame::OP_ProfiledLoopBodyStart<0,1>+0xa2
001899b8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x24f7
00189b34 132f0089 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00189b40 72b6669e 0x132f0089
00189e98 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018a004 132f0099 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018a010 72b6669e 0x132f0099
0018a368 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018a4bc 132f1c69 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018a4c8 72b6669e 0x132f1c69
0018a818 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018a95c 132f1c71 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018a968 72b6669e 0x132f1c71
0018acb8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018ade4 132f1fb1 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018adf0 72c372dd 0x132f1fb1
0018ae90 72b60685 jscript9!Js::JavascriptFunction::EntryApply+0x265
0018aee0 72bd4fcc jscript9!Js::JavascriptFunction::CallFunction<1>+0x88
0018b220 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x442e
0018b258 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
0018b598 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x4e84
0018b71c 132f1ed9 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018b728 72b6669e 0x132f1ed9
0018ba78 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018bbac 132f1ca1 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018bbb8 72bb7642 0x132f1ca1
0018bf00 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x2175
0018bf38 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
0018c278 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x4e84
0018c3ac 132f1e21 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018c3b8 72bb7642 0x132f1e21
0018c700 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x2175
0018c738 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
0018ca78 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x4e84
0018cbb4 132f1e21 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018cbc0 72b6669e 0x132f1e21
0018cf08 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018d03c 132f1e51 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018d048 72b6669e 0x132f1e51
0018d398 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018d4bc 132f0341 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018d4c8 72bb7642 0x132f0341
0018d810 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x2175
0018d848 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
0018db88 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x4e84
0018dd44 132f1f99 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018dd50 72b60685 0x132f1f99
0018dd98 72bd4fcc jscript9!Js::JavascriptFunction::CallFunction<1>+0x88
0018e0d8 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x442e
0018e110 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
0018e450 72c5bfce jscript9!Js::InterpreterStackFrame::Process+0x4e84
0018e488 72c5bf0f jscript9!Js::InterpreterStackFrame::OP_TryFinally+0xb1
0018e7c8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x68eb
0018e8f4 132f0351 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018e900 72b6669e 0x132f0351
0018ec48 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018ed94 132f1cf9 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018eda0 72b66ce9 0x132f1cf9
0018f0f8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x1cd7
0018f264 132f1d01 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018f270 72b66ce9 0x132f1d01
0018f5c8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x1cd7
0018f70c 132f1d49 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018f718 72b60685 0x132f1d49
0018f760 72b6100e jscript9!Js::JavascriptFunction::CallFunction<1>+0x88
0018f7cc 72b60f60 jscript9!Js::JavascriptFunction::CallRootFunction+0x93
0018f814 72b60ee7 jscript9!ScriptSite::CallRootFunction+0x42
0018f83c 72b6993c jscript9!ScriptSite::Execute+0x6c
0018f898 72b69878 jscript9!ScriptEngineBase::ExecuteInternal<0>+0xbb
0018f8b0 6eaab78f jscript9!ScriptEngineBase::Execute+0x1c
0018f964 6eaab67c MSHTML!CListenerDispatch::InvokeVar+0x102
0018f990 6eaab21e MSHTML!CListenerDispatch::Invoke+0x61
0018fa28 6eaab385 MSHTML!CEventMgr::_InvokeListeners+0x1a2
0018fb98 6e8a98bb MSHTML!CEventMgr::Dispatch+0x35a
0018fbc0 6e92d0c0 MSHTML!CEventMgr::DispatchEvent+0x8c
0018fd18 6e92da30 MSHTML!CXMLHttpRequest::Fire_onreadystatechange+0x8b
0018fd2c 6e9343ea MSHTML!CXMLHttpRequest::DeferredFire_onreadystatechange+0x52
0018fd5c 6e8a9472 MSHTML!CXMLHttpRequest::DeferredFire_progressEvent+0x10
0018fdac 773f62fa MSHTML!GlobalWndProc+0x1cd
0018fdd8 773f6d3a USER32!InternalCallWinProc+0x23
0018fe50 773f77c4 USER32!UserCallWinProcCheckWow+0x109
0018feb0 773f788a USER32!DispatchMessageWorker+0x3bc

So now I knew that somewhere deep in my Javascript code there was something happening that was bad. Okay… I guess that helps, maybe?

But then, after some more googling, I discovered the following blog post, published a mere 9 days ago: Finally… JavaScript source line info in a dump. Magic! (In fact, I’m just writing this blog post for my future self, so I remember where to find this info in the future.)

After following the specific incantations outlined within that post, I was able to get the exact line of Javascript that was causing my weird error. All of a sudden, things made sense.

0:000> k
ChildEBP RetAddr  
00186520 76bd22b4 KERNELBASE!RaiseException+0x48
00186558 5900775d msvcrt!_CxxThrowException+0x59
00186588 590076a3 jscript9!Js::JavascriptExceptionOperators::ThrowExceptionObjectInternal+0xb7
001865b8 5901503d jscript9!Js::JavascriptExceptionOperators::Throw+0x77
00186604 6365e79b jscript9!CJavascriptOperations::ThrowException+0x9c
0018662c 63e609aa mshtml!CFastDOM::ThrowDOMError+0x70
00186660 58f1f9a3 mshtml!CFastDOM::CWebSocket::DefaultEntryPoint+0xe2
001866c8 58f29994 jscript9!Js::JavascriptExternalFunction::ExternalFunctionThunk+0x165
00186724 58f2988c jscript9!Js::JavascriptFunction::CallAsConstructor+0xc7
00186744 58f29a4a jscript9!Js::InterpreterStackFrame::NewScObject_Helper+0x39
00186760 58f29a7a jscript9!Js::InterpreterStackFrame::ProfiledNewScObject_Helper+0x53
00186780 58f29aa9 jscript9!Js::InterpreterStackFrame::OP_NewScObject_Impl<Js::OpLayoutCallI_OneByte,1>+0x24
00186b58 58f26510 jscript9!Js::InterpreterStackFrame::Process+0x402b
00186d6c 14ea1ae1 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
WARNING: Frame IP not in any known module. Following frames may be wrong.
00186d78 58faa407 js!Anonymous function [http://marcdev2010:4131/app/main/main-controller.js @ 251,7]
00186e70 15511455 jscript9!Js::JavascriptFunction::EntryApply+0x267
00186ed8 58f268e6 js!d [http://marcdev2010:4131/lib/angular/angular.min.js @ 34,479]
001872c8 58f26510 jscript9!Js::InterpreterStackFrame::Process+0x899
001873f4 14ea0681 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00187400 58f268e6 js!instantiate [http://marcdev2010:4131/lib/angular/angular.min.js @ 35,101]
001877e8 58f26510 jscript9!Js::InterpreterStackFrame::Process+0x899
00187944 14ea1b41 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00187950 58f268e6 js!Anonymous function [http://marcdev2010:4131/lib/angular/angular.min.js @ 67,280]
00187d38 58f26510 jscript9!Js::InterpreterStackFrame::Process+0x899
00187e74 14ea1c21 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
...

When I looked at the line in question, line 251 of main-controller.js, I found:

// TODO: We need to handle the web socket connection going down -- it really should be wrappered
//  try {
      $scope.watcher = new WebSocket("ws://"+location.host+"/");
//  } catch(e) {
//    $scope.watcher = null;
//  }
    
    $scope.$on('$destroy', function() {
      if($scope.watcher) {
        $scope.watcher.close();
        $scope.watcher = null;
      }
    });

It turns out that if I press F5, or call browser.navigate from the container app, the $destroy event was never called, and what’s worse, MSHTML wasn’t shutting down the web socket, either.  And by default, Internet Explorer has a maximum of six concurrent WebSocket connections. So six page loads and we’re done. The connection error is not triggering a script error in the MSHTML host, sadly, which is why it was so hard to track down.

Yeah, if I had already written the error handling for the WebSocket connection in the JavaScript code, this issue wouldn’t have been so hard to trace. But at least I’ve learned a useful new debugging technique!

Speculation and further investigation:

  • I’m still working on why the web socket is staying open between page loads when embedded.  It’s not straightforward, and while onbeforeunload is a workaround that, well, works, I hate workarounds when I don’t have a clear answer as to why they are needed.
  • It is possible that there is a circular reference leak or similar, but I did kinda think MS had sorted most of those with recent releases of IE.
  • This problem is only reproducible within the MSHTML embedded browser, and not within Internet Explorer itself. Changing $scope.watcher to window.watcher made no difference.
  • TWebBrowser (Delphi) and TEmbeddedWB (Delphi) both exhibited the same symptoms.
  • A far simpler document with a web socket call did not cause the problem.

Updated a few minutes later:

Yes, I have found the cause. It’s a classic Internet Explorer leak: a function within a closure that references the DOM. Simplified, it looks something like:

  window.addEventListener( "load", function(event) {
    var aa = document.getElementById('aa');
    var watcher = new WebSocket("ws://"+location.host+"/");
    watcher.onmessage = function() {
      aa.innerHTML = 'message received';
    }
    aa.innerHTML = 'WebSocket started';
  }, false);

This will persist the connection between  page loads, and quickly eat up your available WebSocket connections.  But it still only happens in the embedded web browser. Why that is, I am still exploring.

Updated 30 Nov 2015:

The reason this happens only in the embedded web browser is that by default, the embedded web browser runs in an emulation mode for IE7. Even with the X-UA-Compatible meta tag, you still need to add your executable to the FEATURE_BROWSER_EMULATION registry key.

Extending $resource in AngularJS

I’ve recently dived into the brave new world (for me) of AngularJS, for a development project for a client. I always enjoy learning new tools and frameworks, especially when there are good design principles and practices that I can apply to both the new project and filter back into existing code.

In this project, we have an existing backend that is delivering data through a RESTful JSON interface. And this is what $resource was designed for. The front end is a HTML document embedded in an existing thick-client application window. Yes, this is the real world.

The data returned by $resource can be either a single item, or an array of items — a collection. $resource automatically wraps each item in the array with the “class” of the single item, which is nice. This makes it trivial to extend items with helper functions, such as, in my case, a time conversion function for a specific field in the JSON data (pseudocode):

angular.module('appServices').factory('Widget', ['$resource',
  function($resource) {
    var Widget= $resource('/data/widgets/:widgetId.json', {}, {
      query: {method:'GET', params:{widgetId:''}, isArray:true}
    });

    Widget.prototype.createTimeInMinutes = function() {
      var m = moment(this.createDateTime);
      return m.hours()*60 + m.minutes();
    };
    
...

However, finding a way to extend the collection was also of interest to me. For example, to add an itemById function which would return a single item from the array identified by a unique identifier field. This is of course me applying my existing object-oriented brain to a Angular (FWIW, this post was the best intro to Angular that I have found, even though it’s about coming from jQuery and not from an OO world).

It seemed nice to me to be able to write something like collection.itemById(), or item.createTimeInMinutes(), associating these functions with the data that they manipulate. Object orientation doing what it does best.  While I was aware of advice around the dangers of extending built-in object prototypes — monkey-patching, I really wasn’t sure that the same concerns applied to extending an ‘instance’ of Array.

There were several answers on Stack Overflow that related to this, in some way, and helped me think through the problem. I (and others) came up with several possible solutions, none of which were completely beautiful to me:

  1. Extend the array returned from $resource.  This is actually hard to do, but in theory possible with transformResponse. Unfortunately, because AngularJS does not preserve extensions to Array objects, you lose those extensions very easily. I won’t add the code here because it is ultimately unhelpful.
  2. Wrap the array in a helper object, when loading in the controller:
    Resource.query().$promise.then(function(collection) {
      $scope.collection = new CollectionWrapper(collection);
    });

    This worked, again, but added a layer of muck to every collection which was unpalatable to me, and pushed implementation into the controller, which just felt like the wrong plce.

  3. Add a helper object:
    var CollectionHelper = {
      itemById = function(collection, id) {
        ...
      }
    };
    
    ...
    
    var item = CollectionHelper.itemById(collection, id);

    Again, this didn’t feel clean, or quite right, although it worked well enough.

  4. Finally, James suggested using a filter.
    angular.module('myapp').filter('byId', function() {
        return function(collection, id) {
          ...
        }
      });
    
    ...
    
    var item = $filter('byId')(collection, id);
    // or you can go directly if injected:
    var item = byIdFilter(collection,id);
    // and within the template you can use:
    {{collection | byId:id }}
    

This last is certainly the most Angular way of doing it.  I’m still not 100% satisfied, because filters have global scope, which means that we need to give them ugly names like collectionDoWonk, instead of just doWonk.

Is this the best way to skin this cat?

iOS 8 beta 1 — first bug reports

Like every other iOS developer, I have already downloaded and installed XCode 6 and the first beta 8.0 of iOS onto one of my test iDevices. And, like every other IOS developer, I immediately went to go and test one of my apps on the new build. And, unfortunately, as can be expected with a beta, I found a bug. I have dutifully filed a bug report via Apple’s bugreport.apple.com!

Given that bug reports are private, I have opted to make information public here because I have had many, many of my product users ask me about it: the bug first arose with iOS 7.1 and I had hoped that it had been addressed in 8.0. Most of my users are not technical enough to be able to navigate the bugreport.apple.com interface, so their only recourse is to complain to us!

Bug #1: Custom font profiles fail to register and work correctly after device restart

We have developed a number of custom font profiles for various languages, following the documentation on creating font profiles for iOS 7+ at https://developer.apple.com/library/ios/featuredarticles/iPhoneConfigurationProfileRef/iPhoneConfigurationProfileRef.pdf. Each of these profiles exhibits the same problem: after the font profile is installed, the specific language text usually displays in all apps, including Notes, Mail and more. However, as soon as the device is restarted, the font fails to display in any apps. In some cases, residual display of the font continues after the restart, but any edit to the text causes the display to revert to .notdef glyphs or similar.

Amharic text -- square boxes
Amharic text before the font profile is installed — square boxes
Amharic-Text-Notes-Success
Amharic text after the font profile is installed: now readable.  But not for long.

Even before the device is restarted, font display is sometimes inconsistent. For example, if you shutdown mail and restart it, fonts will sometimes display correctly and sometimes incorrectly.

The samples given are using the language Amharic.  The font profile can be installed through my Keyman app, available at http://keyman.com/iphone.

A sample of text in Amharic is ጤና ይስጥልን (U+1324 U+1293 U+0020 U+12ED U+1235 U+1325 U+120D U+1295).  This text displays correctly when the font profile is first installed, in some situations, and always displayed correctly in iOS 7.0.   The issue first arose in iOS 7.1 and has continued into the iOS 8.0 beta.

References:

Bug #2: Touches on fixed elements in Safari are offset vertically

In Safari in iOS 8.0 beta 1, I have found that touching fixed elements often results in a touch which is 200-odd pixels north of the actual location I touch.  No doubt plenty of people will report this one!

Using Delphi attributes to unify source, test and documentation

Updated 28 May 2014: Removed extraneous unit reference from AttributeValidation.pas. Sorry…

What problem was I trying to solve?

Recently, while using the Indy Internet components in Delphi XE2, I was struggling to track the thread contexts in which certain code paths ran, to ensure that resource contention and deadlocks were correctly catered for.

Indy components are reasonably robust, but use a multithreaded model which it turns out is difficult to get 100% correct.  Component callbacks can occur on many different threads:

  • The thread that constructed the component
  • The VCL thread
  • The server listener thread
  • The connection’s thread
  • Some, e.g. exceptions, can occur on any thread

Disentangling this, especially when in conjunction with third party solutions that are based on Indy and may add several layers of indirection, quickly becomes an unenjoyable task.

I started adding thread validation assertions to each function to ensure that I was (a) understanding which thread context the function was actually running in, and (b) to ensure that I didn’t call the function in the wrong context myself.  However, when browsing the code, it was still very difficult to get a big picture view of thread usage.

Introducing attributes

Enter attributes.  Delphi 2010 introduced support for attributes in Win32, and a nice API to query them with extended Run Time Type Information (RTTI).  This is nice, except for one thing: it’s difficult at runtime to find the RTTI associated with the current method.

In this unit, I have tried to tick a number of boxes:

  • Create a simple framework for extending runtime testing of classes with attributes
  • Use attributes to annotate methods, in this case about thread safety, to optimise self-documentation
  • Keep a single, consistent function call in each member method, to test any attributes associated with that method.
  • Sensible preprocessor use to enable and disable both the testing and full RTTI in one place.

One gotcha is that by default, RTTI for Delphi methods is only available for public and published member methods.  This can be changed with the $RTTI compiler directive but you have to remember to do it in each unit!  I have used a unit-based $I include in order to push the correct RTTI settings consistently.

I’ve made use of Delphi’s class helper model to give direct access to any object at compile time.  This is a clean way of injecting this support into all classes which are touched by the RTTI, but does create larger executables.  I believe this to be a worthwhile tradeoff.

Example code

The code sample below demonstrates how to use the attribute tests in a multi-threaded context. In this example, an assertion will be raised soon after cmdDoSomeHardWorkClick is called. Why is this? It happens because the HardWorkCallback function on the main thread is annotated with [MainThread] attribute, but it will be called from TSomeThread‘s thread context, not the main thread.

In order for the program run without an assertion, you could change the annotation of HardWorkCallback to [NotMainThread]. Making this serves as an immediate prompt that you should not be accessing VCL properties, because you are no longer running on the main thread. In fact, unless you can prove that the lifetime of the form will exceed that of TSomeThread, you shouldn’t even be referring to the form. The HardWorkCallback function here violates these principles by referring to the Handle property of TForm. However, because we can show that the form is destroyed after the thread exits, it’s safe to make the callback to the TAttrValForm object itself.

You can download the full source for this project from the link at the bottom of this post in order to compile it and run it yourself.

Exercise: How could you restructure this to make HardWorkCallback thread-safe? There’s more than one way to skin this cat.

unit AttrValSample;

interface

uses
  System.Classes,
  System.SyncObjs,
  System.SysUtils,
  System.Variants,
  Vcl.Controls,
  Vcl.Dialogs,
  Vcl.Forms,
  Vcl.Graphics,
  Vcl.StdCtrls,
  Winapi.Messages,
  Winapi.Windows,

  {$I AttributeValidation.inc};

type
  TSomeThread = class;

  TAttrValForm = class(TForm)
    cmdStartThread: TButton;
    cmdDoSomeHardWork: TButton;
    cmdStopThread: TButton;
    procedure cmdStartThreadClick(Sender: TObject);
    procedure FormDestroy(Sender: TObject);
    procedure cmdStopThreadClick(Sender: TObject);
    procedure cmdDoSomeHardWorkClick(Sender: TObject);
  private
    FThread: TSomeThread;
  public
    [MainThread] procedure HardWorkCallback;
  end;

  TSomeThread = class(TThread)
  private
    FOwner: TAttrValForm;
    FEvent: TEvent;
    [NotMainThread] procedure HardWork;
  protected
    [NotMainThread] procedure Execute; override;
  public
    [MainThread] constructor Create(AOwner: TAttrValForm);
    [MainThread] destructor Destroy; override;
    [MainThread] procedure DoSomeHardWork;
  end;

var
  AttrValForm: TAttrValForm;

implementation

{$R *.dfm}

procedure TAttrValForm.cmdStartThreadClick(Sender: TObject);
begin
  FThread := TSomeThread.Create(Self);

  cmdDoSomeHardWork.Enabled := True;
  cmdStopThread.Enabled := True;
  cmdStartThread.Enabled := False;
end;

procedure TAttrValForm.cmdDoSomeHardWorkClick(Sender: TObject);
begin
  FThread.DoSomeHardWork;
end;

procedure TAttrValForm.cmdStopThreadClick(Sender: TObject);
begin
  FreeAndNil(FThread);
  cmdDoSomeHardWork.Enabled := False;
  cmdStopThread.Enabled := False;
  cmdStartThread.Enabled := True;
end;

procedure TAttrValForm.FormDestroy(Sender: TObject);
begin
  FreeAndNil(FThread);
end;

procedure TAttrValForm.HardWorkCallback;
begin
  ValidateAttributes;
  SetWindowText(Handle, 'Hard work done');
end;

{ TSomeThread }

constructor TSomeThread.Create(AOwner: TAttrValForm);
begin
  ValidateAttributes;
  FEvent := TEvent.Create(nil, False, False, '');
  FOwner := AOwner;
  inherited Create(False);
end;

destructor TSomeThread.Destroy;
begin
  ValidateAttributes;
  if not Terminated then
  begin
    Terminate;
    FEvent.SetEvent;
    WaitFor;
  end;
  FreeAndNil(FEvent);
  inherited Destroy;
end;

procedure TSomeThread.DoSomeHardWork;
begin
  ValidateAttributes;
  FEvent.SetEvent;
end;

procedure TSomeThread.Execute;
begin
  ValidateAttributes;
  while not Terminated do
  begin
    if FEvent.WaitFor = wrSignaled then
      if not Terminated then
        HardWork;
  end;
end;

procedure TSomeThread.HardWork;
begin
  ValidateAttributes;
  FOwner.HardWorkCallback;
end;

end.

The AttributeValidation.inc file referenced in the uses clause above controls RTTI and debug settings, in one line. This pattern makes it easy to use the unit without forgetting to set the appropriate RTTI flags in one unit.

// Disable the following $DEFINE to remove all validation from the project
// You may want to do this with {$IFDEF DEBUG} ... {$ENDIF}
{$DEFINE ATTRIBUTE_DEBUG}

// Shouldn't need to touch anything below here
{$IFDEF ATTRIBUTE_DEBUG}
{$RTTI EXPLICIT METHODS([vcPrivate,vcProtected,vcPublic,vcPublished])}
{$ENDIF}

// This .inc file is also included from AttributeValidation.pas, so
// don't use it again in that context.
{$IFNDEF ATTRIBUTE_DEBUG_UNIT}
AttributeValidation
{$ENDIF}

Finally, the AttributeValidation.pas file itself contains the assembly stub to capture the return address for the caller, and the search through the RTTI for the appropriate method to test in each case. This will have a performance cost so should really only be present in Debug builds.

unit AttributeValidation;

interface

{$DEFINE ATTRIBUTE_DEBUG_UNIT}
{$I AttributeValidation.inc}

uses
  System.Rtti;

type
  // Base class for all validation attributes
  ValidationAttribute = class(TCustomAttribute)
    function Execute(Method: TRTTIMethod): Boolean; virtual;
  end;

  // Will log to the debug console whenever a deprecated
  // function is called
  DeprecatedAttribute = class(ValidationAttribute)
    function Execute(Method: TRTTIMethod): Boolean; override;
  end;

  // Base class for all thread-related attributes
  ThreadAttribute = class(ValidationAttribute);

  // This indicates that the procedure can be called from
  // any thread.  No test to pass, just a bare attribute
  ThreadSafeAttribute = class(ThreadAttribute);

  // This indicates that the procedure must only be called
  // in the context of the main thread
  MainThreadAttribute = class(ThreadAttribute)
    function Execute(Method: TRTTIMethod): Boolean; override;
  end;

  // This indicates that the procedure must only be called
  // in another thread context.
  NotMainThreadAttribute = class(ThreadAttribute)
    function Execute(Method: TRTTIMethod): Boolean; override;
  end;

  TAttributeValidation = class helper for TObject
{$IFDEF ATTRIBUTE_DEBUG}
  private
    procedure IntValidateAttributes(FReturnAddress: UIntPtr);
{$ENDIF}
  protected
    procedure ValidateAttributes;
  end;

implementation

uses
  Winapi.Windows,

  classes;

{ TAttributeValidation }

{
 Function:    TAttributeValidation.ValidateAttributes

 Description: Save the return address to an accessible variable
              on the stack.  We could do this with pure Delphi and
              some pointer jiggery-pokery, but this is cleaner.
}
{$IFNDEF ATTRIBUTE_DEBUG}
procedure TAttributeValidation.ValidateAttributes;
begin
end;
{$ELSE}
{$IFDEF CPUX64}
procedure TAttributeValidation.ValidateAttributes;
asm
  push rbp
  sub  rsp, $20
  mov  rbp, rsp
                          // rcx = param 1; will already be pointing to Self.
  mov  rdx, [rbp+$28]     // rdx = param 2; rbp+$28 is return address on stack
  call TAttributeValidation.IntValidateAttributes;

  lea  rsp, [rbp+$20]
  pop  rbp
end;
{$ELSE}
procedure TAttributeValidation.ValidateAttributes;
asm
                            // eax = Self
  mov edx, dword ptr [esp]  // edx = parameter 1
  call TAttributeValidation.IntValidateAttributes
end;
{$ENDIF}

{
 Function:    TAttributeValidation.IntValidateAttributes

 Description: Find the closest function to the return address,
              and test the attributes in that function.  Assumes
              that the closest function is the correct one, so
              if RTTI is missing then you'll be in a spot of
              bother.
}
procedure TAttributeValidation.IntValidateAttributes(FReturnAddress: UIntPtr);
var
  FRttiType: TRttiType;
  FClosestRttiMethod, FRttiMethod: TRTTIMethod;
  FAttribute: TCustomAttribute;
begin
  with TRttiContext.Create do
  try
    FRttiType := GetType(ClassType);
    if not Assigned(FRttiType) then Exit;

    FClosestRttiMethod := nil;

    // Find nearest function for the return address
    for FRttiMethod in FRttiType.GetMethods do
    begin
      if (UIntPtr(FRttiMethod.CodeAddress) <= FReturnAddress) then
      begin
        if not Assigned(FClosestRttiMethod) or
            (UIntPtr(FRttiMethod.CodeAddress) > UIntPtr(FClosestRttiMethod.CodeAddress)) then
          FClosestRttiMethod := FRttiMethod;
      end;
    end;

    // Check attributes for the function
    if Assigned(FClosestRttiMethod) then
    begin
      for FAttribute in FClosestRttiMethod.GetAttributes do
      begin
        if FAttribute is ValidationAttribute then
        begin
          if not (FAttribute as ValidationAttribute).Execute(FClosestRttiMethod) then
          begin
            Assert(False, 'Attribute '+FAttribute.ClassName+' did not validate on '+FClosestRttiMethod.Name);
          end;
        end;
      end;
    end;
  finally
    Free;
  end;
end;
{$ENDIF}

{ ValidationAttribute }

function ValidationAttribute.Execute(Method: TRTTIMethod): Boolean;
begin
  Result := True;
end;

{ MainThreadAttribute }

function MainThreadAttribute.Execute(Method: TRTTIMethod): Boolean;
begin
  Result := GetCurrentThreadID = MainThreadID;
end;

{ NotMainThreadAttribute }

function NotMainThreadAttribute.Execute(Method: TRTTIMethod): Boolean;
begin
  Result := GetCurrentThreadID <> MainThreadID;
end;

{ DeprecatedAttribute }

function DeprecatedAttribute.Execute(Method: TRTTIMethod): Boolean;
begin
  OutputDebugString(PChar(Method.Name + ' was called.'#13#10));
  Result := True;
end;

end.

There you have it — a “real” use case for attributes in Delphi. The key advantages I see to this approach, as opposed to, say function-level assertions, is that a birds-eye view of your class will help you to understand the preconditions for each member function, and these preconditions can be consistently and simply tested.

Using a class helper makes it easy to inject the additional functionality into every class that is touched by attribute validation, without polluting the class hierarchy. This means that attribute tests can be seamlessly added to existing infrastructure and Delphi child classes such as TForm.

Full source: AttrVal.zip. License: MPL 2.0. YMMV and use at your own risk.

The Beautiful City of Software

A new frenzy grips the architects, the builders, the carpenters, the painters. The buildings must be changed, must grow, now, now, today. And so they scurry, nailing on curlicues and raising floors, tearing down this staircase, putting up this ladder, and at the end of the day they step back, look up, shake hands and agree to do it again tomorrow, now, now!

GHA40226In the midst of the twisted roadways runs the river, and across its waters lies a bridge. Call it London Bridge. Not designed. Just happened. And always growing, this way and that way, a feature here, don’t like that one there any more, should bring this railing up to spec, cries the engineer, whilst beside him the others hammer together the new houses that crowd the bridge’s fragile shoulders, and yet again it crumbles, down into the rushing waters, patched even as it falls, and saved at the last moment by the railing that the engineer brought up to spec. But touch not the railing now, lest the whole bridge collapse. Heedlessly, the crowds continue to cross the bridge.

Nestled amongst the towers of this city is a little house. Built by yours truly, it has gables and stands proudly on its own foundations. No one knows how I mixed the concrete, how I discovered for myself the secret formulas of the masons. For now it stands, mirroring the towering edifices surrounding it, calling for its own moment in the light. Crudely, yet lovingly, its facets are shaped, aping the towers’ gleaming edges.

For none can see the bones of those towers now, save in the dreams, nay horrors, of the men who built them. Carefully, the gleaming panels were draped over, and hid the gross deformities beneath a respectable skin. The towers reach skyward, bastions of the city, and all seek to build their own towers in homage to them.

None can see the bones? I speak falsely. There are those who live beneath the surface of the living, creaking city. They crawl inside the hidden and forgotten ways, and learn its secrets, for good, for evil and for love of learning secrets. Some, graspers, take their knowledge, and shake the towers with it, as the owners rush to protect and rebuild, patching the bones with sticking plasters and casts painted in cheerful colours.

No one notices the bones of my little house. Bones no better than those of the towers, if a little smaller.

In the University, I discover how to build a crystal palace, beautiful, fragile and empty, devoid of purpose. Perfect in every way except one. For it has no doors and doors cannot be added. I cannot take the bones from the palace and put them into my house. The crystal bones resile from my rough-hewn timber tresses. They shatter.

I hear the men building in a frenzy and the monster grips me too. I rush from room to room of my house, desperate for change and fame and wealth, shifting this, nailing that, never noticing the damage I wreak until out of breath I stop and look back, just in time, recoiling as I realise how close I came to losing my soul. I run from my house, shaking off the claws of the monster as it howls impotently at me, you’ll get left behind!

Down in the market, I wander from stall to stall. Buy this paint! Use these magic bones: make your house into a tower! Be noticed! My house must be festooned with gargoyles to protect those who enter from the crawlers beneath the city’s skin! The noise is unsettling, the message now bland and tasteless. The graspers watch me walking through, asking themselves if I have anything of worth.

Beyond the market lies the city hall, where the men of import gather. I spin a tale of the beauty of my house, desperate to be noticed, and how strong its bones, how elegant its gables. One man turns and sees me, offers wealth beyond my dreams. But inside my heart I now know he offers only the chance to take my house, my pride, for himself, and tear it apart and spread the best of its blocks amongst his towers. And so I reject him, and again I flee.

But then I find the man in the corner of the market. He has no charms to sell me. Instead he tells me of those who still secretly live in the city, building houses with pride, each more robust and trustworthy than the last, and though sometimes they look toward the gleaming edifices wistfully, yet they themselves were once crawlers beneath the surface, for the love of learning secrets. These men and women are gathering, slowly, he tells me, into a guild. A guild that will protect and honour and create buildings that last, unlike those on the bridge, crumbling and tumbling even now, unlike the towers, gleaming and perfect and rotten to the core.

This time of growth and pain and foolishness must be endured, but it shall pass. The wise men of the University shall join us, he proclaims, and together we shall build with beauty and strength. Gradually the towers shall each fail and fall and be replaced by virtuous buildings of grace, beauty and strength, built with love and care for those who live inside them.

I ask if I may join their guild, and ungrudgingly he bids me welcome, and willingly I set myself to learn.

Generics and Delphi enumerated types without RTTI

Some Delphi types do not have RTTI. This is no fun. This happens when, and I quote:

whereas enumerated constants with a specific value, such as the following, do not have RTTI:
type SomeEnum = (e1 = 1, e2 = 2, e3 = 3);

In normal use, this will go unnoticed, and not cause you any grief, until you throw these enumerated types into a generic construct (or have any other need to use RTTI). As soon as you do that, you’ll start getting the unhelpful and misleading “Invalid Class Typecast” exception. (No it’s not a Class!)

To avoid this problem, you must wander into the dark world of pointer casting, because once you are pointing at some data, Delphi no longer cares what its actual type is.

Here’s an example of how to convert a Variant value into a generic type, including support for RTTI-free enums, in a reasonably type-safe way. This is part of a TNullable record type, which mimics, in some ways, the .NET Nullable type. The workings of this type are not all that important for the example, however. This example works with RTTI types, and with one byte non-RTTI enumerated types &mdash you’d need to extend it to support larger enumerated types. While I could reduce the number of steps in the edge case by spelunking directly into the Variant TVarData, that would not serve to clarify the murk.

constructor TNullable<T>.Create(AValue: Variant);
type
  PT = ^T;
var
  v: Byte;
begin
  if VarIsEmpty(AValue) or VarIsNull(AValue) then
    Clear
  else if (TypeInfo(T) = nil) and
    (SizeOf(T) = 1) and
    (VarType(AValue) = varByte) then
  begin
    { Assuming an enum type without typeinfo, have to
      do some cruel pointer magics here to avoid type
      cast errors, so am very careful to validate
      first! }
    v := AValue;
    FValue := PT(@v)^;
  end
  else
    Create(TValue.FromVariant(AValue).AsType<T>);
end;

So what is going on here? Well, first if we are passed Null or “Empty” variant values, then we just clear our TNullable value.

Otherwise we test if (a) we have no RTTI for our generic, and (b) it’s one byte in size, and (c) our variant is also a Byte value. If all these prerequisites are met, we perform the casting, in which we hark back to the ancient incantations with a pointer typecast, taking the address of the value and dereferencing it, fooling the compiler along the way. (Ha ha!)

Finally, we find a modern TValue incantation suffices to wreak the type change for civilised types such as Integer or String.

Testing for design time in Delphi, in an initialization section

Sometimes it can be handy to test for design-time in a component unit when the component package is first loaded, e.g. within an initialization section, rather than when a component is created or registered. We use this to validate that runtime units that interoperate with a component are linked into a project, and raise an error as early as possible if they are not.

With Delphi’s RTTI, this is fairly straightforward, I believe:

function IsDesignTime: Boolean;
begin
  Result := TRttiContext.Create.FindType('ToolsAPI.IBorlandIDEServices') <> nil;
end;

Is there anything wrong with this?

Finding class instances in a Delphi process using WinDbg

Using WinDbg to debug Delphi processes can be both frustrating and rewarding. Frustrating, because even with the tools available to convert Delphi’s native .TDS symbol file format into .DBG or .PDB, we currently only get partial symbol information. But rewarding when you persist, because even though it may seem obscure and borderline irrational, once you get a handle on the way objects and Run Time Type Information (RTTI) are implemented with Delphi, you can accomplish a lot, quite easily.

For the post today, I’ve created a simple Delphi application which we will investigate in a couple of ways. If you want to follow along, you’ll need to build the application and convert the debug symbols generated by Delphi to .DBG format with map2dbg or tds2dbg. I’ll leave the finer details of that to you — it’s not very complicated. Actually, to save effort, I’ve uploaded both the source, and the debug symbols + dump + executable (24MB zip).

I’ve made reference to a few Delphi internal constants in this post. These are defined in System.pas, and I’m using the constants as defined for Delphi XE2. The values may be different in other versions of Delphi.

In the simple Delphi application, SpelunkSample, I will be debugging a simulated crash. You can choose to either attach WinDbg to the process while it is running, or to create a crash dump file using a tool such as procdump.exe and then working with the dump file. If you do choose to create a dump file, you should capture the full process memory dump, not just stack and thread information (use -ma flag with procdump.exe).

I’ll use procdump.exe. First, I use tds2dbg.exe to convert the symbols into a format that WinDbg groks:

Convert Delphi debug symbols
Convert Delphi debug symbols

Then I just fire up the SpelunkSample process and click the “Do Something” button.
Clicking "Do Something"
Clicking “Do Something”

Next, I use procdump to capture a dump of the process as it stands. This generates a rather large file, given that this is not much more than a “Hello World” application, but don’t stress, we are not going to be reading the whole dump file in hex (only parts of it).
Procdump to give us something to play with
Procdump to give us something to play with

Time to load the dump file up in Windbg.

I want to understand what is going wrong with the process (actually, nothing is going wrong, but bear with me). I figure it’s important to know which forms are currently instantiated. This is conceptually easy enough to do: Delphi provides the TScreen class, which is instantiated as a global singleton accessible via the Screen variable in Vcl.Forms.pas. If we load this up, we can see a member variable FForms: TList, which contains references to all the forms “on the screen”.

TScreen = class(TComponent)
private
  FFonts: TStrings;
  FImes: TStrings;
  FDefaultIme: string;
  FDefaultKbLayout: HKL;
  FPixelsPerInch: Integer;
  FCursor: TCursor;
  FCursorCount: Integer;
  FForms: TList;
  FCustomForms: TList;
  ...

But how to find this object in a 60 megabyte dump file? In fact, there are two good methods: use Delphi’s RTTI and track back; and use the global screen variable and track forward. I’ll examine them both, because they both come in handy in different situations.

Finding objects using Delphi’s RTTI

Using Delphi’s Run Time Type Information (RTTI), we can find the name of the class in memory and then track back from that. This information is in the process image, which is mapped into memory at a specific address (by default, 00400000 for Delphi apps, although you can change this in Linker options). So let’s find out where this is mapped:

0:000> lmv m SpelunkSample
start    end        module name
00400000 00b27000   SpelunkSample   (deferred)             
    Image path: C:\Users\mcdurdin\Documents\SpelunkSample\Win32\Debug\SpelunkSample.exe
    Image name: SpelunkSample.exe
    Timestamp:        Tue Dec 10 09:19:01 2013 (52A641D5)
    CheckSum:         0071B348
    ImageSize:        00727000
    File version:     1.0.0.0
    Product version:  1.0.0.0
    File flags:       0 (Mask 3F)
    File OS:          4 Unknown Win32
    File type:        1.0 App
    File date:        00000000.00000000
    Translations:     0409.04e4
    ProductVersion:   1.0.0.0
    FileVersion:      1.0.0.0

Now we can search this memory for a specific ASCII string, the class name TScreen. When searching through memory, it’s important to be aware that this is just raw memory. So false positives are not uncommon. If you are unlucky, then the data you are searching for could be repeated many times through the dump, making this task virtually impossible. In practice, however, I’ve found that this rarely happens.

With that in mind, let’s do using the s -a command:

0:000> s -a 0400000 00b27000 "TScreen"
004f8f81  54 53 63 72 65 65 6e 36-00 90 5b 50 00 06 43 72  TScreen6..[P..Cr
004f9302  54 53 63 72 65 65 6e e4-8b 4f 00 f8 06 44 00 02  TScreen..O...D..
00a8e926  54 53 63 72 65 65 6e 40-24 62 63 74 72 24 71 71  TScreen@$bctr$qq
00a8ea80  54 53 63 72 65 65 6e 40-24 62 64 74 72 24 71 71  TScreen@$bdtr$qq
00a8ea9f  54 53 63 72 65 65 6e 40-47 65 74 48 65 69 67 68  TScreen@GetHeigh
00a8eac2  54 53 63 72 65 65 6e 40-47 65 74 57 69 64 74 68  TScreen@GetWidth
00a8eae4  54 53 63 72 65 65 6e 40-47 65 74 44 65 73 6b 74  TScreen@GetDeskt
00a8eb0b  54 53 63 72 65 65 6e 40-47 65 74 44 65 73 6b 74  TScreen@GetDeskt
00a8eb33  54 53 63 72 65 65 6e 40-47 65 74 44 65 73 6b 74  TScreen@GetDeskt
00a8eb5d  54 53 63 72 65 65 6e 40-47 65 74 44 65 73 6b 74  TScreen@GetDeskt
00a8eb86  54 53 63 72 65 65 6e 40-47 65 74 4d 6f 6e 69 74  TScreen@GetMonit

00ada300  54 53 63 72 65 65 6e 40-43 6c 65 61 72 4d 6f 6e  TScreen@ClearMon
00ada32b  54 53 63 72 65 65 6e 40-47 65 74 4d 6f 6e 69 74  TScreen@GetMonit
00ada354  54 53 63 72 65 65 6e 40-47 65 74 50 72 69 6d 61  TScreen@GetPrima

Whoa, that’s a lot of data. Looking at the results though, there are two distinct ranges of memory: 004F#### and 00A#####. Those in the 00A##### range are actually Delphi’s native debug symbols, mapped into memory. So I can ignore those. To keep myself sane, and make the debug console easier to review, I’ll rerun the search for a smaller range:

0:000> s -a 0400000 00a80000 "TScreen"
004f8f81  54 53 63 72 65 65 6e 36-00 90 5b 50 00 06 43 72  TScreen6..[P..Cr
004f9302  54 53 63 72 65 65 6e e4-8b 4f 00 f8 06 44 00 02  TScreen..O...D..

Now, these two references are close together, and I will tell you that the first one is the one we want. Generally speaking, the first one is in the class metadata, and the second one is not important today. Now that we have that "TScreen" string found in memory, we need to go back 1 byte. Why? Because "TScreen" is a Delphi ShortString, which is a string up to 255 bytes long, implemented as a length:byte followed by data (ANSI chars). And then we search for a pointer to that memory location with the s -d command:

0:000> s -d 0400000 00a80000 004f8f80
004f8bac  004f8f80 000000bc 0043ff28 00404ff4  ..O.....(.C..O@.

Only one reference, nearby in memory, which is expected — the class metadata is generally stored nearby the class implementation. Now this is where it gets a little brain-bending. This pointer is stored in Delphi’s class metadata, as I said. But most this metadata is actually stored in memory before the class itself. Looking at System.pas, in Delphi XE2 we have the following metadata for x86:

  vmtSelfPtr           = -88;
  vmtIntfTable         = -84;
  vmtAutoTable         = -80;
  vmtInitTable         = -76;
  vmtTypeInfo          = -72;
  vmtFieldTable        = -68;
  vmtMethodTable       = -64;
  vmtDynamicTable      = -60;
  vmtClassName         = -56;
  vmtInstanceSize      = -52;
  vmtParent            = -48;
  vmtEquals            = -44 deprecated 'Use VMTOFFSET in asm code';
  vmtGetHashCode       = -40 deprecated 'Use VMTOFFSET in asm code';
  vmtToString          = -36 deprecated 'Use VMTOFFSET in asm code';
  vmtSafeCallException = -32 deprecated 'Use VMTOFFSET in asm code';
  vmtAfterConstruction = -28 deprecated 'Use VMTOFFSET in asm code';
  vmtBeforeDestruction = -24 deprecated 'Use VMTOFFSET in asm code';
  vmtDispatch          = -20 deprecated 'Use VMTOFFSET in asm code';
  vmtDefaultHandler    = -16 deprecated 'Use VMTOFFSET in asm code';
  vmtNewInstance       = -12 deprecated 'Use VMTOFFSET in asm code';
  vmtFreeInstance      = -8 deprecated 'Use VMTOFFSET in asm code';
  vmtDestroy           = -4 deprecated 'Use VMTOFFSET in asm code';

Ignore that deprecated noise — it’s the constants that we want to know about. So the vmtClassName is at offset -56 (-38 hex). In other words, to find the class itself, we need to look 56 bytes ahead of the address of that pointer that we just found. That is, 004f8bac + 38h = 004f8be4. Now, if I use the dds (display words and symbols) command, we can see pointers to the implementation of each of the class’s member functions:

0:000> dds 004f8bac + 38
004f8be4  00445574 SpelunkSample!System.Classes.TPersistent.AssignTo
004f8be8  004515f8 SpelunkSample!System.Classes.TComponent.DefineProperties
004f8bec  004454a4 SpelunkSample!System.Classes.TPersistent.Assign
004f8bf0  004516f0 SpelunkSample!System.Classes.TComponent.Loaded
004f8bf4  00451598 SpelunkSample!System.Classes.TComponent.Notification
004f8bf8  00451700 SpelunkSample!System.Classes.TComponent.ReadState
004f8bfc  004520ac SpelunkSample!System.Classes.TComponent.CanObserve
004f8c00  004520b0 SpelunkSample!System.Classes.TComponent.ObserverAdded
004f8c04  00451f24 SpelunkSample!System.Classes.TComponent.GetObservers
004f8c08  00451b48 SpelunkSample!System.Classes.TComponent.SetName
004f8c0c  00452194 SpelunkSample!System.Classes.TComponent.UpdateRegistry
004f8c10  00451710 SpelunkSample!System.Classes.TComponent.ValidateRename
004f8c14  00451708 SpelunkSample!System.Classes.TComponent.WriteState
004f8c18  0045219c SpelunkSample!System.Classes.TComponent.QueryInterface
004f8c1c  00505b90 SpelunkSample!Vcl.Forms.TScreen.Create
004f8c20  00452070 SpelunkSample!System.Classes.TComponent.UpdateAction
004f8c24  0000000e
004f8c28  00010000
004f8c2c  12880000
004f8c30  00400040 SpelunkSample+0x40
004f8c34  00000000
004f8c38  00000000
004f8c3c  1800001d
004f8c40  3800439d
004f8c44  06000000
004f8c48  6e6f4646
004f8c4c  00027374
004f8c50  439d1800
004f8c54  00003c00
004f8c58  49460500
004f8c5c  0273656d
004f8c60  12880000

Huh. That’s interesting, but it’s a sidetrack; we can see TScreen.Create which suggests we are looking at the right thing. There’s a whole lot more buried in there but it’s not for this post. Let’s go back to where we were.

How do we take that class address and find instances of the class? I’m sure you can see where we are going. But here’s where things change slightly: we are looking in allocated memory now, not just the process image. So our search has to broaden. Rather than go into the complexities of memory allocation, I’m going to go brute force and look across a much larger range of memory, using the L? search parameter (which allows us to search more than 256MB of data at once):

0:000> s -d 00400000 L?F000000 004f8be4
004f8b8c  004f8be4 00000000 00000000 004f8c24  ..O.........$.O.
0247b370  004f8be4 00000000 00000000 00000000  ..O.............

Only two references. Why two and not one, given that we know that TScreen is a singleton? Well, because Delphi helpfully defines a vmtSelf metadata member, at offset -88 (and if we do the math, we see that 004f8be4 - 004f8b8c = 58h = 88d). So let’s look at the second one. That’s our TScreen instance in memory.

In this case, there was only one instance. But you can sometimes pickup objects that have been freed but where the memory has not been reused. There’s no hard and fast way (that I am aware of) of identifying these cases — but using the second method of finding a Delphi object, described below, can help to differentiate.

I’ll come back to how we use this object memory shortly. But first, here’s another way of getting to the same address.

Finding a Delphi object by variable or reference

As we don’t have full debug symbol information at this time, it can be difficult to find variables in memory. For global variables, however, we know that the location is fixed at compile time, and so we can use the disassembler in WinDbg to locate the address relatively simply. First, look in the source for a reference to the Screen global variable. I’ve found it in the FindGlobalComponent function (ironically, that function is doing programatically what we are doing via the long and labourious manual method):

function FindGlobalComponent(const Name: string): TComponent;
var
  I: Integer;
begin
  for I := 0 to Screen.FormCount - 1 do
  begin
    ...

So, disassemble the first few lines of the function. Depending on the conversion tool you used, the symbol format may vary (x spelunksample!*substring* can help in finding symbols).

0:000> u SpelunkSample!Vcl.Forms.FindGlobalComponent
SpelunkSample!Vcl.Forms.FindGlobalComponent:
004fcda8 53              push    ebx
004fcda9 56              push    esi
004fcdaa 57              push    edi
004fcdab 55              push    ebp
004fcdac 8be8            mov     ebp,eax
004fcdae a100435200      mov     eax,dword ptr [SpelunkSample!Spelunksample.initialization+0xb1ac (00524300)]
004fcdb3 e81c910000      call    SpelunkSample!Vcl.Forms.TScreen.GetFormCount (00505ed4)
004fcdb8 8bf0            mov     esi,eax

The highlighted address there corresponds to the Screen variable. The initialization+0xb1ac rubbish suggests missing symbol information, because (a) it doesn’t make much sense to be pointing to the “initialization” code, and (b) the offset is so large. And in fact, that is the case, we don’t have symbols for global variables at this time (one day).

But because we know this, we also know that 00524300 is the address of the Screen variable. The variable, which is a pointer, not the object itself! But because it’s a pointer, it’s easy to get to what it’s pointing to!

0:000> dd 00524300 L1
00524300  0247b370

Look familiar? Yep, it’s the same address as we found the RTTI way, and somewhat more quickly too. But now on to finding the list of forms!

Examining object members

Let’s dump that TScreen instance out and annotate its members. The symbols below I’ve manually added to the data, by looking at the implementation of TComponent and TScreen. I’ve also deleted some misleading annotations that Windbg added.

0:000> dds poi(00524300)
0247b370  004f8be4 TScreen
0247b374  00000000 TComponent.FOwner
0247b378  00000000 TComponent.FName
0247b37c  00000000 TComponent.FTag
0247b380  00000000 TComponent.FComponents
0247b384  00000000 TComponent.FFreeNotifies
0247b388  00000000 TComponent.FDesignInfo
0247b38c  00000000 TComponent.FComponentState
0247b390  00000000 TComponent.FVCLComObject
0247b394  00000000 TComponent.FObservers
0247b398  00000001 TComponent.FComponentStyle
0247b39c  00000000 TComponent.FSortedComponents
0247b3a0  0043fec8 
0247b3a4  0043fed8 
0247b3a8  00000000 TScreen.FFonts
0247b3ac  024b4e10 TScreen.FImes
0247b3b0  00000000 TScreen.FDefaultIme
0247b3b4  04090c09 TScreen.FDefaultKbLayout
0247b3b8  00000060 TScreen.FPixelsPerInch
0247b3bc  00000000 TScreen.FCursor
0247b3c0  00000000 TScreen.FCursorCount
0247b3c4  02489da8 TScreen.FForms
0247b3c8  02489dc0 ...

How did I map that? It’s not that hard — just look at the class definitions in the Delphi source. You do need to watch out for two things: packing, and padding. x86 processors expect variables to be aligned on a boundary of their size, so a 4 byte DWORD will be aligned on a 4 byte boundary. Conversely, a boolean only takes a byte of memory, and multiple booleans can be packed into a single DWORD. Delphi does not do any ‘intelligent’ reordering of object members (which makes life a lot simpler), so this means we can just map pretty much one-to-one. The TComponent object has the following member variables (TPersistent and TObject don’t have any member variables):

  TComponent = class(TPersistent, IInterface, IInterfaceComponentReference)
  private
    FOwner: TComponent;
    FName: TComponentName;
    FTag: NativeInt;
    FComponents: TList;
    FFreeNotifies: TList;
    FDesignInfo: Longint;
    FComponentState: TComponentState;
    FVCLComObject: Pointer;
    FObservers: TObservers;
    ...
    FComponentStyle: TComponentStyle;
    ...
    FSortedComponents: TList;

And TScreen has the following (we’re only interested in the members up to and including FForms):

  TScreen = class(TComponent)
  private
    FFonts: TStrings;
    FImes: TStrings;
    FDefaultIme: string;
    FDefaultKbLayout: HKL;
    FPixelsPerInch: Integer;
    FCursor: TCursor;
    FCursorCount: Integer;
    FForms: TList;
    ...

Let’s look at 02489da8, the FForms TList object. The first member variable of TList is FList: TPointerList. Knowing what we do about the object structure, we can:

0:000>dd 02489da8 L4
02489da8  004369e8 02482da8 00000001 00000004

It can be helpful to do a sanity check here and make sure that we haven’t gone down the wrong rabbit hole. Let’s check that this is actually a TList (poi deferences a pointer, but you should be able to figure the rest out given the discussion above):

0:000> da poi(004369e8-38)+1
00436b19  "TList'"

And yes, it is a TList, so we haven’t dereferenced the wrong pointer. All too easy to do in the dark cave that is assembly-language debugging. Back to the lead. We can see from the definition of TList:

  TList = class(TObject)
  private
    FList: TPointerList;
    FCount: Integer;
    FCapacity: Integer;
    ...

That we have a pointer to 02482da8 which is our list of form pointers, and a count of 00000001 form. Sounds good. Take a quick peek at that form:

0:000> dd poi(02482da8) L1
02444320  005112b4
0:000> da poi(poi(poi(02482da8))-38)+1
0051148e  "TSpelunkSampleForm."

Yes, it’s our form! But what is with that poi poi poi? Well, I could have dug down each layer one step at a time, but this is a shortcut, in one swell foop dereferencing the variable, first to the object, then dereferencing to the class, then back 38h bytes and dereferencing to the class name, and plus one byte for that ShortString hiccup. Saves time, and once familiar you can turn it into a WinDbg macro. But it’s helpful to be familiar with the structure first!

Your challenge

Your challenge now is to list each of the TMyObject instances currently allocated. I’ve added a little spice: one of them has been freed but some of the data may still be in the dump. So you may find it is not enough to just use RTTI to find the data — recall that the search may find false positives and freed instances. You should find that searching for RTTI and also disassembling functions that refer to member variables in the form are useful. Good luck!

Hint: If you are struggling to find member variable offsets to find the list, the following three lines of code from FormCreate may help (edx ends up pointing to the form instance):

0051168f e87438efff      call    SpelunkSample!System.TObject.Create (00404f08)
00511694 8b55fc          mov     edx,dword ptr [ebp-4]
00511697 898294030000    mov     dword ptr [edx+394h],eax

Delphi’s TJSONString.ToString is broken, and how to fix it

As per several QC reports, Data.DBXJSON.TJSONString.ToString is still very broken. Which means, for all intents and purposes, TJSONAnything.ToString is also broken. Fortunately, you can just use TJSONAnything.ToBytes for a happy JSON outcome.

The following function will take any Delphi JSON object and convert it to a string:

function JSONToString(obj: TJSONAncestor): string;
var
  bytes: TBytes;
  len: Integer;
begin
  SetLength(bytes, obj.EstimatedByteSize);
  len := obj.ToBytes(bytes, 0);
  Result := TEncoding.ANSI.GetString(bytes, 0, len);
end;

Because TJSONString.ToBytes escapes all characters outside U+0020-U+007F, we can assume that the end result is 7-bit clean, so we can use TEncoding.ANSI.  You could instead stream the TBytes to a file or do other groovy things with it.

Debugging a stalled Delphi process with Windbg and memory searches

Today I’ve got a process on my machine that is supposed to be exiting, but it has hung. Let’s load it up in Windbg and find what’s up. The program in question was built in Delphi XE2, and symbols were generated by our internal tds2dbg tool (but there are other tools online which create similar .dbg files). As usual, I am writing this up for my own benefit as much as anyone else’s, but if I put it on my blog, it forces me to put in enough detail that even I can understand it when I come back to it!

Looking at the main thread, we can see unit finalizations are currently being called, but the specific unit finalization section and functions which are being called are not immediately visible in the call stack, between InterlockedCompareExchange and FinalizeUnits:

0:000> kb
ChildEBP RetAddr  Args to Child              
0018ff3c 0040908c 0018ff78 0040909a 0018ff5c audit4_patient!System.Sysutils.InterlockedCompareExchange+0x5 [C:\Program Files (x86)\Embarcadero\RAD Studio\9.0\source\rtl\sys\InterlockedAPIs.inc @ 23]
0018ff5c 004094b2 0018ff88 00000000 00000000 audit4_patient!System.FinalizeUnits+0x40 [C:\Program Files (x86)\Embarcadero\RAD Studio\9.0\source\rtl\sys\System.pas @ 17473]
0018ff88 74b7336a 7efde000 0018ffd4 76f99f72 audit4_patient!System..Halt0+0xa2 [C:\Program Files (x86)\Embarcadero\RAD Studio\9.0\source\rtl\sys\System.pas @ 18599]
0018ff94 76f99f72 7efde000 35648d3c 00000000 kernel32!BaseThreadInitThunk+0xe
0018ffd4 76f99f45 01a5216c 7efde000 00000000 ntdll!__RtlUserThreadStart+0x70
0018ffec 00000000 01a5216c 7efde000 00000000 ntdll!_RtlUserThreadStart+0x1b

So, the simplest way to find out where we were was to step out of the InterlockedCompareExchange call. I found myself in System.SysUtils.DoneMonitorSupport (specifically, the CleanEventList subprocedure):

0:000> p
eax=01a8ee70 ebx=01a8ee70 ecx=01a8ee70 edx=00000001 esi=00000020 edi=01a26e80
eip=0042dcb1 esp=0018ff20 ebp=0018ff3c iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200202
audit4_patient!CleanEventList+0xd:
0042dcb1 33c9            xor     ecx,ecx

After a little more spelunking, and a review of the Delphi source around this function, I found that this was a part of the System.TMonitor support. Specifically, there was a locked TMonitor somewhere that had not been destroyed. I stepped through a loop that was spinning, waiting for the object to be unlocked so its handle could be destroyed, and found a reference to the data in question here:

0:000> p
eax=00000001 ebx=01a8ee70 ecx=01a8ee70 edx=00000001 esi=00000020 edi=01a26e80
eip=0042dcaf esp=0018ff20 ebp=0018ff3c iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00200202
audit4_patient!CleanEventList+0xb:
0042dcaf 8bc3            mov     eax,ebx

Looking at the record pointed to by ebx, we had a reference to an event handle handy:

0:000> dd ebx L2
01a8ee70  00000001 00000928

  TSyncEventItem = record
    Lock: Integer;
    Event: Pointer;
  end;

Although Event is a Pointer, internally it’s just cast from an event handle. So I guess that we can probably find another reference to that handle somewhere in memory, corresponding to a TMonitor record:

  TMonitor = record
  strict private
    // ... snip ...
    var
      FLockCount: Integer;
      FRecursionCount: Integer;
      FOwningThread: TThreadID;
      FLockEvent: Pointer;
      FSpinCount: Integer;
      FWaitQueue: PWaitingThread;
      FQueueLock: TSpinLock;

And if we search for that event handle:

0:000> s -[w]d 00400000 L?F000000 00000928
01a8ee74  00000928 00000000 0000092c 00000000  (.......,.......
07649334  00000928 002c0127 0012ccb6 0000002e  (...'.,.........
0764aa14  00000928 01200125 004b8472 000037ce  (...%. .r.K..7..
07651e24  00000928 05f60125 01101abc 00000e86  (...%...........
08a47544  00000928 00000000 00000000 00000000  (...............

Now one of these should correspond to a TMonitor record. The first entry (01a8ee74) is just part of our TSyncEventItem record, and the next three don’t make sense given that the FSpinCount (the next value in the memory dump) would be invalid. So let’s look at the last one. Counting quickly on all my fingers and toes, I establish that that makes 08a47538 the start of the TMonitor record. And… so we search for a pointer to that.

0:000> s -[w]d 00400000 L?F5687000 08a47538
076b1d24  08a47538 08aa3e40 076b1db1 0122fe50  8u..@>....k.P.".

Just one! But here it gets a little tricky, because the PMonitor pointer is in a ‘hidden’ field at the end of the object. So we need to locate the start of the object.

0:000> dd 076b1d00
076b1d00  0122fe50 00000000 00000000 076b1df1
076b1d10  0122fe50 00000000 00000000 076b0ce0
076b1d20  004015c8 08a47538 08aa3e40 076b1db1
076b1d30  0122fe50 00000000 00000000 076b1d51
... snip ...

I’m just stabbing in the dark here, but that 004015c8 that’s just four bytes back smells suspiciously like an object class pointer. Let’s see:

0:000> da poi(4015c8-38)+1
004016d7  "TObject&"

Ta da! That all fits. A TObject has no data members, so the next 4 bytes should be the TMonitor (search for hfMonitorOffset in the Delphi source to learn more). So we have a TObject being used as a TMonitor lock reference. (Learn about that poi(address-38)+1 magic). But what other naughty object is hanging about, using this TObject as its lock?

0:000> s -[w]d 00400000 L?F5687000 076b1d20
098194b0  076b1d20 00000000 00000000 09819449   .k.........I...

Just one. Let’s trawl back in memory just a little bit and have a look at this one.

0:000> dd 09819480  
09819480  00f0f0f0 00ffffff 00000000 09819641
09819490  00447b7c 00000000 00000000 00000000
098194a0  00000000 098150c0 00448338 098195c8
098194b0  076b1d20 00000000 00000000 09819449
... snip ...

0:000> da poi(00448338-38)+1
004483c0  "TThreadList&"

And what does a TThreadList look like?

  TThreadList = class
  private
    FList: TList;
    FLock: TObject;
    FDuplicates: TDuplicates;

Yes, that definitely looks hopeful! That FLock is pointing to our lock TObject… I believe that’s called a Quality Match.

This is still a little bit too generic for me, though. TThreadList is a standard Delphi class used by the bucketload. Let’s try and identify who is using this list and leaving it lying about. First, we’ll quickly have a look at that TThreadList.FList to see if it has anything of interest — that’s the first data member in the object == object+4.

0:000> dd poi(098194ac)
098195c8  00447b7c 00000000 00000000 00000000
... snip ...
0:000> da poi(447b7c-38)+1
00447cad  "TList'"

Yep, it’s a TList. Just making sure. It’s empty, what a shame (TList.FCount is the second data member in the object == 00000000, as is the list pointer itself).

So how else can we find the usage of that TThreadList? Is it TThreadList referenced anywhere then? Break out the search tool again!

0:000> s -[w]d 00400000 L?F5687000 098194a8
076d3410  098194a8 00000000 09819580 00000000  ................

Yes. Just once, again. Again, we scroll back in memory to find the base of that object.

0:000> dd 076d33c0  
076d33c0  08abfe60 00000000 00000000 076d4c39
076d33d0  01616964 0000091c 00001850 00000100
076d33e0  00000001 00000000 00000000 00000000
076d33f0  076d33d0 00000000 01616d38 076d33d0
076d3400  00000000 00000000 00000000 00000000
076d3410  098194a8 00000000 09819580 00000000
076d3420  00000000 076d2ea9 0075e518 00000000
076d3430  00401ecc 00000000 00000000 00000000

There was a false positive at 076d33f8, but then magic happened at 076d33d0:

0:000> da poi(01616964-38)+1
016169e8  "TAnatomyDiagramTileLoaderThread&"

Wow! Something real! Let’s dissect this a bit. Looking at the definition of TThread, we have the following data:

076d33d0  01616964  class pointer TAnatomyDiagramTileLoaderThread
076d33d4  0000091c  TThread.FHandle
076d33d8  00001850  TThread.FThreadID
076d33dc  00000100  TThread.FCreateSuspended, .FTerminated, .FSuspended, .FFreeOnTerminate (watch that endianness!)
076d33e0  00000001  TThread.FFinished
076d33e4  00000000  TThread.FReturnValue
076d33e8  00000000  TThread.FOnTerminate.Code
076d33ec  00000000  TThread.FOnTerminate.Data
076d33f0  076d33d0  TThread.FSynchronize.TThread (= Self)
076d33f4  00000000  padding
076d33f8  01616d38  TThread.FSynchronize.FMethod.Code (= last Synchronize target method)
076d33fc  076d33d0  TThread.FSynchronize.FMethod.Data (= Self)
076d3400  00000000  TThread.FSynchronize.FProcedure
076d3404  00000000  TThread.FSynchronize.FProcedure
076d3408  00000000  TThread.FFatalException
076d340c  00000000  TThread.FExternalThread

Then, into TAnatomyDiagramTileLoaderThread:

076d3410  098194a8  TAnatomyDiagramTileLoaderThread.FTiles: TThreadList

So… we can tell that the thread has exited (FFinished == 1), and verify that by looking at running threads, looking for thread id 1850:

0:000> ~
.  0  Id: 1c98.1408 Suspend: 1 Teb: 7efdd000 Unfrozen
   1  Id: 1c98.16c8 Suspend: 1 Teb: 7efda000 Unfrozen
   2  Id: 1c98.11a0 Suspend: 1 Teb: 7ee1c000 Unfrozen
   3  Id: 1c98.1bbc Suspend: 1 Teb: 7ee19000 Unfrozen
   4  Id: 1c98.10dc Suspend: 1 Teb: 7ee04000 Unfrozen
   5  Id: 1c98.12f0 Suspend: 1 Teb: 7edfb000 Unfrozen
   6  Id: 1c98.1d38 Suspend: 1 Teb: 7efaf000 Unfrozen
   7  Id: 1c98.1770 Suspend: 1 Teb: 7edec000 Unfrozen
   8  Id: 1c98.1044 Suspend: 1 Teb: 7ede3000 Unfrozen
   9  Id: 1c98.bf4 Suspend: 1 Teb: 7ede0000 Unfrozen
  10  Id: 1c98.b3c Suspend: 1 Teb: 7efd7000 Unfrozen

Furthermore, the handle is invalid:

0:000> !handle 91c
Could not duplicate handle 91c, error 6

That suggests that the object has already been destroyed. But that the TThreadList hasn’t.

And sure enough, when I looked at the destructor for TAnatomyDiagramTileLoadThread, we clear the TThreadList, but we never free it!

Now, another way we could have caught this was to turn on leak detection. But leak detection is not always perfect, especially when you get some libraries that *cough* have a lot of false positives. And of course while we could have switched on heap leak detection, that involves rebuilding and restarting the process and losing the context along the way, with no guarantee we’ll be able to reproduce it again!

While this approach does feel a little tedious, and we did have some luck in this instance with freed objects not being overwritten, and the values we were searching for being relatively unique, it does nevertheless feel pretty deterministic, which must be better than the old “try-it-and-hope” debugging technique.