Spy++ in the case of the missing message

Yesterday, I wrote about WM_QUERYENDSESSION and the app that would not shut down. In that post, I found a thread that never checked its message queue, which meant that Windows thought the app was not responding.

So why didn’t Spy++ show the sent message in its log when it was sent? Spy++ uses a SetWindowsHookEx(WH_CALLWNDPROC) hook to log messages sent to windows. It turns out that in Windows NT 4.0 and later versions, the WH_CALLWNDPROC hook is called just before the message is delivered to the window procedure. Because the thread never checked its message queue, the message was never delivered to the window procedure, and so it never reached the hook.

However, in Win9x and NT 3.51, the hook was called when the SendMessage function was called. Spy++ has its heritage in that era. Perhaps now the message delivery title in Spy++ should be changed from “sent” to “received” in order to clarify this difference.

A debug tool wish list item: MsgMon

After running through a painful message debugging session today, I realised again that at the top of my debug tool wish list is MsgMon, a “Procmon for messages”. Spy++ and Winspector Spy are just not up to the task. Once you start using Procmon, with its caching of call stacks, filtering, phenomenal performance and more, life seems very hard each time you start Spy++.

I would love to write this tool, if only I had the time… It would be nice to be able to use the existing Procmon framework as it is so solid…

Debugging an application that would not behave with WM_QUERYENDSESSION

I was recently tasked with addressing a problem with an application that would not shut down when given a WM_QUERYENDSESSION message. On the surface, this seemed easy enough to fix — just make sure that the WM_QUERYENDSESSION message was appropriately handled by all the top level windows in the application.

Step 1 was to add a handler into the main top level window that avoided the normal shutdown dialog boxes that warned of backup and similar tasks. Where possible, Microsoft advises, you should not block the shutdown of Windows with dialog boxes, etc.

That done, I tested and found that the application was still not shutting down when it was supposed to. So I fired up Spy++ and got started on watching the WM_QUERYENDSESSION chatter in the application. Here I ran into my first little catch-22. I decided that I wanted to do a real shutdown, not a simulated one, to make sure that I was testing the correct scenario.

Problem was, Spy++ was always being shut down before I could capture the messages needed in my target application. Eventually I realised that I needed to start Spy++ after my application, because Windows would send WM_QUERYENDSESSION to each application in turn, and as we know, my application would cancel the process because it wasn’t accepting a shutdown! Then I realised I could also make sure the shutdown was cancelled even when my application behaved properly, simply by loading Notepad after my application and typing a character or two into it. Then during the shutdown sequence, I would be asked if I wanted to save changes to the text document, and I could just click Cancel. Crude but effective.

With that out of the way, I ran through a Spy++ session and was able to spot an offending window that returned 0 to WM_QUERYENDSESSION. A little tracing and I found a utility window procedure that failed to call DefWindowProc. Naughty. Easily solved.

Well, so I thought. I started testing again. The application was still failing to shut down. I ran through Spy++, traced all the WM_QUERYENDSESSION messages for the process, and every single window procedure responded to the message, virtually instantly, with a nice clean 1, indicating that the window was happy to end the session. After 5 seconds, Windows would show its “Application is not responding” message, indicating that it had not received a response from one of the top level windows.

This was rather puzzling. I added a bit of logging in a DLL using SetWindowsHookEx for WH_CALLWNDPROC and WH_CALLWNDPROCRET, just in case I was missing something with Spy++. No difference, except that the output was somewhat easier to read.

So then I got thinking. This application runs quite a few background threads for loading data from a database. What if, I wondered, one of those threads had a window created but was not running a message loop? It sounded logical but I wasn’t very confident that that was the cause. Problem was, how to tell? I looked for ways to check the queue status for a thread in WinDbg but could not find an easy solution. Eventually I decided to add some debugging to the common ThreadProc wrapper for the application: it called GetQueueStatus(QS_SENDMESSAGE) as each thread finished and logged the response. I re-ran the application, and bingo, there was a thread closing down with a message in its queue. No prizes if you guess what that message was!
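The check described above can be sketched roughly as follows. This is only an illustration: the original debugging code is not shown in this post and was presumably native code, so Python with ctypes is a stand-in here, and the names decode_queue_status, has_pending_sent_message and log_queue_status are hypothetical. GetQueueStatus returns a DWORD whose high word describes what is currently in the calling thread's queue and whose low word describes what has arrived since the last check.

```python
import ctypes
import sys

QS_SENDMESSAGE = 0x0040  # a message sent by another thread is waiting


def decode_queue_status(status):
    """Split the DWORD from GetQueueStatus into (in_queue, newly_added) flag words."""
    in_queue = (status >> 16) & 0xFFFF   # high word: types currently in the queue
    newly_added = status & 0xFFFF        # low word: types added since the last check
    return in_queue, newly_added


def has_pending_sent_message(status):
    """True if the status says a sent message is still waiting to be processed."""
    in_queue, _ = decode_queue_status(status)
    return bool(in_queue & QS_SENDMESSAGE)


def log_queue_status(thread_name):
    """Hypothetical helper: call just before a worker thread exits (Windows only)."""
    if sys.platform != "win32":
        raise OSError("this sketch needs the Win32 API")
    user32 = ctypes.WinDLL("user32")
    user32.GetQueueStatus.argtypes = [ctypes.c_uint]
    user32.GetQueueStatus.restype = ctypes.c_ulong
    # GetQueueStatus reports on the queue of the thread that calls it.
    status = user32.GetQueueStatus(QS_SENDMESSAGE)
    pending = has_pending_sent_message(status)
    print("%s: queue status 0x%08X, sent message pending: %s"
          % (thread_name, status, pending))
    return pending
```

A thread that logs a pending sent message on its way out is a good candidate for the one that never answered WM_QUERYENDSESSION.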

After that, it was simple. Reviewing the thread’s main function I quickly discovered a call to WaitForSingleObject. The thread did not appear to create any windows in any of our code, but it did use COM in its database connectivity, which meant that a helper window was created behind the scenes. I changed the WaitForSingleObject to a MsgWaitForMultipleObjects and, whenever it reported that messages were waiting, processed them with a PeekMessage loop.
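A minimal sketch of that kind of wait loop is below. It is not the application's actual code (which is not shown here and was presumably native); Python with ctypes is used purely for illustration, and wait_pumping_messages is a hypothetical name. The Win32 functions and constants are the real ones: MsgWaitForMultipleObjects returns WAIT_OBJECT_0 + nCount when input is available rather than when a handle is signalled.

```python
import ctypes
import sys

WAIT_OBJECT_0 = 0x00000000
WAIT_TIMEOUT = 0x00000102
INFINITE = 0xFFFFFFFF
QS_ALLINPUT = 0x04FF   # includes QS_SENDMESSAGE (0x0040)
PM_REMOVE = 0x0001


class POINT(ctypes.Structure):
    _fields_ = [("x", ctypes.c_long), ("y", ctypes.c_long)]


class MSG(ctypes.Structure):
    _fields_ = [
        ("hwnd", ctypes.c_void_p),
        ("message", ctypes.c_uint),
        ("wParam", ctypes.c_size_t),
        ("lParam", ctypes.c_ssize_t),
        ("time", ctypes.c_ulong),
        ("pt", POINT),
    ]


def wait_pumping_messages(handle, timeout_ms=INFINITE):
    """Wait for `handle` while still servicing this thread's message queue.

    Returns True if the handle was signalled, False on timeout.
    Note: in this simple sketch the timeout restarts after each batch of messages.
    """
    if sys.platform != "win32":
        raise OSError("this sketch needs the Win32 API")
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    user32.MsgWaitForMultipleObjects.argtypes = [
        ctypes.c_ulong, ctypes.POINTER(ctypes.c_void_p),
        ctypes.c_int, ctypes.c_ulong, ctypes.c_ulong]
    user32.MsgWaitForMultipleObjects.restype = ctypes.c_ulong
    user32.PeekMessageW.argtypes = [
        ctypes.POINTER(MSG), ctypes.c_void_p,
        ctypes.c_uint, ctypes.c_uint, ctypes.c_uint]
    user32.PeekMessageW.restype = ctypes.c_int
    user32.TranslateMessage.argtypes = [ctypes.POINTER(MSG)]
    user32.DispatchMessageW.argtypes = [ctypes.POINTER(MSG)]
    user32.DispatchMessageW.restype = ctypes.c_ssize_t

    handles = (ctypes.c_void_p * 1)(handle)
    msg = MSG()
    while True:
        result = user32.MsgWaitForMultipleObjects(
            1, handles, False, timeout_ms, QS_ALLINPUT)
        if result == WAIT_OBJECT_0:
            return True                      # the object we were waiting for
        if result == WAIT_OBJECT_0 + 1:
            # Messages are waiting. PeekMessage also delivers pending sent
            # messages (such as WM_QUERYENDSESSION) to hidden helper windows.
            while user32.PeekMessageW(ctypes.byref(msg), None, 0, 0, PM_REMOVE):
                user32.TranslateMessage(ctypes.byref(msg))
                user32.DispatchMessageW(ctypes.byref(msg))
            continue
        if result == WAIT_TIMEOUT:
            return False
        raise ctypes.WinError(ctypes.get_last_error())
```

The point of the design is that the thread never blocks in a way that ignores its queue, so a window it does not know it owns can still respond.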

And then it finally worked (and I forgot to start Notepad, so Windows restarted…) But it really stumped me for a bit today.

Feeling wasteful?

If you have the urge to be wasteful, here’s something fun you can do that may help.

Start Visual Studio’s “Create GUID” utility (Tools|Create GUID) and click New Guid a few hundred times.

As you wantonly create hordes of GUIDs that, unloved, instantly disappear into the ether, never to be seen again, you should take the opportunity to reflect on this task, being one of life’s more fruitless activities, and soon you will be feeling much better.

If this still doesn’t help, write a little program that calls CoCreateGuid, hundreds, millions, or even billions of times. You can even let the program run by itself, unmonitored and unchecked, gleefully consuming this precious resource, while you read War and Peace in its entirety. If that doesn’t fix it for you, nothing will.
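For the impatient, a minimal sketch of such a program follows. It is a stand-in, not the real thing: it uses Python's uuid.uuid4() rather than calling CoCreateGuid, and waste_guids is a made-up name.

```python
import uuid


def waste_guids(count):
    """Create `count` random GUIDs and forget each one immediately.

    Returns the last GUID created (or None if count is 0), purely so there
    is something to look at before it, too, is discarded.
    """
    last = None
    for _ in range(count):
        last = uuid.uuid4()  # the previous GUID is now gone for good
    return last


if __name__ == "__main__":
    print("Last of 1,000,000 wasted GUIDs:", waste_guids(1_000_000))
```

Raise the count to taste.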

PHP security updates are like malaria treatments

Applying PHP security updates is somewhat like taking a malaria treatment: they are, temporarily, worse than the disease itself. Let me explain.

Malaria is not a nice disease. I have had malaria a couple of times. We treated it with chloroquine (this was a few years ago). The treatment dose of chloroquine makes you feel worse than the malaria itself. But then you get better.

A PHP security hole is obviously a big issue for your average PHP site. I have had to apply patches to address these holes numerous times in the last few years. Unfortunately, it seems that each patch version of PHP either introduces new bugs or changes the published API. This causes all sorts of chaos and panic when the upgrade goes through, and lots of scrambling to fix a site that no longer works correctly. Sometimes you may not find a problem in an infrequently used area of the site for several weeks, so running a test server does not address this (besides, who wants to leave a known security hole online for several weeks?)

For example, PHP 5.2.7 was released to address a number of bugs and security holes but then was removed from distribution 3 days later because of an introduced bug changing the behaviour of magic quotes. That didn’t affect me because I did not use magic quotes… (Magic quotes were a majorly broken silly idea in the first place, but even worse is making it a configurable option so any code that I write has to test the setting… But let’s not get distracted.)

Or, to take an even more serious example, the return value of the strtotime function changed in 5.1.0. As of 5.1.0, when strtotime is passed an invalid date, it returns FALSE instead of -1. This change was made without notice, and as far as I can tell, without any reference whatsoever in the huge changelog or even in bugs referenced in the changelog. Returning FALSE would have been the better choice in the first place, but once -1 had shipped, this type of breaking change should never have been made. I shouldn’t have to review all the changes to the PHP documentation, and then audit all 150,000+ lines of PHP code each time we update PHP!

Those are just two of the more obvious examples of the horrible PHP upgrade situation. Every time I have to upgrade, I just hold my breath and hope that no one has made any more silly breaking changes.

Why on earth do Network Solutions use so many domains?

I receive (too) many emails from Network Solutions about the various domains I own. Now, before you ask, these are not phishing emails — I have received those too — and I have checked out each message carefully.

In each message from Network Solutions they seem to have created yet another new domain name. It seems that they just can’t help themselves: “hey we’re a registrar, let’s go register another random domain name and tell our customers to use it!”

Here’s a list of a few of the domains just from their recent messages:

  1. www.networksolutions.com (of course) – and various subdomains, ok, I can cope with that!
  2. www.networksolutionspassword.info (how dodgy does that sound to you?)
  3. www.mysolutionspot.com (some marketing guff I guess)
  4. www.domainnamedate.com (why is this domain needed?)
  5. www.networksolutionsretail.com (why not retail.networksolutions.com?)

Now I recently received a message from them warning me about phishing messages. The basic test for a phishing message is to ask whether the domain names referenced in the message are legitimate — and how can we tell? Network Solutions use so many names that it’s just not possible to tell without a lot of work and even some danger.

So what’s the answer? Ignore all their silly domain names and just visit www.networksolutions.com… or transfer to another registrar.
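To make the problem concrete, here is a minimal sketch of the obvious mechanical test: is a link's host the registrar's main domain or one of its subdomains? The function name is_under_domain is hypothetical, and the check says nothing about whether a separately registered domain is legitimate, which is exactly the difficulty.

```python
from urllib.parse import urlsplit


def is_under_domain(url_or_host, trusted_domain):
    """True if the host is `trusted_domain` itself or one of its subdomains."""
    # Accept bare host names as well as full URLs.
    candidate = url_or_host if "//" in url_or_host else "//" + url_or_host
    host = (urlsplit(candidate).hostname or "").rstrip(".").lower()
    trusted = trusted_domain.rstrip(".").lower()
    # Compare on a label boundary so "networksolutionsretail.com" does not
    # pass as "networksolutions.com".
    return host == trusted or host.endswith("." + trusted)


if __name__ == "__main__":
    for name in ("www.networksolutions.com",
                 "www.networksolutionspassword.info",
                 "www.mysolutionspot.com",
                 "www.domainnamedate.com",
                 "www.networksolutionsretail.com"):
        print(name, is_under_domain(name, "networksolutions.com"))
```

Only the first entry in the list above passes this test, so a careful customer has no simple way to tell the other four apart from a phishing attempt.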

How to connect to a Netgear DG834G router in Windows 7

While trying to connect my Windows 7 beta AspireOne to our wireless network (with a Netgear DG834G router), I kept receiving an unspecified error: “Windows failed to connect”. After playing with a multitude of settings, including security, access control, manual configuration and more, I discovered that changing the wireless mode on the router from “g only” to “g & b” solved all my problems…

After that I went back and turned on all the security settings on the router again!