Category Archives: Bugs

Creating and using an Internationalized Domain Name in July 2024

We needed to setup a web host for the domain convert.ភាសាខ្មែរ.com for an upcoming project. It should have been simple, but this turned into quite the journey.

A very brief background on IDNs

Some background. Notice the use of Khmer characters: ភាសាខ្មែរ /pʰiesaa kmae/ or “Khmer Language”. ភាសាខ្មែរ.com is a domain I’ve owned for some time, having purchased it when I was learning Khmer. Because this domain names uses letters not found in the English alphabet, it is considered to be an Internationalized Domain Name (or IDN). Now, behind the scenes, ភាសាខ្មែរ is not stored using Khmer characters in a Domain Name System (DNS) record. For reasons of history, domain names are restricted to a handful of Latin script letters, digits, and hyphen, and so the domain must be re-encoded using a system known as ‘punycode‘ (truly!)

The punycode representation of ភាសាខ្មែរ is the rather bamboozling xn--j2e7beiw1lb2hqg.

Now, most of the time, this gory detail will be hidden from you in the user interface of your browser — but if you copy the URL from the address bar and paste it into a text editor, you’ll be presented with the gory details! (The domain punycoder.com allows you to easily play with other scripts and non-English-alphabet names and see how they are punycoded.)

We needed a backend host with Python for the project, which ruled out a lot of free options — particularly since we wanted to bind to this custom domain name. As I had access to an Azure subscription already, I went that way.

So first, I attempted to create the domain quickly in the Azure Portal web front end. I had no trouble creating https://khconvert.azurewebsites.net/ — took just a few seconds. Then I had to bind our domain convert.ភាសាខ្មែរ.com to this azure hostname. But computer said no:

The name convert.xn--j2e7beiw1lb2hqg.com is not valid.

Fine, perhaps the UI needed the domain name as a Unicode string.

Hmm. Blocked again.

Now, often the Azure Portal does not allow you to do things which can be done via their APIs or with their az CLI tool. So I tried again with az.

PS> az webapp config hostname add --hostname convert.xn--j2e7beiw1lb2hqg.com --webapp-name khconvert --resource-group <REDACTED> --verbose
The name convert.xn--j2e7beiw1lb2hqg.com is not valid.
Command ran in 4.228 seconds (init: 0.493, invoke: 3.736)

Okay then … perhaps I have to use the non-punycode version of the hostname? After all, that error in the Azure Portal was purely in the UI — it didn’t even call into the API.

And it worked!

Side track: the old Powershell console did not do well with rendering Khmer in its default settings…

So it looks like those were not question marks? Interestingly, I can paste Khmer characters into the console, but I can’t copy them back out from the command — those letters copy out as question marks. Although as you can see from a screenshot and corresponding text later, I can copy the response and show it correctly. (Admittedly, I was using the old Powershell console, not Terminal, which probably would have none of these problems.)

But it worked, hurrah! We are done!

Well, not quite. We need a TLS certificate for the site. Fortunately, Azure supports creating certificates automatically for web apps.

But not for IDNs.

And not even the CLI saved me this time.

PS> az webapp config ssl create --hostname 'convert.?????????.com' --name khconvert -g <REDACTED> --verbose
This command is in preview and under development. Reference and support levels: https://aka.ms/CLI_refstatus
Operation returned an invalid status 'Bad Request'
Content: {"Code":"BadRequest","Message":"Properties.CanonicalName is invalid.  Canonical name convert.ភាសាខ្មែរ.com includes at least one special character. Only letters and numbers are allowed.","Target":null,"Details":[{"Message":"Properties.CanonicalName is invalid.  Canonical name convert.ភាសាខ្មែរ.com includes at least one special character. Only letters and numbers are allowed."},{"Code":"BadRequest"},{"ErrorEntity":{"ExtendedCode":"51021","MessageTemplate":"{0} is invalid.  {1}","Parameters":["Properties.CanonicalName","Canonical name convert.ភាសាខ្មែរ.com includes at least one special character. Only letters and numbers are allowed."],"Code":"BadRequest","Message":"Properties.CanonicalName is invalid.  Canonical name convert.ភាសាខ្មែរ.com includes at least one special character. Only letters and numbers are allowed."}}],"Innererror":null}

At this point, I threw my hands in the air, put the domain behind a Cloudflare proxy and used their SSL certificate, and it just worked.

But it is no wonder there are so few IDNs. The experience is still frightful. The errors are weird. We have a long, long way to go.

(Oh, and do take a look at ភាសាខ្មែរ.com and convert.ភាសាខ្មែរ.com: the problems they are playing with have quite a lot to do with IDNs too!)

location.hash can be re-entrant on Safari Mobile

So Josh and I were observing some really strange behaviour in the Keyman for iOS beta. When typing rapidly on the touch keyboard, we would sometimes get the wrong character emitted. We could not see anything immediately wrong in the code. So Josh added some logging. Then things got really weird: we would get the start of a touch event, then before the touch event handler finished, we’d get logging that indicated another touch event was received.

Whoa. In JavaScript, that shouldn’t be possible, right?

Each message is processed completely before any other message is processed. This offers some nice properties when reasoning about your program, including the fact that whenever a function runs, it cannot be pre-empted and will run entirely before any other code runs (and can modify data the function manipulates). This differs from C, for instance, where if a function runs in a thread, it can be stopped at any point to run some other code in another thread.

— https://developer.mozilla.org/en-US/docs/Web/JavaScript/EventLoop

After we dug in further, we found that it was indeed possible for a new touch event (and perhaps others) to be received when location.hash was set from the JavaScript code, in Safari for iOS.

Here’s a fairly minimal repro. Load this up on your iPhone, and start rapidly touching the Whack div. You’ll probably need to use two fingers repeatedly to trigger the event (I can usually get it to happen with about 50-100 rapid touches). When it happens, you’ll get a log message with a call stack showing how the whackIt() function has apparently called itself!

<!doctype html>
<html>
  <head>
    <meta charset='utf8'>
    <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no" /> 
    <title>location.hash re-entrancy</title>
    <style>
      #whack { font-size: 64pt }
      #log, #stack { font-family: courier }
    </style>
  </head>
  <body>
    <div id=whack>Whack</div>
    
    <div id=log></div>
    
    <div id=stack></div>
    
    <script>
    
      var tick = 0, inWhack = false;
      
      function whackIt(e) {
        e.preventDefault();

        tick++;
        
        if(inWhack) {
          stack.innerHTML = 'Re-entrant event: '+(new Error()).stack;
          return false;
        }
        
        var localTick = tick;
        log.innerHTML = tick;

        inWhack = true;

        // Setting location.hash seemingly can cause the 
        // event queue to be polled, resulting in a 
        // re-entrant touch event
        location.hash = '#'+localTick;

        inWhack = false;
        
        if(localTick != tick) {
          // alert('We had a re-entrant event');
        }
        
        return false;
      }
      
      whack.addEventListener('touchstart', whackIt, false);
      whack.addEventListener('touchend', whackIt, false);
      
    </script>
  </body>
</html>

We’ll be reporting this to Apple… Just a heads up.

The Case of the Overly Busy Process

The other day, I was running a routine Process Monitor (Procmon) trace to debug an issue in Keyman, when I noticed something strange: over 50% of the events displayed with the default filter (which excludes a lot of system-level noise and procmon-related feedback) were coming from a single process: services.exe.

You can see in the image below I’ve added services.exe to the filter (Process Name is services.exe), and then the status bar shows 52% of events belonging to it.

Puzzled, I set aside some time to dig a little further (which means I went to bed late one evening). Watching Process Explorer, I could see that services.exe and wmiprvse.exe were between them consuming about 10% of my CPU. This did not seem normal. Nor did it seem to be a good thing for my battery life.

Deciding to examine the trace a little, I filtered out common registry keys and events, such as RegCloseKey, which made it easier to spot a pattern. It became obvious that every 5 seconds, services.exe, with the help of wmiprvse.exe, would enumerate the list of services from the registry, sending about 120,000 events to the Procmon trace in the process. Nearly 80% of the events captured each minute by Procmon were generated by either services.exe or wmiprvse.exe!

Nearly 80% of the events captured each minute by Procmon were generated by either services.exe or wmiprvse.exe!

Given that wmiprvse.exe, the Windows Management Instrumentation (WMI) provider host, was involved, it seemed likely that there was a process issuing WMI queries against the Services provider, such as you can do with PowerShell:

Get-WmiObject Win32_Service | Format-Table Name, DisplayName, State, StartMode, StartName

It was just a matter of figuring out which one.

I started off by trying to dig into WMI logging. I don’t know if you’ve ever dug into that, but it’s huge, complex and somewhat impenetrable. It is likely that with the right knowledge I could have issued a command that gave me a list of queries being issued and who was issuing them. But I have not yet acquired that knowledge, sadly, and late at night my brain did not feel up to the attempt.

It seemed easier to instead to use a process of elimination of processes (yeah, I did that on purpose). I started the CPU monitor in Process Explorer for the services.exe process, which showed lovely 5 second spikes.

Then I started to stop various services, watching to see if the spiking stopped. It didn’t. Once I was down to a handful of critical services (do I really need to run the Firewall service?) I started looking at background user-level processes, such as the icons sitting in the System Notification Area.

And here I hit gold. After shutting down a few, including my own programs, with no noticeable change, I shutdown MySQL Notifier 1.1.7.

All of a sudden, CPU activity dropped to zero on the services.exe process, and the next Procmon trace showed a mere 85 events in a minute for the services.exe and wmiprvse.exe pair.

Success!

I checked the MySQL Notifier forums and saw no discussion of this issue, but I found a closed bug report in the bug database. I’ll have to add my comment to the bug report.

Once again, Procmon comes to the rescue 🙂 I’m looking forward to the increased battery life already!

I know it’s not the most elegant way to debug a problem, but sometimes it is quicker and easier than the alternatives. It’s especially easy to use process of elimination like this late at night, without having to think hard about it. 😉

Working around Delphi’s default grid scrolling behaviour

Delphi’s T*Grid components have an annoying little feature whereby they will scroll the cell into view if you click on a partially visible cell at the right or the bottom of the window. Then, this couples with a timer that causes the scroll to continue as long as the mouse button is held down and the cell it is over is partially visible. This typically means that if a user clicks on a partially visible cell, they end up selecting a cell several rows or columns away from where they intended to click.

In my view, this is a bug that should be fixed in Delphi. I’m not the only person who thinks this. I’ve reported it to Embarcadero at RSP-18542.

In the meantime, here’s a little unit that works around the issue.

{
  Stop scroll on mousedown on bottom row of grid when bottom row
  is a partial cell: have to block both initial scroll and timer-
  based scroll.

  This code is pretty dependent on the implementation in Vcl.Grids.pas,
  so it should be checked if we upgrade to new version of Delphi.
}

{$IFNDEF VER320}
{$MESSAGE ERROR 'Check that this fix is still applicable for a new version of Delphi. Checked against Delphi 10.2' }
{$ENDIF}

unit ScrollFixedStringGrid;

interface

uses
  System.Classes,
  Vcl.Controls,
  Vcl.Grids,
  Winapi.Windows;

type
  TScrollFixedStringGrid = class(TStringGrid)
  private
    TimerStarted: Boolean;
    HackedMousedown: Boolean;
  protected
    procedure MouseDown(Button: TMouseButton; Shift: TShiftState; X: Integer;
      Y: Integer); override;
    procedure MouseMove(Shift: TShiftState; X: Integer; Y: Integer); override;
    function SelectCell(ACol, ARow: Longint): Boolean; override;
  end;

implementation

{ TScrollFixedStringGrid }

procedure TScrollFixedStringGrid.MouseDown(Button: TMouseButton;
  Shift: TShiftState; X, Y: Integer);
begin
  // When we first mouse-down, we know the grid has
  // no active scroll timer
  TimerStarted := False;

  // Call the inherited event, blocking the default MoveCurrent
  // behaviour that scrolls the cell into view
  HackedMouseDown := True;
  try
    inherited;
  finally
    HackedMouseDown := False;
  end;

  // Cancel scrolling timer started by the mousedown event for selecting
  if FGridState = gsSelecting then
    KillTimer(Handle, 1);
end;

procedure TScrollFixedStringGrid.MouseMove(Shift: TShiftState; X, Y: Integer);
begin
  // Start the scroll timer if we are selecting and mouse
  // button is down, on our first movement with mouse down
  if not TimerStarted and (FGridState = gsSelecting) then
  begin
    SetTimer(Handle, 1, 60, nil);
    TimerStarted := True;
  end;
  inherited;
end;


function TScrollFixedStringGrid.SelectCell(ACol, ARow: Longint): Boolean;
begin
  Result := inherited;
  if Result and HackedMousedown then
  begin
    // MoveColRow calls MoveCurrent, which
    // calls SelectCell. If SelectCell returns False, then
    // movement is blocked. But we fake it by re-calling with Show=False
    // to get the behaviour we want
    HackedMouseDown := False;
    try
      MoveColRow(ACol, ARow, True, False);
    finally
      HackedMouseDown := True;
    end;
    Result := False;
  end;
end;

end.

 

When ញ៉ាំ meets ញ៉ំា

The Khmer script was added to the Unicode standard in September 1999. Today, nearly 18 years later, operating system renderers still get it wrong.

This is a quick post to document the difference in how several Khmer words are wrongly rendered on different current operating systems. I ran these tests on Windows 10 (10.0.14393), Android 7.1.1 Nougat, iOS 10.2.1, Mac OS X Sierra <> and Ubuntu 16.04 LTS with Firefox 47. The good news is that Windows 10 and Ubuntu passed all the tests (bar a font style issue with Leelawadee UI). Android passed nearly everything, except the bad encoding test.

Now, admittedly, the rules around triisap (U+17CA) and muusikatoan (U+17C9) are very complex. The Unicode standard description covers most of the difficulties, but not all of them.

Muusiaktoan is also sometimes called ធ្មេញកណ្ដុរ /tmɨɲ kɑndao/ – rat’s teeth, which is a fun name.

On to the words. In every case, the DauhPenh rendering is correct.

ញ៉ាំ /ɲam/ To eat

U+1789 U+17C9 U+17B6 U+17C6

ស៊ី /sii/ To eat (for young)

U+179F U+17CA U+17B8

As of Mac OS X Sierra, /sii/ now displays correctly. But contrast with /ʔum/, /ʔom/ below.

អ៊ំ /ʔum/, /ʔom/ Uncle, aunt

U+17A2 U+17CA U+17C6

Note how Leelawadee UI renders this wrongly; but that is a font rather than a renderer bug.

ប៊ី /bii/ A type of egg roll

U+1794 U+17CA U+17B8

ប៉ី /pəy/ A type of wind instrument

U+1794 U+17C9 U+17B8

As of Mac OS X Sierra, /pəy/ now displays correctly. But contrast with /bii/ above!

Yum yum yum

ញ៉ាំ /ɲam/ To eat

I’d like to pull out the word ញ៉ាំ for further analysis. Every operating system has some trouble with this word, because it could be encoded in several different ways. The correct way works on everything except iOS and Mac OS X. The incorrect encodings should really display wrongly, but none of the renderers complain about both invalid forms!

Correct order (ញ៉ាំ)

U+1789 U+17C9 U+17B6 U+17C6

Incorrect order (ញ៉ំា)

U+1789 U+17C9 U+17C6 U+17B6

Incorrect vowel (ញុំា)

U+1789 U+17BB U+17C6 U+17B6

In this instance, The DauhPenh rendering is appropriate for the first and second lines; the Apple rendering is ironically most appropriate for the third line!

Many thanks to Makara for his suggestion on the second incorrect rendering; I updated this post shortly after initial posting to include the extra example. There are other possible letter orders which may or may not display “correctly”; I will leave finding these as an exercise for the reader.

ZWNJ FTW

Here’s one I’ll examine in detail another time. Some words can be written in two different ways, neither really incorrect. The Unicode standard caters for these by allowing for insertion of a Zero Width Non Joiner (U+200C) to force the superscripted form of triisap (៊) or muusikatoan (៉). Windows 10’s Leelawadee UI font gets this one wrong (but its DauhPenh font doesn’t).

អ‌៊ី or អ៊ី /ʔii/ An exclamation of surprise

U+17A2 U+17CA U+17B8
ZWNJ
U+17A2 U+200C U+17CA U+17B8

Note: table ZWNJ character order corrected as per comment by Olivier Berten.

Don’t forget to navigate to about:blank when embedding IWebBrowser2

Today I spent several hours trying to figure out why an embedded web browser component (in this case TEmbeddedWB) in a Delphi test app never received the appropriate IHttpSecurity and IWindowForBindingUI QueryService requests.

I was doing this in order to provide more nuanced handling of self-signed certificates in an intranet context. We all do this, right? Here the term “nuanced” means “Of course I trust self signed certificates on my intranet, don’t you?” Feel free to rant and rave on this. 😉

But no matter what I did, what incantations I tried, or what StackOverflow posts I perused, I was unable to find an answer. Until finally I stumbled on a side comment in a thread from 2010. Igor Tandetnik notes that:

Right after creating the control, navigate it to about:blank. Right after that, navigate it to the page you wanted to go to. It’s a known problem that IServiceProvider doesn’t work for the very first navigation.

And this was something that I kinda knew in the back of my head, but of course had forgotten. Thank you Igor.

This post would not be complete without some splendiferous code. Just for reference, it’s so simple if you don’t blank out and forget about:blank.

unit InsecureBrowser;

interface

uses
  Winapi.Windows,
  Winapi.Messages,
  Winapi.Urlmon,
  Winapi.WinInet,
  System.SysUtils,
  System.Variants,
  System.Classes,
  Vcl.Graphics,
  Vcl.Controls,
  Vcl.Forms,
  Vcl.Dialogs,
  Vcl.OleCtrls,
  Vcl.StdCtrls,
  SHDocVw_EWB,
  EwbCore,
  EmbeddedWB;

type
  TInsecureBrowserForm = class(TForm, IHttpSecurity, IWindowForBindingUI)
    web: TEmbeddedWB;
    cmdGoInsecure: TButton;
    procedure webQueryService(Sender: TObject; const [Ref] rsid,
      iid: TGUID; var Obj: IInterface);
    procedure FormCreate(Sender: TObject);
    procedure cmdGoInsecureClick(Sender: TObject);
  private
    { IWindowForBindingUI }
    function GetWindow(const guidReason: TGUID; out hwnd): HRESULT; stdcall;

    { IHttpSecurity }
    function OnSecurityProblem(dwProblem: Cardinal): HRESULT; stdcall;
  end;

var
  InsecureBrowserForm: TInsecureBrowserForm;

implementation

{$R *.dfm}

function TInsecureBrowserForm.GetWindow(const guidReason: TGUID;
  out hwnd): HRESULT;
begin
  Result := S_FALSE;
end;

function TInsecureBrowserForm.OnSecurityProblem(dwProblem: Cardinal): HRESULT;
begin
  if (dwProblem = ERROR_INTERNET_INVALID_CA) or
     (dwProblem = ERROR_INTERNET_SEC_CERT_CN_INVALID)
    then Result := S_OK
    else Result := E_ABORT;
end;

procedure TInsecureBrowserForm.webQueryService(Sender: TObject;
  const [Ref] rsid, iid: TGUID; var Obj: IInterface);
begin
  if IsEqualGUID(IID_IWindowForBindingUI, iid) then
    Obj := Self as IWindowForBindingUI
  else if IsEqualGUID(IID_IHttpSecurity, iid) then
    Obj := Self as IHttpSecurity;
end;

procedure TInsecureBrowserForm.cmdGoInsecureClick(Sender: TObject);
begin
  web.Navigate('https://evil.intranet.site/');
end;

procedure TInsecureBrowserForm.FormCreate(Sender: TObject);
begin
  web.Navigate('about:blank');
end;

end.

 

My favourite debugging story

This was some years ago, when I was living in Vientiane, the capital of the Lao Peoples’ Democratic Republic. It was 1994 or thereabouts – just prior to the release of Windows 95. I had written a piece of software called “Keyman” which was being increasingly used to type in Lao in Windows 3.1, overloading characters in the 128-255 range of the standard US English character set at the time (I don’t want to be too technical here). Before Unicode.

The owner of a local computer store had had reports of an issue with Keyman from one of their clients in the provincial capital of Savannakhet, about a one hour flight from Vientiane in a small plane. The technical minutiae of the problem escape me now, but it was something to do with a certain set of keystrokes which gave the wrong output in some applications – I think Excel. The report had been communicated over the telephone to the computer shop technical staff, and then translated into English for my benefit – as my Lao was probably not good enough to really get the detail. So as you can imagine, Chinese Whispers is a good way to describe the final report I received.

I tried to diagnose the problem from the description, and tried to reproduce it on my computer, but could not figure it out.

Now it is important to remember that Laos in 1994 was still pretty much unknown to the outside world. There were few tourists; it was (and is) a communist country, at least in principle. Things there didn’t work quite the same as in Australia. There was no Internet access in Laos at that time, telephones were unreliable and the use of modems was technically illegal. This meant that remote diagnosis was only possible by means of telephoned conversations over noisy phone lines, by fax, or with posted letters. The post often took weeks, even in-country.

So after a few days of fruitless telephoning back and forth, the owner of the computer shop suggested I accompany him on a trip down to Savannakhet. (From memory, he was already planning to visit). I was but a callow youth, of 17, and so this was a fantastic opportunity!

When we met up at the airport, the first thing I remember was standing in the security line behind a Lao businessman who caused a bit of a ruckus at the hand luggage screening, because his briefcase had two pistols in it. This seemed a little unusual, even in Laos. After some discussion, his pistols were removed from his hand luggage and given to a guard, who told him he could not take them on the plane because there was not a separate luggage hold. I don’t know what happened to them after that as we were ushered through the security.

When our plane started boarding, a second problem arose. It appeared that the flight had been overbooked, or perhaps they’d substituted a smaller aircraft. The plane we could see was a Xian Y-7, a Chinese clone of a Russian Antonov An-24. The Wikipedia page linked above shows a picture, coincidentally enough of a Lao Aviation plane, perhaps the very plane we were to fly on (they only had 4).

But, as I said, the plane was overbooked, and we ended up in the group of about 10 that didn’t get onto the plane. Pistol-man was in the group that boarded the Y-7, and we didn’t see him again.

Fortunately for us, Lao Aviation had a solution to the problem. They simply rolled a second plane out of the hanger, a Harbin Y-12 this time, and fired it up.

Well, they tried to fire it up. It coughed and spluttered, and lots of black smoke poured out of the engines, but it didn’t start. Boh pen nyang. They pushed it back into the hanger, and rolled out yet another Y-12.

At this point I was feeling a little nervous.

The thired plane coughed and spluttered, poured out lots of black smoke, but it started! After a minute, they shut off the engines and asked us to board.

You can see in the picture below how part of the engine cowling is painted black. You can also see, if you look closely, how there are black smudge marks around that black painted area. Yeah. Smoke. I guess that the smoke mustn’t be a big problem, but it wasn’t inspiring at the time.

Lao_Aviation_Harbin_Y-12_Sibille-1

https://upload.wikimedia.org/wikipedia/commons/4/4b/Lao_Aviation_Harbin_Y-12_Sibille-1.jpg

Image © 2000 Regis Sibille, used under CCSA. (Another picture: http://www.airliners.net/photo/Iran—Revolutionary/Harbin-Y12-II/1503896/L/)

We rolled out and took off moments after the larger first plane. For a while, we could see the larger plane ahead and slightly above us – I don’t know why they didn’t go straight up to cruising altitude as the Y-7 is a lot faster than the Y-12. But eventually the Y-7 was out of sight. The scenery was in places spectacular. As I recall, the plane stayed in Lao airspace for the whole flight, despite this making the flight significantly longer.

Arriving in Savannakhet, we first travelled to the house of a friend of the computer store owner. This man happened to be one of the richest men in southern Laos. He had a beautiful house, filled with beautifully carved tables, paintings and collected antiques. After a brief meeting there, we were escorted by this man to a café in the city for a coffee. Well, some of us drank coffee; I didn’t. I was but 17 and at that age drank far too much Pepsi.

At this sidewalk café, an interesting encounter occurred, which has stuck with me. A street sweeper stopped and ordered a drink, and sat at the same table as this very wealthy man, and they struck up conversation. For some time, all at the table talked. The friendly interactions between two very different social classes was remarkable to me at the time – especially coming from Thailand where the social strata were clearly delineated.

Finally, social requirements met, we made our way to the computer with the problem. It was about 3 or 4 flights up dusty stairs in one of the tallest buildings in the city. There was a lift, but use of it was definitely not recommended. The problem was demonstrated to me, and I was able to observe there what had flummoxed me from afar. After just a few minutes, I realised what the problem was, and had enough information to fix it. I didn’t have my laptop with me so I had to write down some notes – and then we left them, a little sad that we couldn’t fix the problem immediately.

What was the problem? I actually don’t remember the detail. I just remember how we got there and back again!

We took a ferry across the river to Thailand, and took a bus to Nakhon Phanom. There were two reasons: first, my friend the computer shop owner wanted to visit some relatives there, and second, the roads in Laos were at the time in such poor condition that travel on them was best avoided if an alternative was available. Thai roads were busy but generally in excellent condition.

Once in Nakhon Phanom, it was a short motorcycle taxi ride to the relatives’ house by the river. We stopped there for a couple of hours and drank tea with them (yes, even I, Mr Pepsi Boy drank tea).

But when we tootled back to the bus stop, we found our bus had left a few minutes earlier than we had planned!

This was not a big problem. A bystander offered to chase down the bus in his pickup truck. This pickup was sparkling clean, had bright alloy wheels with a thin slick of rubber spray painted on, and a lowered chassis, so much so that the wheels would have been superfluous if they hadn’t been required for their locomotive capability.

pickup

It wasn’t this car, but it could well have been its older brother. Now in Thailand, the buses moved fast. Especially once they got out onto the open road. I’ve been on Thai buses doing 150 km/h or more.

So our intrepid young driver started chasing down the bus, flying down the city streets at over double the speed limit, braking hard for corners and easily avoiding the (fortunately) light traffic, and once he got out onto the open road he was eager to show us what his car would do. So here we were, flying down a Thai highway in a stranger’s car doing a ridiculous speed, chasing down a bus driver that didn’t know we existed. I’m pretty sure my parents would not have been pleased.

At the speed we were doing, we caught up to the bus pretty easily. After some vigorous flashing of the headlights, the bus’s left indicator came on, and it slowed and finally stopped. My friend paid the pickup race driver a token of our appreciation, and with slightly wobbly legs we climbed onto the bus, and off we went to Nong Khai.

We arrived in Nong Khai well after midnight, and ended up at a noisy Thai transit hotel, in a room without windows a couple of floors above the nightclub. One bed: a queen bed.

Now, remember, I was a callow 17 year old youth. The idea of sharing a bed with any man was pretty terrifying. But my friend, who I guess was in his 40s at that stage, kindly noted my abject and unwarranted fear, and gave me the whole bed – I think he sat in the chair and dozed. Not fair, I know.

At about 4am the nightclub finally quietened down, and at about 6am we were awoken by the daily noise that accompanies the start of every day – car doors slamming, shouts, trucks reversing. So we gave up on sleep, got up and made our way back across the border to Vientiane.

Upon returning to Vientiane, I fixed the problem in the Keyman code in a few minutes, prepared a new version on a floppy disk, and rode my bicycle over to the computer shop to deliver it. The computer shop sent the disk on to the user in Savannakhet, and as far as I know, that was that.

That’s how debugging used to work in the nearly olden days. None of these fancy remote desktop VPN SSH thingies. It was a lot more fun.

Windows 10 shows a blank screen after login

Quick post for future reference and in case it helps other users.

Got to work this morning, and couldn’t login to my Windows 10 machine. Windows Updates had been installed overnight (oh dear).

The symptoms were slightly different to most of the other reports I’ve seen online (Google Windows 10 blank screen after login). I could login on one account all fine, but my primary account would just go to a black screen (with cursor). Waited quite some time (10 minutes or more) with no change. Rebooted. No change.

Pressing Ctrl+Alt+Del would bring up Windows Security, and from there I could switch user or go back to the lock screen, but attempting to log out would fail: the “wait for logout” spin appeared but never logged out. Other options from that screen had no effect.

I vaguely tried multi monitor options (Windows+P key combination), typing password into the blank screen, and other magic mummery without effect (and without expectation of success).

Eventually I logged in on the working account, turned off Windows firewall, enabled Remote Registry, and got to work with SysInternals tools. I logged back out of that account and used the command pslist \\machine -t from a working computer to see what processes were running.


pslist v1.3 - Sysinternals PsList
Copyright (C) 2000-2012 Mark Russinovich
Sysinternals - www.sysinternals.com

Process information for <machine>:

Name                             Pid Pri Thd  Hnd      VM      WS    Priv
Idle                               0   0   4    0      64       4       0
  System                           4   8 141 1217  393012  271644    1560
    smss                         316  11   3   49    4864     592     376
csrss                            428  13  11  415  163348    3240    1620
wininit                          500  13   4   91   43984    2792    1232
  services                       572   9   9  294   35020    7900    4648
    svchost                      304   8  35  956  184400   23016   13240
    svchost                      356   8  17  321   82168    6432    3248
    svchost                      496   8  26 1043  204540   24268   20616
    svchost                      748   8  30  773   87432   16040   10172
      ShellExperienceHost       7032   8  44  876  369236   35880   58392
      WmiPrvSE                  8876   8   6  144   33264    8172    1880
      SearchUI                  9820   8  27  900  362948   37788   48896
    svchost                      812   8  18  528   59364   11544    9764
    svchost                      936   8  52 1346 1417096   26592   22524
    svchost                      968   8  78 4722 1321944   71912  797808
      taskhostw                 3952   8   8  291  275572   10268    6492
    svchost                     1064   8  37 1130  719392   25880   12960
      WUDFHost                  1816   8   8  476   38964    3852    2212
      dasHost                   2084   8  13  302   37880    7968    3764
    mvbtrcsvcx64                1108   8   3  107   58624    1608    1124
    svchost                     1344   8  14  527  217200   29352   56716
    svchost                     1348   8  29  605  167772   23016   18484
    spoolsv                     1792   8  24  576   91808   13972   11384
    svchost                     2156   8  48  485 1200004   19240    9308
    MsMpEng                     2268   8  45 1122  523848  127360  144964
    svchost                     2684   8  15 3980  303696   53068   95100
    NisSrv                      3084   8  10  287   71416   10276   23812
    SearchIndexer               5800   8  50 1045  434516   77688   95752
      SearchProtocolHost        9568   4  10  353   74552   12456    2240
      SearchFilterHost          9856   4   4  114   35096    6560    1208
    officeclicktorun            8500   8  18 3570  213028   27932   48116
    svchost                    11928   8   5  128   23372    6500    1276
  lsass                          580   9  14 1272   58540   17008    9876
csrss                           3356  13  15  634  160220   16276    2520
winlogon                        3396  13   6  229   65460   12380    2472
  dwm                           3520  13  12  332  282440   34416   62088

Nothing obviously wrong that I could see. I then ran pskill \\machine 3396 to kill winlogon, in the hope that at least that would cause the account to log out.

And of course it did.

The surprise was that after that I was able to login…

Unfortunately, I have not yet traced what caused the lock but at least this was an answer for me, which hopefully may help someone else one day.

Possibly related: I did find this in the Event Log:

Application pop-up: ShellExperienceHost.exe – Application Error : The instruction at 0x00007FFE569D50CB referenced memory at 0x0000000000000000. The memory could not be read.

And, an unhelpful error from win32k:

The description for Event ID 267 from source Win32k cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

The specified resource type cannot be found in the image file

Fixing the incorrect client size for Delphi VCL Forms that use styles

Delphi XE2 and later versions have a robust theming system that has a frustrating flaw: the client width and height are not reliably preserved when the theme changes the border widths for dialog boxes.

For forms that are sizeable this is not typically a problem, but for dialogs laid out statically this can look really ugly, as shown in this Stack Overflow question.

The problem in pictures

Here’s a little form, shown in the Delphi form designer. I’ve placed 4 buttons right in the corners of the form. I’m going to populate the Memo with notes on the form size at runtime.

Design time form with four buttons at corners

When I have no custom style set to the project (i.e. “Windows” style), I can run on a variety of platforms and see the buttons are where they should be. Shown here on Windows 10, Windows 7 and Windows XP (just because):
Windows theme form on Windows 10Windows theme form on Windows 7

Windows theme form on Windows XP

But when I apply a custom style to the project — I chose “Glossy” — then my dialog appears like so, instead:

Glow theme form on Windows 7

You’ll note that the vertical is adjusted but the horizontal is not: Button2 and Button4 are now chopped off on the right. Because we are using themes, the form looks identical on all platforms.

This problem has not been addressed as of Delphi XE8.

The workaround

For my needs, I found a workaround using a class helper, which can be applied to the forms which need to maintain their design-time ClientWidth and ClientHeight. This is typically the case for dialog boxes.

This workaround should be used with care as it has been designed to address a single issue and may have side effects.

  • It will trigger resize events at load time
  • Setting AutoScroll = true means that ClientWidth and ClientHeight are not stored in the form .dfm, and so this does not work.
  • It may not address other layout issues such as scaled elements scaling wrongly (I haven’t tested this).
type
  TFormHelper = class helper for Vcl.Forms.TCustomForm
  private
    procedure RestoreDesignClientSize;
  end;

procedure TfrmTestSize.FormCreate(Sender: TObject);
begin
  RestoreDesignClientSize;
end;

{ TFormHelper }

procedure TFormHelper.RestoreDesignClientSize;
begin
  if BorderStyle in [bsSingle, bsDialog] then
  begin
    if Self.FClientWidth > 0 then ClientWidth := Self.FClientWidth;
    if Self.FClientHeight > 0 then ClientHeight := Self.FClientHeight;
  end;
end;

After adding in this little snippet, the form is now restored to its design-time size, like thus:

Fixed glow theme form on Windows 7

Success 🙂