Category Archives: Windows

Substituted layouts in Text Services Framework

There is a tantalising parameter in the ITfInputProcessorProfileMgr::RegisterProfile function called hklSubstitute.  The description (used to) read “The substitute keyboard layout of this profile.   When this profile is active, the keyboard will use this keyboard layout instead of the default keyboard layout.  This can be NULL if your text service doesn’t change the keyboard layout.”

This functionality also surfaces in ITfInputProcessorProfileSubstituteLayout::GetSubstituteKeyboardLayout and ITfInputProcessorProfiles::SubstituteKeyboardLayout.

From the minimal description in MSDN, this sounds like a feature that we want to use in Keyman Desktop (allowing us to, for instance, use kbdus.dll as a base layout for some Keyman keyboard layouts).  However, we have been unable to get it to work.

  1. It seems the parameter type in the documentation is not quite right.  This parameter is not a HKL that you can obtain from LoadKeyboardLayout or some such.  Instead it needs to be a KLID, a keyboard layout ID, such as 0x00010409 for US Dvorak.
  2. The base layout is limited to one linked to the language the TIP is being installed for.  This means you cannot, for instance, substitute German QWERTZ for a French TIP, even if you wanted to. (And we want to!)
  3. Despite these restrictions, the RegisterProfile function does not check the validity of the parameter and happily tells you that the profile registered OK.
  4. While GetSubstituteKeyboardLayout returns the exact value that you registered, ITfInputProcessorProfileMgr::GetProfile returns 0 for the hklSubstitute member unless you meet the preconditions described above.  Furthermore, hklSubstitute, when available, is not a HKL but a KLID.
  5. Finally, even after all this, it seems that the substitute keyboard layout does not apply.

I received some clarification from Microsoft (edited slightly, thank you Vladimir Smirnov):

In fact, the feature is not that generic as you apparently expect. Its whole purpose was to enable smooth deprecation of IMM32 IMEs and their replacement (substitution) with TSF IMEs (aka TIPs). It doesn’t support substituting an arbitrary plain keyboard layout with another layout or IME (let alone of a different language).

The hklSubstitute is supposed to be a real HKL, but for IMM32 IMEs it actually matches the IME’s KLID. There’s old verification code there which looks for a matching KLID in HKLM for a given hklSubstitute and ignores the latter if there’s no match. That’s why you get 0 from GetProfile for your invalid or unsupported hklSubstitute.

Generic substitution was just never meant to be supported in TSF.

It’s a shame, but we work with what we’ve got. So for Keyman Desktop 9, we wrote a mnemonic layout recompiler that recompiles a Keyman mnemonic layout keyboard, merging it with a given Windows system base layout, at keyboard installation time in Keyman Desktop.

Let’s Encrypt on Windows, redux

A couple of months ago I wrote a script to automatically renew Let’s Encrypt certificates in PowerShell on Windows. The renewal process works really well, however, there is one wrinkle that I did not cover. In this blog post I smooth out that wrinkle!

The wrinkle

When it came time to renew my certificate, a few days ago (well within the 90 day limit for the certificate, mind you), I discovered that the identifier for the certificate had expired. Why was this a problem? Well, I had originally used a manual DNS challenge to validate the identifier with Let’s Encrypt. This worked fine, but of course, manually creating a new identifier and challenge every 90 days completely undoes the benefit of automatic certificate renewal.

Smoothing out the wrinkle

In order to resolve this, I needed to automatically generate and validate an identifier at the same time as I generated a new certificate. The only automatic challenge provider at this time with ACMESharp is the http-01 provider with the IIS handler.

So I updated the script to generate the identifier automatically and validate with the http-01 challenge provider. However, the http-01 challenge provider requires a HTTP request, not a HTTPS request, to the hostname in question, because, and I quote the IETF draft here:

… Because many webservers allocate a default HTTPS virtual host to a particular low-privilege tenant user in a subtle and non-intuitive manner, the challenge must be completed over HTTP, not HTTPS.

I love subtle and non-intuitive computing!

But this was a problem for me, because the secure site I was working on did not have a http endpoint, only a https endpoint, which is the whole reason I used a DNS challenge in the first place.

I finally threw in the towel and I decided to setup a http endpoint, and configure automatic redirection to https (yes, I could go further here). You may find you need to setup an exception for automatic redirection for the .well-known/ root folder (where http challenges are kept as static files). I’ll leave that tweak for you to figure out (it’s just another line or two in your site root web.config).

A deeper wrinkle

The wrinkles get a little deeper, when you look at what happens if you already have a valid challenge response for an identifier. This could happen if you had manually validated another alias for the same identifier using e.g. dns, as I had in the past, or if you are renewing before your existing identifier expires. Because the challenge responses are matched to the hostname, and not the aliases, the previously valid responses continue to be acceptable. In this situation, the existing challenge response was used by the Let’s Encrypt server, and it never checked my shiny new web server challenge endpoint. This meant we needed to see if any existing challenge responses were considered valid, rather than relying on checking the challenge we’d just setup. The script changes handle this scenario.

Just read me the script

The full PowerShell script is shown below; the changes begin at the start of the Try block, and finish after the New-ACMECertificate call. More detail on how the script works is provided in the previous blog post.

import-module ACMESharp

#
# Script parameters
#

$domain = "my.example.com"
$iissitename = "my.example.com"
$certname = "my.example.com-$(get-date -format yyyy-MM-dd--HH-mm)"

#
# Environmental variables
#

$PSEmailServer = "localhost"
$LocalEmailAddress = "[email protected]"
$OwnerEmailAddress = "[email protected]"
$pfxfile = "c:\Admin\Certs\$certname.pfx"
$CertificatePassword = "PASSWORD!"

#
# Script setup - should be no need to change things below this point
#

$ErrorActionPreference = "Stop"
$EmailLog = @()

#
# Utility functions
#

function Write-Log {
  Write-Host $args[0]
  $script:EmailLog  += $args[0]
}

Try {
  Write-Log "Generating a new identifier for $domain"
  New-ACMEIdentifier -Dns $domain -Alias $certname
  
  Write-Log "Completing a challenge via http"
  Complete-ACMEChallenge $certname -ChallengeType http-01 -Handler iis -HandlerParameters @{ WebSiteRef = $iissitename }
  
  Write-Log "Submitting the challenge"
  Submit-ACMEChallenge $certname -ChallengeType http-01
  
  # Check the status of the identifier every 6 seconds until we have an answer; fail after a minute
  $i = 0
  do {
    $identinfo = (Update-ACMEIdentifier $certname -ChallengeType http-01).Challenges | Where-Object {$_.Status -eq "valid"}
    if($identinfo -eq $null) {
      Start-Sleep 6
      $i++
    }
  } until($identinfo -ne $null -or $i -gt 10)

  if($identinfo -eq $null) {
    Write-Log "We did not receive a completed identifier after 60 seconds"
    $Body = $EmailLog | out-string
    Send-MailMessage -From $LocalEmailAddress -To $OwnerEmailAddress -Subject "Attempting to renew Let's Encrypt certificate for $domain" -Body $Body
    Exit
  }
  
  # We now have a new identifier... so, let's create a certificate
  Write-Log "Attempting to renew Let's Encrypt certificate for $domain"

  # Generate a certificate
  Write-Log "Generating certificate for $domain"
  New-ACMECertificate $certname -Generate -Alias $certname

  # Submit the certificate
  Submit-ACMECertificate $certname

  # Check the status of the certificate every 6 seconds until we have an answer; fail after a minute
  $i = 0
  do {
    $certinfo = Update-AcmeCertificate $certname
    if($certinfo.SerialNumber -eq "") {
      Start-Sleep 6
      $i++
    }
  } until($certinfo.SerialNumber -ne "" -or $i -gt 10)

  if($i -gt 10) {
    Write-Log "We did not receive a completed certificate after 60 seconds"
    $Body = $EmailLog | out-string
    Send-MailMessage -From $LocalEmailAddress -To $OwnerEmailAddress -Subject "Attempting to renew Let's Encrypt certificate for $domain" -Body $Body
    Exit
  }

  # Export Certificate to PFX file
  Get-ACMECertificate $certname -ExportPkcs12 $pfxfile -CertificatePassword $CertificatePassword

  # Import the certificate to the local machine certificate store 
  Write-Log "Import pfx certificate $pfxfile"
  $certRootStore = "LocalMachine"
  $certStore = "My"
  $pfx = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2
  $pfx.Import($pfxfile,$CertificatePassword,"Exportable,PersistKeySet,MachineKeySet") 
  $store = New-Object System.Security.Cryptography.X509Certificates.X509Store($certStore,$certRootStore) 
  $store.Open('ReadWrite')
  $store.Add($pfx) 
  $store.Close() 
  $certThumbprint = $pfx.Thumbprint

  # Bind the certificate to the requested IIS site (all https bindings)
  Write-Log "Bind certificate with Thumbprint $certThumbprint"
  $obj = get-webconfiguration "//sites/site[@name='$iissitename']"
  for($i = 0; $i -lt $obj.bindings.Collection.Length; $i++) {
    $binding = $obj.bindings.Collection[$i]
    if($binding.protocol -eq "https") {
      $method = $binding.Methods["AddSslCertificate"]
      $methodInstance = $method.CreateInstance()
      $methodInstance.Input.SetAttributeValue("certificateHash", $certThumbprint)
      $methodInstance.Input.SetAttributeValue("certificateStoreName", $certStore)
      $methodInstance.Execute()
    }
  }

  # Remove expired LetsEncrypt certificates for this domain
  Write-Log "Remove old certificates"
  $certRootStore = "LocalMachine"
  $certStore = "My"
  $date = Get-Date
  $store = New-Object System.Security.Cryptography.X509Certificates.X509Store($certStore,$certRootStore) 
  $store.Open('ReadWrite')
  foreach($cert in $store.Certificates) {
    if($cert.Subject -eq "CN=$domain" -And $cert.Issuer.Contains("Let's Encrypt") -And $cert.Thumbprint -ne $certThumbprint) {
      Write-Log "Removing certificate $($cert.Thumbprint)"
      $store.Remove($cert)
    }
  }
  $store.Close() 

  # Finished
  Write-Log "Finished"
  $Body = $EmailLog | out-string
  Send-MailMessage -From $LocalEmailAddress -To $OwnerEmailAddress -Subject "Let's Encrypt certificate renewed for $domain" -Body $Body
} Catch {
  Write-Host $_.Exception
  $ErrorMessage = $_.Exception | format-list -force | out-string
  $EmailLog += "Let's Encrypt certificate renewal for $domain failed with exception`n$ErrorMessage`r`n`r`n"
  $Body = $EmailLog | Out-String
  Send-MailMessage -From $LocalEmailAddress -To $OwnerEmailAddress -Subject "Let's Encrypt certificate renewal for $domain failed with exception" -Body $Body
  Exit
}

Side note: I discovered it’s important to let ACMESharp do its thing in the script and not try and do anything with it in another process because it tends to fall over with an exception if some other process is accessing its vault when it wants to.

Another minor update (18 Feb 2017): my $alias in my live script happened to be the same as my $domain; if they differed, then the script would fail. The $alias variable is no longer needed and has been removed from the script above. Thank you to BdN3504 for reporting this in the comments!

Automating certificate renewal with Let’s Encrypt and ACMESharp on Windows

 

UPDATE: 10 February 2017! I’ve updated the script in a new blog post to handle identifer expiry. 

 

Let’s Encrypt is a free, automated, and open Certificate Authority. And it is awesome. It is being used by over 15 million domains already to date.

le-logo-standard

Let’s Encrypt is a certificate authority. So that means that they issue certificates, specifically for secure https (TLS) websites. Lots of other organisations do this as well. But two things stand out about Let’s Encrypt. First, it’s free! Given that I was paying over $100/year for a certificate for one of my sites until recently, that’s a big win already. The second is, it’s automated!

The automated bit cannot be understated. It means that the first time I use Let’s Encrypt, I have to do a bunch of setup. But from then on, I no longer have to remember the arcane and complicated process of generating a certificate request, uploading it to a CA, waiting for the CA to process the request, and finally importing the certificate along with all the incidentals such as intermediate certificates.

However, the one thing about Let’s Encrypt that has stopped me using it so far is that I run some of my sites on IIS on Windows, but Let’s Encrypt is very *nix-focused. While there are clients for Windows, none of them are very complete and so it’s been a bit of hit and miss using them.

The most up-to-date client/library that I have found appears to be ACMESharp. This library is a PowerShell module, and while there is a GUI front end available, I haven’t used the GUI. I’ve worked entirely with the PowerShell module. ACMESharp is pretty flexible and covers everything I need, except one thing: renewals. It has no built-in automated renewal support. So I rolled my own.

I did most of my work in the Let’s Encrypt staging environment, after foolishly starting in the live environment and rapidly hitting the duplicate certificate rate limit. I recommend you do your testing in the staging environment also!

The easiest way to work in the staging environment is to setup a separate vault profile for ACMESharp. I ended up using my :user profile for staging and my :sys profile for the live host. To specify which vault profile you want to use, it’s best to use an environment variable, as otherwise you’ll inevitably forget to append the -VaultProfile parameter to one of your setup commands and leave yourself in a bit of a mess:

$env:ACMESHARP_VAULT_PROFILE=":user"

What follows is a script setup for my servers, but which should work for most Windows PowerShell scenarios. I have set it up to email me on success or failure; I’ll be watching it over the next little while to ensure that it gets things right. I have set it up as a scheduled task to run every 60 days, per Let’s Encrypt’s recommendation.

The script assumes you have already followed the ACMESharp Quick Start to configure your environment. The variables at the top of the script could be configured as script parameters, but for simplicity I’ve just put them at the top of the script. The script will request a new certificate, import the newly issued certificate into the localmachine certificate store, then assign it to all https bindings on the specified web site instance. Finally, it will delete all expired certificates associated with the domain in question.

UPDATE: 10 February 2017! Please see the revised script!

import-module ACMESharp

#
# Script parameters
#

$domain = "my.example.com"
$alias = "my.example.com-01"
$iissitename = "my.example.com"
$certname = "my.example.com-$(get-date -format yyyy-MM-dd--HH-mm)"

#
# Environmental variables
#

$PSEmailServer = "localhost"
$LocalEmailAddress = "[email protected]"
$OwnerEmailAddress = "[email protected]"
$pfxfile = "c:\Admin\Certs\$certname.pfx"
$CertificatePassword = "PASSWORD!"

#
# Script setup - should be no need to change things below this point
#

$ErrorActionPreference = "Stop"
$EmailLog = @()

#
# Utility functions
#

function Write-Log {
  Write-Host $args[0]
  $script:EmailLog  += $args[0]
}

Try {
  Write-Log "Attempting to renew Let's Encrypt certificate for $domain"

  # Generate a certificate
  Write-Log "Generating certificate for $alias"
  New-ACMECertificate ${alias} -Generate -Alias $certname

  # Submit the certificate
  Submit-ACMECertificate $certname

  # Check the status of the certificate every 6 seconds until we have an answer; fail after a minute
  $i = 0
  do {
    $certinfo = Update-AcmeCertificate $certname
    if($certinfo.SerialNumber -ne "") {
      Start-Sleep 6
      $i++
    }
  } until($certinfo.SerialNumber -ne "" -or $i -gt 10)

  if($i -gt 10) {
    Write-Log "We did not receive a completed certificate after 60 seconds"
    $Body = $EmailLog | out-string
    Send-MailMessage -From $LocalEmailAddress -To $OwnerEmailAddress -Subject "Attempting to renew Let's Encrypt certificate for $domain" -Body $Body
    Exit
  }

  # Export Certificate to PFX file
  Get-ACMECertificate $certname -ExportPkcs12 $pfxfile -CertificatePassword $CertificatePassword

  # Import the certificate to the local machine certificate store 
  Write-Log "Import pfx certificate $pfxfile"
  $certRootStore = "LocalMachine"
  $certStore = "My"
  $pfx = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2
  $pfx.Import($pfxfile,$CertificatePassword,"Exportable,PersistKeySet,MachineKeySet") 
  $store = New-Object System.Security.Cryptography.X509Certificates.X509Store($certStore,$certRootStore) 
  $store.Open('ReadWrite')
  $store.Add($pfx) 
  $store.Close() 
  $certThumbprint = $pfx.Thumbprint

  # Bind the certificate to the requested IIS site (all https bindings)
  Write-Log "Bind certificate with Thumbprint $certThumbprint"
  $obj = get-webconfiguration "//sites/site[@name='$iissitename']"
  for($i = 0; $i -lt $obj.bindings.Collection.Length; $i++) {
    $binding = $obj.bindings.Collection[$i]
    if($binding.protocol -eq "https") {
      $method = $binding.Methods["AddSslCertificate"]
      $methodInstance = $method.CreateInstance()
      $methodInstance.Input.SetAttributeValue("certificateHash", $certThumbprint)
      $methodInstance.Input.SetAttributeValue("certificateStoreName", $certStore)
      $methodInstance.Execute()
    }
  }

  # Remove expired LetsEncrypt certificates for this domain
  Write-Log "Remove old certificates"
  $certRootStore = "LocalMachine"
  $certStore = "My"
  $date = Get-Date
  $store = New-Object System.Security.Cryptography.X509Certificates.X509Store($certStore,$certRootStore) 
  $store.Open('ReadWrite')
  foreach($cert in $store.Certificates) {
    if($cert.Subject -eq "CN=$domain" -And $cert.Issuer.Contains("Let's Encrypt") -And $cert.Thumbprint -ne $certThumbprint) {
      Write-Log "Removing certificate $($cert.Thumbprint)"
      $store.Remove($cert)
    }
  }
  $store.Close() 

  # Finished
  Write-Log "Finished"
  $Body = $EmailLog | out-string
  Send-MailMessage -From $LocalEmailAddress -To $OwnerEmailAddress -Subject "Let's Encrypt certificate renewed for $domain" -Body $Body
} Catch {
  Write-Host $_.Exception
  $ErrorMessage = $_.Exception | format-list -force | out-string
  $EmailLog += "Let's Encrypt certificate renewal for $domain failed with exception`n$ErrorMessage`r`n`r`n"
  $Body = $EmailLog | Out-String
  Send-MailMessage -From $LocalEmailAddress -To $OwnerEmailAddress -Subject "Let's Encrypt certificate renewal for $domain failed with exception" -Body $Body
  Exit
}

I guess I’ll find out in 60 days if the script still works! With many thanks to the users of StackOverflow, and other bloggers, for working code samples which saved me a lot of time reading reference documentation for so many of the different bits of glue here, from exception management, through to sending emails with PowerShell, through to assigning certificates to IIS website bindings, and more…

Update 2 December 2016: The original script had a bug that didn’t affect use with IIS. However, when I tried to use the Let’s Encrypt certificate with another program, in this case MailEnable, I found that the program did not have access to the certificate, even though it seemed it should have had.

When I started the relevant MailEnable service, it would display an error such as:

12/02/16 20:13:00 **** Error 0x8009030d returned by AcquireCredentialsHandle
12/02/16 20:13:00 **** Error creating credentials object for SSL session
12/02/16 20:13:00 Unable to locate or bind to certificate with name "my.example.com"

I checked the permissions and various other factors but only when I did a deep comparison of the details of a working certificate against the one that was failing, using certutil, as suggested in that blog linked above, did I spot the problem. The problem lay in the following CRYPT_KEY_PROV_INFO structure:

CERT_KEY_PROV_INFO_PROP_ID(2):
    Key Container = {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
  Unique container name: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    Provider = Microsoft Enhanced Cryptographic Provider v1.0
    ProviderType = 1
    Flags = 0
    KeySpec = 1 -- AT_KEYEXCHANGE

When I compared this against a working certificate, I saw that the Flags member had a value of 20 (hex), not 0. 0x20 turns out to be CRYPT_MACHINE_KEYSET. Because I was missing the MachineKeySet flag in the certificate import call, this meant that the key was stored under the Administrator user’s keyset instead of the machine keyset. IIS coped with this, but not MailEnable, which runs under a more restricted user’s credentials.

I have also updated the script to delete all Let’s Encrypt certificates for the domain that have been obsoleted by the new certificate, rather than just certificates that have expired (-And $cert.NotAfter -lt $date), mostly to avoid the risk of accidentally selecting an old certificate manually when doing configuration via UI.

On my servers, I’ve also added a section to the end of the script that restarts various services that depend on the certificate and will not use a new certificate until after restarting.

Don’t forget to navigate to about:blank when embedding IWebBrowser2

Today I spent several hours trying to figure out why an embedded web browser component (in this case TEmbeddedWB) in a Delphi test app never received the appropriate IHttpSecurity and IWindowForBindingUI QueryService requests.

I was doing this in order to provide more nuanced handling of self-signed certificates in an intranet context. We all do this, right? Here the term “nuanced” means “Of course I trust self signed certificates on my intranet, don’t you?” Feel free to rant and rave on this. 😉

But no matter what I did, what incantations I tried, or what StackOverflow posts I perused, I was unable to find an answer. Until finally I stumbled on a side comment in a thread from 2010. Igor Tandetnik notes that:

Right after creating the control, navigate it to about:blank. Right after that, navigate it to the page you wanted to go to. It’s a known problem that IServiceProvider doesn’t work for the very first navigation.

And this was something that I kinda knew in the back of my head, but of course had forgotten. Thank you Igor.

This post would not be complete without some splendiferous code. Just for reference, it’s so simple if you don’t blank out and forget about:blank.

unit InsecureBrowser;

interface

uses
  Winapi.Windows,
  Winapi.Messages,
  Winapi.Urlmon,
  Winapi.WinInet,
  System.SysUtils,
  System.Variants,
  System.Classes,
  Vcl.Graphics,
  Vcl.Controls,
  Vcl.Forms,
  Vcl.Dialogs,
  Vcl.OleCtrls,
  Vcl.StdCtrls,
  SHDocVw_EWB,
  EwbCore,
  EmbeddedWB;

type
  TInsecureBrowserForm = class(TForm, IHttpSecurity, IWindowForBindingUI)
    web: TEmbeddedWB;
    cmdGoInsecure: TButton;
    procedure webQueryService(Sender: TObject; const [Ref] rsid,
      iid: TGUID; var Obj: IInterface);
    procedure FormCreate(Sender: TObject);
    procedure cmdGoInsecureClick(Sender: TObject);
  private
    { IWindowForBindingUI }
    function GetWindow(const guidReason: TGUID; out hwnd): HRESULT; stdcall;

    { IHttpSecurity }
    function OnSecurityProblem(dwProblem: Cardinal): HRESULT; stdcall;
  end;

var
  InsecureBrowserForm: TInsecureBrowserForm;

implementation

{$R *.dfm}

function TInsecureBrowserForm.GetWindow(const guidReason: TGUID;
  out hwnd): HRESULT;
begin
  Result := S_FALSE;
end;

function TInsecureBrowserForm.OnSecurityProblem(dwProblem: Cardinal): HRESULT;
begin
  if (dwProblem = ERROR_INTERNET_INVALID_CA) or
     (dwProblem = ERROR_INTERNET_SEC_CERT_CN_INVALID)
    then Result := S_OK
    else Result := E_ABORT;
end;

procedure TInsecureBrowserForm.webQueryService(Sender: TObject;
  const [Ref] rsid, iid: TGUID; var Obj: IInterface);
begin
  if IsEqualGUID(IID_IWindowForBindingUI, iid) then
    Obj := Self as IWindowForBindingUI
  else if IsEqualGUID(IID_IHttpSecurity, iid) then
    Obj := Self as IHttpSecurity;
end;

procedure TInsecureBrowserForm.cmdGoInsecureClick(Sender: TObject);
begin
  web.Navigate('https://evil.intranet.site/');
end;

procedure TInsecureBrowserForm.FormCreate(Sender: TObject);
begin
  web.Navigate('about:blank');
end;

end.

 

Windbg and Delphi – a quick reference

This is a list of my blog posts on using WinDBG with Delphi apps, mostly for my reference. Internally, I use a version of tds2dbg with some private modifications, but you should have reasonable results with the public version.

WinDBG and Delphi exceptions

WinDBG and Delphi exceptions in x64

Locating Delphi exceptions in a live session or dump using WinDbg

Debugging a stalled Delphi process with Windbg and memory searches

Finding class instances in a Delphi process using WinDbg

More windbg tricks with Delphi – how to ignore specific exceptions (Jan 2016)

I also have some other posts that talk about WinDBG and/or Delphi which can be helpful for illustrative purposes:

Another partial malware diagnosis

Detecting the Citadel Trojan with an Application Failure

When characters go astray: diagnosing missing characters when printing with IE9

WaitForSingleObject. Why you should never use it.

IE11, Windbg, JavaScript = joy

 

 

 

 

Windows 10 shows a blank screen after login

Quick post for future reference and in case it helps other users.

Got to work this morning, and couldn’t login to my Windows 10 machine. Windows Updates had been installed overnight (oh dear).

The symptoms were slightly different to most of the other reports I’ve seen online (Google Windows 10 blank screen after login). I could login on one account all fine, but my primary account would just go to a black screen (with cursor). Waited quite some time (10 minutes or more) with no change. Rebooted. No change.

Pressing Ctrl+Alt+Del would bring up Windows Security, and from there I could switch user or go back to the lock screen, but attempting to log out would fail: the “wait for logout” spin appeared but never logged out. Other options from that screen had no effect.

I vaguely tried multi monitor options (Windows+P key combination), typing password into the blank screen, and other magic mummery without effect (and without expectation of success).

Eventually I logged in on the working account, turned off Windows firewall, enabled Remote Registry, and got to work with SysInternals tools. I logged back out of that account and used the command pslist \\machine -t from a working computer to see what processes were running.


pslist v1.3 - Sysinternals PsList
Copyright (C) 2000-2012 Mark Russinovich
Sysinternals - www.sysinternals.com

Process information for <machine>:

Name                             Pid Pri Thd  Hnd      VM      WS    Priv
Idle                               0   0   4    0      64       4       0
  System                           4   8 141 1217  393012  271644    1560
    smss                         316  11   3   49    4864     592     376
csrss                            428  13  11  415  163348    3240    1620
wininit                          500  13   4   91   43984    2792    1232
  services                       572   9   9  294   35020    7900    4648
    svchost                      304   8  35  956  184400   23016   13240
    svchost                      356   8  17  321   82168    6432    3248
    svchost                      496   8  26 1043  204540   24268   20616
    svchost                      748   8  30  773   87432   16040   10172
      ShellExperienceHost       7032   8  44  876  369236   35880   58392
      WmiPrvSE                  8876   8   6  144   33264    8172    1880
      SearchUI                  9820   8  27  900  362948   37788   48896
    svchost                      812   8  18  528   59364   11544    9764
    svchost                      936   8  52 1346 1417096   26592   22524
    svchost                      968   8  78 4722 1321944   71912  797808
      taskhostw                 3952   8   8  291  275572   10268    6492
    svchost                     1064   8  37 1130  719392   25880   12960
      WUDFHost                  1816   8   8  476   38964    3852    2212
      dasHost                   2084   8  13  302   37880    7968    3764
    mvbtrcsvcx64                1108   8   3  107   58624    1608    1124
    svchost                     1344   8  14  527  217200   29352   56716
    svchost                     1348   8  29  605  167772   23016   18484
    spoolsv                     1792   8  24  576   91808   13972   11384
    svchost                     2156   8  48  485 1200004   19240    9308
    MsMpEng                     2268   8  45 1122  523848  127360  144964
    svchost                     2684   8  15 3980  303696   53068   95100
    NisSrv                      3084   8  10  287   71416   10276   23812
    SearchIndexer               5800   8  50 1045  434516   77688   95752
      SearchProtocolHost        9568   4  10  353   74552   12456    2240
      SearchFilterHost          9856   4   4  114   35096    6560    1208
    officeclicktorun            8500   8  18 3570  213028   27932   48116
    svchost                    11928   8   5  128   23372    6500    1276
  lsass                          580   9  14 1272   58540   17008    9876
csrss                           3356  13  15  634  160220   16276    2520
winlogon                        3396  13   6  229   65460   12380    2472
  dwm                           3520  13  12  332  282440   34416   62088

Nothing obviously wrong that I could see. I then ran pskill \\machine 3396 to kill winlogon, in the hope that at least that would cause the account to log out.

And of course it did.

The surprise was that after that I was able to login…

Unfortunately, I have not yet traced what caused the lock but at least this was an answer for me, which hopefully may help someone else one day.

Possibly related: I did find this in the Event Log:

Application pop-up: ShellExperienceHost.exe – Application Error : The instruction at 0x00007FFE569D50CB referenced memory at 0x0000000000000000. The memory could not be read.

And, an unhelpful error from win32k:

The description for Event ID 267 from source Win32k cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

The specified resource type cannot be found in the image file

The case of the unexplained: When no Windows apps (aka Windows Store apps) will start

Yes, I’m shamelessly stealing @MarkRussinovich‘s blog series title for this post!

One of my machines running Windows 10 here would not run any Windows apps (formerly known as Universal apps, Metro, Modern UI and I’m not sure if I’ve missed any names). Classic desktop apps would work fine.

Finding Microsoft Edge in the Start Menu

I’d click the link to the app in the Start Menu (those missing names may be another case to chase!), and the app would flash onto the screen and then almost immediately disappear.

Microsoft Edge starts and immediately exits

Of course this was more than a little bit frustrating, with no hints as to how to resolve the problem. Checking event logs and reliability provided no pointers towards solutions.

Reliability Monitor is aware of the problem but doesn't know how to fix it

After a couple of pointless web searches (“Edge won’t start”, what was I thinking?), and a bizarre side trip into deep conspiracy theories on Microsoft forums, I realised it was time to break out Procmon to try and trace the problem.

Procmon to the rescue

Procmon lets you watch and log events happening on your file system, registry and network in real time. Running Procmon for even just a minute will often generate hundreds of thousands of events, so it’s fantastic that it includes a powerful set of filtering tools to help you locate specific events.

I started Procmon, and then started, or tried to start Microsoft Edge. After it fell over again, I went back into Procmon, stopped the trace (Ctrl+E), and started to filter the 452,626 events that had been captured in those few seconds.

Initial results in procmon

Procmon’s initial setup includes some filters that exclude events that are of less interest to mere mortals, such as reading and writing to the pagefile, or events caused by Procmon itself. Those default filters cut the results down by 55% to begin with!

While you can use the Filter dialog (Ctrl+L) to manually enter filters, and I often do this for complex filtering, it’s often faster to simply right-click on a cell that you don’t want to see again, and select Exclude <value> from the popup menu. Conversely, if you want to focus on that particular value, select Include <value>.

First, I excluded some processes I wasn’t interested in, such as Explorer.exe, and then excluded a number of different values from the Result column. I was really looking for the ACCESS DENIED result, because that’s probably the most common result that causes apps to crash. I ended up with the following filters on the Result column:

Filtering results to find the errors I am looking for in Procmon

Now, there were few enough events (only 1,625 of them) that I could scan through quickly and hopefully spot something going wrong. And, again Procmon found the answer!

Finding the point where Microsoft Edge encounters an error in Procmon

There you can see MicrosoftEdge.exe receiving an ACCESS DENIED result when trying to read the folder C:\Users\mcdurdin\AppData\Local\Packages\Microsoft.MicrosoftEdge_8wekyb3d8bbwe\AC\Microsoft\Windows\1605653898. Shortly thereafter, we see the WerFault.exe process which is the Windows Error Reporting process that started after Edge decided to crash.

Note: it’s merely serendipitous that WerFault.exe is visible in the filtered results; remember that there are probably thousands of additional events between the highlighted ACCESS DENIED event and the start of the WerFault.exe process, and the only reason it is visible at all is that WerFault itself had received ACCESS DENIED and other results from its own events!

I could alternatively have looked at the process tree (Ctrl+T) to find when the WerFault.exe process started (or the MicrosoftEdge.exe process had stopped) and traced back from there. But usually I find filtering to be a faster way of finding the specific issue.

What’s wrong with this folder?

Now I wanted to figure out what was wrong with this folder. Here’s what I saw on this machine:

C:\Users\mcdurdin\AppData\Local\Packages>icacls Microsoft.MicrosoftEdge_8wekyb3d8bbwe\AC
Microsoft.MicrosoftEdge_8wekyb3d8bbwe\AC NT AUTHORITY\SYSTEM:(I)(OI)(CI)(F)
                                         BUILTIN\Administrators:(I)(OI)(CI)(F)
                                         TAVULTESOFT\mcdurdin:(I)(OI)(CI)(F)

Successfully processed 1 files; Failed processing 0 files

And this is what I saw on a machine where Edge was working:

C:\Users\mcdurdin\AppData\Local\Packages>icacls Microsoft.MicrosoftEdge_8wekyb3d8bbwe\AC
Microsoft.MicrosoftEdge_8wekyb3d8bbwe\AC S-1-15-2-3624051433-2125758914-1423191267-1740899205-1073925389-3782572162-737981194:(OI)(CI)(F)
                                         NT AUTHORITY\SYSTEM:(I)(OI)(CI)(F)
                                         BUILTIN\Administrators:(I)(OI)(CI)(F)
                                         TAVULTESOFT\mcdurdin:(I)(OI)(CI)(F)
                                         Mandatory Label\Low Mandatory Level:(OI)(CI)(NW)

Successfully processed 1 files; Failed processing 0 files

I saw two differences: a missing S-1-15-2-… entry and a missing Low Mandatory Level entry. Now, that S-1-15-2-… entry is an App Package SID. I checked a few other installed app packages, and they were all missing the relevant security settings on this machine. So it wasn’t specific to Edge, but was a general issue on my computer.

At this point, I did find a relevant discussion on Microsoft’s forums that had some answers, but did not solve the general case that I was experiencing.

I  have not been able to find the root cause of this. Lost in the deep dark mists of time it is.

Fixing the problem

To fix the Low integrity level was a pretty straightforward command, run from a command prompt in the %LOCALAPPDATA%\Packages folder:

for /d %d in (*) do icacls %d\AC /setintegritylevel (OI)(CI)L

However, determining the correct SID to add to each folder was a little more work. It turns out that in the registry, there is a mapping between the app’s moniker (so, in this case, the folder names) and the relevant SID at HKEY_CLASSES_ROOT\Local Settings\Software\Microsoft\Windows\CurrentVersion\AppContainer\Mappings. Learn more.
The AppContainer Mappings registry, showing the relationship between SID and Moniker
I sucked a list of those SIDs into a text file with the following command:

reg query "HKCR\Local Settings\Software\Microsoft\Windows\CurrentVersion\AppContainer\Mappings" > m.txt

That gave me a file that looked like this, shown as an image just because:

The contents of m.txt

From there, I wanted to extract just the last part of each key:

for /f "tokens=9 delims=\" %i in (m.txt) do echo %i >> n.txt

Now n.txt looked like:

The contents of n.txt

Now to take each of those and map it to its moniker, and from there update the security on the folder accordingly. That command turned out to be a bit more hoopy.

for /f %a in (n.txt) do for /f "tokens=2*" %b in ('reg query "HKCR\Local Settings\Software\Microsoft\Windows\CurrentVersion\AppContainer\Mappings\%a" /V "Moniker" 2^>NUL ^| FIND "REG_SZ"') DO icacls "%c\AC" /grant *%a:(OI)(CI)(F)

Putting that all together, in a batch file (I’ve combined the integrity level setting and SID grant in this script):

@echo off
del n.txt
reg query "HKCR\Local Settings\Software\Microsoft\Windows\CurrentVersion\AppContainer\Mappings" > m.txt
for /f "tokens=9 delims=\" %%i in (m.txt) do echo %%i >> n.txt
for /f %%a in (n.txt) do for /f "tokens=2*" %%b in ('reg query "HKCR\Local Settings\Software\Microsoft\Windows\CurrentVersion\AppContainer\Mappings\%%a" /V "Moniker" 2^>NUL ^| FIND "REG_SZ"') DO icacls "%%c\AC" /grant *%%a:(OI)(CI)(F) /setintegritylevel (OI)(CI)L

And yay, now Edge starts!

Yay, Microsoft Edge starts now!

One of these days I’ll have to get more into PowerShell, which would probably make some of these scripts a lot easier!

With thanks to examples from http://www.robvanderwoude.com/ntregquery.php which saved a lot of fuss (but warning if you copy and paste: the examples use the wrong caret character, ˆ instead of ^).

GUI Info Utility for Windows

A very quick one today. GUIInfo is a tiny little utility that has a narrow purpose: it shows, in realtime, the current active, focus, foreground and capture window information for Windows.

Why this utility? Debugging focus issues is always frustrating. Attempting to observe current window focus information in a debugger results in the focus changing (to the debugger, of course). You can work around this – use remote debugging, or add logging in the debugger – but it’s a hassle.

Debugging focus issues would make Werner Heisenberg feel right at home.

Screenshot

GUIInfo Screenshot

Features

  • Updates 10 times a second so changes are reflected promptly.
  • Changes are highlighted in green, and slowly fade back to black.
  • Hovering over bold window labels will highlight the relevant window on the screen.
  • Topmost, so you can see the information without needing to try and keep the window visible.
  • Open Source – written in Delphi, MIT license.

Download and Source

Fixing the incorrect client size for Delphi VCL Forms that use styles

Delphi XE2 and later versions have a robust theming system that has a frustrating flaw: the client width and height are not reliably preserved when the theme changes the border widths for dialog boxes.

For forms that are sizeable this is not typically a problem, but for dialogs laid out statically this can look really ugly, as shown in this Stack Overflow question.

The problem in pictures

Here’s a little form, shown in the Delphi form designer. I’ve placed 4 buttons right in the corners of the form. I’m going to populate the Memo with notes on the form size at runtime.

Design time form with four buttons at corners

When I have no custom style set to the project (i.e. “Windows” style), I can run on a variety of platforms and see the buttons are where they should be. Shown here on Windows 10, Windows 7 and Windows XP (just because):
Windows theme form on Windows 10Windows theme form on Windows 7

Windows theme form on Windows XP

But when I apply a custom style to the project — I chose “Glossy” — then my dialog appears like so, instead:

Glow theme form on Windows 7

You’ll note that the vertical is adjusted but the horizontal is not: Button2 and Button4 are now chopped off on the right. Because we are using themes, the form looks identical on all platforms.

This problem has not been addressed as of Delphi XE8.

The workaround

For my needs, I found a workaround using a class helper, which can be applied to the forms which need to maintain their design-time ClientWidth and ClientHeight. This is typically the case for dialog boxes.

This workaround should be used with care as it has been designed to address a single issue and may have side effects.

  • It will trigger resize events at load time
  • Setting AutoScroll = true means that ClientWidth and ClientHeight are not stored in the form .dfm, and so this does not work.
  • It may not address other layout issues such as scaled elements scaling wrongly (I haven’t tested this).
type
  TFormHelper = class helper for Vcl.Forms.TCustomForm
  private
    procedure RestoreDesignClientSize;
  end;

procedure TfrmTestSize.FormCreate(Sender: TObject);
begin
  RestoreDesignClientSize;
end;

{ TFormHelper }

procedure TFormHelper.RestoreDesignClientSize;
begin
  if BorderStyle in [bsSingle, bsDialog] then
  begin
    if Self.FClientWidth > 0 then ClientWidth := Self.FClientWidth;
    if Self.FClientHeight > 0 then ClientHeight := Self.FClientHeight;
  end;
end;

After adding in this little snippet, the form is now restored to its design-time size, like thus:

Fixed glow theme form on Windows 7

Success 🙂

The case of the UAC that Just Wouldn’t

One of my dev machines has long had a weird anomaly where file operations in Explorer that should prompt for UAC, such as copying a file into C:\Program Files, would instead silently fail.

This led to all sorts of issues, from being unable to delete certain files — they’d just obstinately sit there, no matter how much I pressed that Del key — to trying to move folders containing a hidden Thumbs.db file and being unable to move the folder.

My UAC settings were the Windows defaults. Nothing weird these. So I’d always treated put this issue into my “too busy to solve this now” basket. The classic basket case. But today I finally got fed up.

After a quick search for the symptoms on Dr Google returned no results of significance, I decided I needed to trace the cause myself.

Process Monitor to the Rescue

It was time to pull out Process Monitor out of my toolbox again! Process Monitor is a tool from the SysInternals Suite by Microsoft that monitors and logs details on a bunch of different operations on your computer. I use Process Monitor, Procmon for short, all the time to solve problems big and little. But for some reason, it hadn’t crossed my mind until today that I could apply Procmon to this problem.

First, I configured Procmon to filter all events except for those generated by Explorer.exe and Consent.exe. I wasn’t sure if Consent.exe was involved in the problem (Consent.exe being the UAC elevation prompter), but it wouldn’t hurt to include it to start with. Note that all those Exclude filters are default filters setup by Procmon to exclude itself and its friends, removing that confusion from the logs.

Procmon filter

Then I went ahead and tried to copy a file into C:\Program Files (x86). It was just an innocent little text file, but Explorer of course acted like a Buckingham Palace Guard and silently and stolidly ignored its existence.

Source folder  ➔  Dest folder

I used the clipboard Ctrl+C and Ctrl+V to copy and paste (or attempt to paste) the file. I didn’t think the clipboard was at fault because all other UAC-required file operations also failed silently. I could have dragged and dropped, it would have had the same effect.

But now, with procmon, I had captured the communication that went on behind the scenes. All those secret coded winks and nose scratches that told Explorer to fob off any attempts to trigger a UAC prompt. Here’s what I was presented with in the Procmon log.

Procmon start

I searched for the name of my text file (test.txt), and used Procmon’s Highlight tool to highlight every reference to it in the Path column. This made it easy to spot nearby interactions that may have been related, even if they didn’t directly reference the test.txt file itself. You can see below two of the highlighted test.txt lines.

Procmon highlighting

Because there was a lot going on, I filtered out a lot of Operations that I thought were not relevant, such as CloseFile, RegCloseKey, RegQueryKey, ReadFile and WriteFile, among others. This reduced the log considerably and made it easier to spot differences (my screen capture below shows the filtering after it was reset, however — I forgot to capture the filtered trace, sorry).

I decided to also capture a trace on a machine where UAC prompting worked. I then compared the two logs. After scrolling back and forth around the many references to test.txt, I saw that on my dev machine, there was an additional interaction, right before the point where the prompt dialog was presented:

TortoiseShell in Procmon

That’s right, I had a program called TortoiseCVS installed on this machine which hooked into Explorer in a variety of ways. After the FileOperationPrompt references on my second machine, there was no reference to TortoiseCVS. Here’s what it looked like on the other machine:

Procmon on clean machine without TortoiseShell

That was the only visible difference of significance in the logs.

Now for those of you who just knee-jerked into “why on earth are you using CVS?!?”, calm down! This is a story, and I’m telling the story.

I decided that I didn’t really need TortoiseCVS installed and decided to try uninstalling it.

Uninstall Tortoise CVS

Sadly, uninstalling it required a reboot, no doubt to remove its old fashioned hooks into Explorer.

After the reboot, I tried to copy my innocent little text file again.

CopyWorks

Success! I was now presented with the prompt I wanted!

Another case closed thanks to SysInternals Suite and Mark Russinovich!