Category Archives: Development

Notes on a Khmer mobile keyboard for Keyman

While other Khmer keyboards exist for iOS and Android, I wanted to try playing with one myself, given I am currently learning Khmer. Creating a keyboard layout is a great way to rapidly become very familiar with a script!

Keyman Developer makes it easy to play around with the layout of a keyboard and rapidly iterate the design. I was able to turn the original desktop layout into a mobile-optimised layout in under two hours, using only the visual editor.

keyman-developer

The base keyboard comes from the khmer10 Keyman keyboard created by Andrew Cunningham, which is based on the NiDA Khmer keyboard layout. This is a desktop keyboard, which follows a phonetic-style input, with letters placed as far as possible on keys with a similar sound in the English alphabet. For example, ក is on the [k] key.

Design principles

I had a number of goals I wanted to achieve with this keyboard.

A design good for a language learner

I wanted the design to help me remember the script, the sounds and related letters. This may not be optimal for a person fluent in the language, but for me, the NiDA layout’s phonetic-style layout was a good starting point.

Reduce the number of keys …

As the original design is for a desktop keyboard, there are too many keys to fit on a normal mobile layout. As a mobile keyboard should ideally have no more than ten keys in a row, I started with reducing that as one goal.

… But don’t lose all relationship with the desktop keyboard

I tried to avoid moving keys around on the keyboard, or removing keys other than the ones described below. This way, once I do become familiar with the mobile layout, it is not a difficult transition to using the NiDA layout on a desktop computer.

Move symbols and numbers off base layers

A number of non-alphabetic symbols are placed in a seemingly haphazard fashion on both the unshifted and Shift layers. The position of these symbols is probably pragmatic – the keys were available and not used for any other purpose. On a touch layout, we don’t need to maintain this because we can have as many or as few keys as we wish.

I moved all non-alphabetic symbols and numbers to a numeric/symbol layer.

symbol-numeric-layer

Relate sub consonants to base consonants

ក្ក  ក្ខ  ក្គ  ក្ឃ  ក្ង  ក្ច  ក្ឆ  ក្ជ  ក្ឈ  ក្ញ  ក្ដ  ក្ឋ  ក្ឌ  ក្ឍ  ក្ណ  ក្ត  ក្ថ  ក្ទ  ក្ធ  ក្ន  ក្ប  ក្ផ  ក្ព  ក្ភ  ក្ម  ក្យ  ក្រ  ក្ល  ក្វ  ក្ឝ  ក្ឞ  ក្ស  ក្ហ  ក្ឡ  ក្អ

The original keyboard uses the key [j] as a prefix to create a sub consonant, by emitting the Unicode U+17D2 sub consonant marker character. This is a little obscure, and meant that the shapes of sub consonants were never visible on the keyboard.

I wanted to hide this encoding-based knowledge. I have added the sub consonants to the keyboard under a longpress menu for each base consonant, and added a rule to delete both the consonant and the prefix U+17D2 marker together when backspace is pressed. Thus, the average keyboard user need never know about the existence of U+17D2.

longpress-sub-consonant

Independent vowels

As these are less frequently typed than the dependent vowels, they really didn’t need their own key on the keyboard. Again, from a language learner point of view, placing the independent vowels under the related dependent vowel symbol, as longpress menus, helps me to learn the shapes more rapidly. It also means that I don’t confuse the independent vowel shapes with similarly shaped consonants on the layout.

Adding missing characters

The Khmer digits were already on the keyboard, but the Hindu-Arabic numerals were not. I added these as long-press under the Khmer digit keys on the numeric/symbol layer.

Things that are not yet right

This keyboard is nowhere near finished, but it’s now at a good point to start using it, before making further optimisations. After using it for a while, I will have a clearer understanding of what is uncomfortable or awkward to use, and will also probably have better ideas of how to resolve the issues.

Some of the issues I already know about are:

  • Still too many keys per row: some rows have 12 keys. I should try to reduce this to 10 keys.
  • There are vowel combinations that may or may not be necessary. These do not render correctly on most operating systems on the keyboard, but do when used in writing.
  • The backspace key should be either on the top row, or on the third row. It’s only on the second row at present because that was where there was space!
  • The keyboard is missing a number of symbols that I probably won’t be using for now, but should be available for a complete solution, such as the Lek Attak numeric divination symbols. Some archaic letters are also not yet present.
  • The Khmer OS fonts do not use the dotted circle for isolated diacritics, which makes them hard to read on the keyboard.
  • The layers are not precisely the same width, which leads to slightly disconcerting movement on layer switches.
  • The shift key on the shift layer is missing its icon.
  • I just realised the independent vowels are currently missing – I need to re-add those soon!

Screenshot on Android:

Issues with using the keyboard include:

  • iOS has rendering bugs with Khmer, for example ខ្ញុំ on iOS overlaps the two subscript marks.

khmer-on-ios-khnyom-render-bug

 

  • On Android, there are different rendering issues, for example the font is too large for the keys by default (showing the original keyboard as I don’t have an Android device handy for an up-to-date screenshot).

khmer-on-android-render-bug

Get the keyboard + source

Despite these limitations, the keyboard should be usable for typing most Khmer text today, as least on iOS. It certainly covers most of my needs as a language learner today. As such, I’ve made it available for install into Keyman for iPhone and iPad or Keyman for Android.

First, install the app from the appropriate link above. Then click the link below to add the keyboard to your device. I haven’t included a font in the keyboard, so you’ll need a device which already supports rendering the script.

Install the keyboard (version 1.1)

Install the keyboard (version 1.2)

New! Install the Khmer Angkor beta keyboard (version 1.0) – by Makara Sok

I welcome any feedback, of course!

The source of the keyboard is available on GitHub at https://github.com/mcdurdin/experimental-keyboards.

GUI Info Utility for Windows

A very quick one today. GUIInfo is a tiny little utility that has a narrow purpose: it shows, in realtime, the current active, focus, foreground and capture window information for Windows.

Why this utility? Debugging focus issues is always frustrating. Attempting to observe current window focus information in a debugger results in the focus changing (to the debugger, of course). You can work around this – use remote debugging, or add logging in the debugger – but it’s a hassle.

Debugging focus issues would make Werner Heisenberg feel right at home.

Screenshot

GUIInfo Screenshot

Features

  • Updates 10 times a second so changes are reflected promptly.
  • Changes are highlighted in green, and slowly fade back to black.
  • Hovering over bold window labels will highlight the relevant window on the screen.
  • Topmost, so you can see the information without needing to try and keep the window visible.
  • Open Source – written in Delphi, MIT license.

Download and Source

An update for EncodeURIComponent

Way way back in the dark ages of Delphi XE2, I wrote a function to encode components of a URI. Now, this function has been updated for use on mobile platforms, by Nicolas Dusart, and I quote Nicolas:

I had to make some modifications on it to compile for the mobile platforms, as the strings are 0-based on these platforms.

I also modified it to escape non-ASCII characters using their UTF-8 encoding as the standards advices. For multi-bytes characters, each byte is percent-encoded as usual.

Here’s the code, maybe it could interests you and the future readers of that article 🙂

And here’s Nicolas’s updated function in all its glory:

function EncodeURIComponent(const ASrc: string): string;
const
  HexMap: string = '0123456789ABCDEF';

  function IsSafeChar(ch: Byte): Boolean;
  begin
    if (ch >= 48) and (ch <= 57) then Result := True    // 0-9
    else if (ch >= 65) and (ch <= 90) then Result := True  // A-Z
    else if (ch >= 97) and (ch <= 122) then Result := True  // a-z
    else if (ch = 33) then Result := True // !
    else if (ch >= 39) and (ch <= 42) then Result := True // '()*
    else if (ch >= 45) and (ch <= 46) then Result := True // -.
    else if (ch = 95) then Result := True // _
    else if (ch = 126) then Result := True // ~
    else Result := False;
  end;

var
  I, J: Integer;
  Bytes: TBytes;
begin
  Result := '';
    
  Bytes := TEncoding.UTF8.GetBytes(ASrc);
    
  I := 0;
  J := Low(Result);

  SetLength(Result, Length(Bytes) * 3); // space to %xx encode every byte

  while I < Length(Bytes) do
  begin
    if IsSafeChar(Bytes[I]) then
    begin
      Result[J] := Char(Bytes[I]);
      Inc(J);
    end
    else
    begin
      Result[J] := '%';
      Result[J+1] := HexMap[(Bytes[I] shr 4) + Low(ASrc)];
      Result[J+2] := HexMap[(Bytes[I] and 15) + Low(ASrc)];
      Inc(J,3);
    end;
    Inc(I);
  end;
    
  SetLength(Result, J-Low(ASrc));
end;

Many thanks, Nicolas 🙂

Fixing the incorrect client size for Delphi VCL Forms that use styles

Delphi XE2 and later versions have a robust theming system that has a frustrating flaw: the client width and height are not reliably preserved when the theme changes the border widths for dialog boxes.

For forms that are sizeable this is not typically a problem, but for dialogs laid out statically this can look really ugly, as shown in this Stack Overflow question.

The problem in pictures

Here’s a little form, shown in the Delphi form designer. I’ve placed 4 buttons right in the corners of the form. I’m going to populate the Memo with notes on the form size at runtime.

Design time form with four buttons at corners

When I have no custom style set to the project (i.e. “Windows” style), I can run on a variety of platforms and see the buttons are where they should be. Shown here on Windows 10, Windows 7 and Windows XP (just because):
Windows theme form on Windows 10Windows theme form on Windows 7

Windows theme form on Windows XP

But when I apply a custom style to the project — I chose “Glossy” — then my dialog appears like so, instead:

Glow theme form on Windows 7

You’ll note that the vertical is adjusted but the horizontal is not: Button2 and Button4 are now chopped off on the right. Because we are using themes, the form looks identical on all platforms.

This problem has not been addressed as of Delphi XE8.

The workaround

For my needs, I found a workaround using a class helper, which can be applied to the forms which need to maintain their design-time ClientWidth and ClientHeight. This is typically the case for dialog boxes.

This workaround should be used with care as it has been designed to address a single issue and may have side effects.

  • It will trigger resize events at load time
  • Setting AutoScroll = true means that ClientWidth and ClientHeight are not stored in the form .dfm, and so this does not work.
  • It may not address other layout issues such as scaled elements scaling wrongly (I haven’t tested this).
type
  TFormHelper = class helper for Vcl.Forms.TCustomForm
  private
    procedure RestoreDesignClientSize;
  end;

procedure TfrmTestSize.FormCreate(Sender: TObject);
begin
  RestoreDesignClientSize;
end;

{ TFormHelper }

procedure TFormHelper.RestoreDesignClientSize;
begin
  if BorderStyle in [bsSingle, bsDialog] then
  begin
    if Self.FClientWidth > 0 then ClientWidth := Self.FClientWidth;
    if Self.FClientHeight > 0 then ClientHeight := Self.FClientHeight;
  end;
end;

After adding in this little snippet, the form is now restored to its design-time size, like thus:

Fixed glow theme form on Windows 7

Success 🙂

Concatenating strings in SQL Server, or undefined behaviour by design

We just ran into a funny problem here, using a “tried and true” technique in SQL Server to concatenate strings. I use the quotes advisedly. This technique is often suggested on blogs and sites such as Stack Overflow, but we found out (by painful experience) that it is not to be relied on.

Update, 9 Mar 2016: Bruce Gordon from Webucator has turned this into a great little 5 minute video. Thanks Bruce! I don’t know anything much about Webucator, but they are doing some good stuff with creating well-attributed videos about blog posts such as this one and apparently they do SQL Server training.

The problem

So, given the following setup:

CREATE TABLE BadConcat (
  BadConcatID INT NOT NULL,
  Description NVARCHAR(100) NOT NULL,
  SortIndex INT NOT NULL
  CONSTRAINT PK_BadConcat PRIMARY KEY CLUSTERED (BadConcatID)
)
GO

INSERT BadConcat 
  SELECT 1, 'First Item', 1 union all
  SELECT 2, 'Second Item', 2 union all
  SELECT 3, 'Third Item', 3
GO

We need to concatenate those Descriptions. I have avoided fine tuning such as dropping the final comma or handling NULLs for the purpose of this example. This example shows one of the most commonly given answers to the problem:

DECLARE @Summary NVARCHAR(100) = ''

SELECT @Summary = @Summary + ec.Description + ', '
FROM BadConcat ec
ORDER BY ec.SortIndex 

PRINT @Summary

And we get the following:

First Item, Second Item, Third Item, 

And that works fine. However, if we want to include a WHERE clause, even if that clause still selects everything, then we suddenly get something weird:

SET @Summary = ''

SELECT @Summary = @Summary + ec.Description + ', '
FROM BadConcat ec
WHERE ec.BadConcatID in (1,2,3)
ORDER BY ec.SortIndex 

PRINT @Summary

Now we get the following:

Third Item, 

What? What has SQL Server done? What’s happened to the first two items?

You’ll probably do what we did, which is to go through and make sure that you are selecting everything properly, which we are, and eventually come to the conclusion that “there must be a bug in SQL Server”.

The answer

It turns out that this iterative concatenation is unsupported functionality. Microsoft Knowledge Base article 287515 states:

You may encounter unexpected results when you apply any operators or expressions to the ORDER BY clause of aggregate concatenation queries.

Now, at first glance that does not directly apply. But we can extrapolate from that, as Microsoft developer support have done, in response to a bug report on SQL Server, to learn that:

The variable assignment with SELECT statement is a proprietary syntax (T-SQL only) where the behavior is undefined or plan dependent if multiple rows are produced

And again, in response to another bug report:

Using assignment operations (concatenation in this example) in queries with ORDER BY clause has undefined behavior. This can change from release to release or even within a particular server version due to changes in the query plan. You cannot rely on this behavior even if there are workarounds.

Some alternative solutions are given, also, in that second report:

The ONLY guaranteed mechanism are the following:

1. Use cursor to loop through the rows in specific order and concatenate the values
2. Use for xml query with ORDER BY to generate the concatenated values
3. Use CLR aggregate (this will not work with ORDER BY clause)

And the article “Concatenating Row Values in Transact-SQL” by Anith Sen goes through some of those solutions in detail. Sadly, none of them are as clean or as easy to understand as that original example.

Another example is given on Stack Overflow, which details how to safely use XML PATH to concatenate, without breaking on the XML special characters &, < and >. Applying that example into my problem code given above, we should use the following:

SELECT @Summary = (
  SELECT ec.Description + ', ' 
  FROM BadConcat ec 
  WHERE ec.BadConcatID in (1,2,3)
  ORDER BY ec.SortIndex 
  FOR XML PATH(''), TYPE
).value('.','varchar(max)')

PRINT @Summary

Voilà.

First Item, Second Item, Third Item, 

A workaround for a record.property getter bug in the Delphi XE2 compiler

We recently ran into a nasty little bug in the Delphi XE2 compiler, that arises with a complex set of conditions:

  1. A record with a string property that has a get function;
  2. A call to this getter that has a constant string appended to the result;
  3. This result passed directly to another arbitrary function.

When these conditions are met (see code below for examples), the compiler generates code that crashes.

Reproducing the bug

The following program is enough to reproduce the bug.

program ustrcatbug;

type
  TWrappedString = record
  private
    FValue: string;
    function GetValue: string;
  public
    property Value: string read GetValue;
  end;

{ TWrappedString }

function TWrappedString.GetValue: string;
begin
  Result := FValue;
end;

function GetWrappedString: TWrappedString;
begin
  Result.FValue := 'Something';
end;

begin
  writeln(GetWrappedString.Value + ' Else');
end.

Looking at the last line of code from that sample in the disassembler, we see the following code:

ustrcatbug.dpr.28: writeln(GetWrappedString.Value + ' Else');
0040712A 8D45EC           lea eax,[ebp-$14]
0040712D E816E9FFFF       call GetWrappedString
00407132 8D55EC           lea edx,[ebp-$14]
00407135 B8C0BB4000       mov eax,$0040bbc0
0040713A 8B0DE8594000     mov ecx,[$004059e8]
00407140 E883D8FFFF       call @CopyRecord
00407145 8D55E8           lea edx,[ebp-$18]
00407148 B8C0BB4000       mov eax,$0040bbc0
0040714D E8D6E8FFFF       call TWrappedString.GetValue
00407152 8D45E8           lea eax,[ebp-$18]
00407155 BAB4714000       mov edx,$004071b4
0040715A E899D6FFFF       call @UStrCat
0040715F 8B10             mov edx,[eax]
00407161 A12C884000       mov eax,[$0040882c]
00407166 E86DC6FFFF       call @Write0UString
0040716B E868C7FFFF       call @WriteLn
00407170 E867BCFFFF       call @_IOTest
ustrcatbug.dpr.29: end.
00407175 33C0             xor eax,eax

The problem arises with these two lines:

0040715A E899D6FFFF       call @UStrCat
0040715F 8B10             mov edx,[eax]

The problem is that eax is not preserved through function calls with Delphi’s default register calling convention. And, as _UStrCat is a procedure, we can make no assumptions about the return value (which is passed in eax):

procedure _UStrCat(var Dest: UnicodeString; const Source: UnicodeString);

The issue does not arise if we avoid using the property:

  writeln(GetWrappedString.GetValue + ' Else');

From this, the compiler generates:

ustrcatbug.dpr.28: writeln(GetWrappedString.GetValue + ' Else');
0040712A 8D45E8           lea eax,[ebp-$18]
0040712D E816E9FFFF       call GetWrappedString
00407132 8D55E8           lea edx,[ebp-$18]
00407135 B8C0BB4000       mov eax,$0040bbc0
0040713A 8B0DE8594000     mov ecx,[$004059e8]
00407140 E883D8FFFF       call @CopyRecord
00407145 B8C0BB4000       mov eax,$0040bbc0
0040714A 8D55EC           lea edx,[ebp-$14]
0040714D E8D6E8FFFF       call TWrappedString.GetValue
00407152 8D45EC           lea eax,[ebp-$14]
00407155 BAB4714000       mov edx,$004071b4
0040715A E899D6FFFF       call @UStrCat
0040715F 8B55EC           mov edx,[ebp-$14]
00407162 A12C884000       mov eax,[$0040882c]
00407167 E86CC6FFFF       call @Write0UString
0040716C E867C7FFFF       call @WriteLn
00407171 E866BCFFFF       call @_IOTest
ustrcatbug.dpr.29: end.
00407176 33C0             xor eax,eax

Where we now see that edx is reloaded from the stack, as it should be:

0040715A E899D6FFFF       call @UStrCat
0040715F 8B55EC           mov edx,[ebp-$14]

Workarounds

There are a number of possible workarounds:

  1. Upgrade to Delphi XE7 – this problem appears to be resolved, though I could not find a QC or RSP report relating to the bug when I searched. Moving to XE7 is not an option for us in the short term: too many code changes, too many unknowns.
  2. Don’t use properties in records. This means changing code to call functions instead: no big deal for the Get, but annoying for the Set half of the function pair.
  3. Patch around the problem by modifying UStrCat to return the address of the Dest parameter.

In the end, I wrote option (3) but we went with option (2) in our code base.

Because UStrCat is implemented in System.pas, it’s difficult to build your own version of the unit. One way to skin the option 3 UStrCat is to copy the implementation of UStrCat and its dependencies (a bunch of memory and string manipulation functions), and monkeypatch at runtime.

In the copied functions, we need to preserve eax through the function calls. This results in the addition of 4 lines of assembly to the UStrCat and UStrAsg functions, pushing the eax register onto the stack and popping it before exit. I haven’t reproduced the code here because the original is copyrighted to Embarcadero, but here are the changes required:

  1. In UStrCat, Add push eax just after the conditional jump to UStrAsg, and pop eax just before the ret instruction.
  2. In UStrAsg, wrap the call to FreeMem with a push eax and pop eax.

The patch function is also pretty straightforward:

procedure MonkeyPatch(OldProc, NewProc: PBYTE);
var
  pBase, p: PBYTE;
  oldProtect: Cardinal;
begin
  p := OldProc;
  pBase := p;

  // Allow writes to this small bit of the code section
  VirtualProtect(pBase, 5, PAGE_EXECUTE_WRITECOPY, oldProtect);

  // First write the long jmp instruction.
  p := pBase;
  p^ := $E9;  // long jmp opcode
  Inc(p);
  PDWord(p)^ := DWORD(NewProc) - DWORD(p) - 4;  // address to jump to, relative to EIP

  // Finally, protect that memory again now that we are finished with it
  VirtualProtect(pBase, 5, oldProtect, oldProtect);
end;

function GetUStrCatAddr: Pointer; assembler;
asm
  lea  eax,[email protected]
end;

initialization
  MonkeyPatch(GetUStrCatAddr, @_UStrCatMonkey);
end.

This solution does give me the heebie jeebies, because we are patching the symptoms of the problem as we’ve seen them arise, without being able to either understand or address the root cause within the compiler. It’s not really possible to guarantee that there won’t be some other code path that causes this solution to come unstuck without really digging deep into the compiler’s code generation.

As noted, this problem appears to be resolved in Delphi XE5 or possibly earlier; however it is unclear if the root cause of the problem has been addressed, or we just got lucky. The issue has been reported as RSP-10255.

How to render combining marks consistently across platforms: a long story

I read today Richard Ishida’s notes on his experiences displaying combining characters in a consistent way across browsers, and recalled with fondness the pain we have experienced with this, both in browsers and in apps, over many years.

When we design a keyboard layout for Keyman, we will often want to show a diacritic mark or combining character from a complex script by itself on a key cap, or show the mark with a substituted placeholder symbol that is distinct from the script itself, to show how the combining character fits with the base letter.

A commonly used placeholder symbol is ◌ U+25CC DOTTED CIRCLE, but there are others that are preferred for some languages.

U+25CC

I show below three attempts at presenting the combining marks on our Open Source KeymanWeb platform keyboards, running in Chrome on Windows 8.1. The first shows a Lao keyboard, with each of the combining characters presented around a base  U+0E81 LAO LETTER KO on the keyboard. As you can see, this is not wonderful because the Ko Kai letter adds noise and makes it hard to pick out the combining characters.

Lao keyboard on keymanweb.com
Lao keyboard on keymanweb.com

The second keyboard is Thai. This relies on the system renderer and does not include a base character for the combining characters. In this case, that means that no base letter is shown and the marks show reasonably clearly. Unfortunately, this is not reliable across platforms, browsers or languages — it happens to work okay for Thai in Chrome on Windows 8.1.  On some other systems, the combining marks are shifted to the far left of the key cap, or even off the key entirely.

Thai keyboard on keymanweb.com
Thai keyboard on keymanweb.com

The third keyboard example is for Gujarati, and again relies on the system renderer and does not include a base character for the combining characters. Now this time you see that the Uniscribe renderer on Windows 8.1 automatically inserts a dotted circle glyph — this may or may not be the same as the U+25CC character. Again, this behaviour differs across platforms.

Gujarati keyboard on keymanweb.com
Gujarati keyboard on keymanweb.com

The Story Today

The story today is pretty unfortunate. Display of isolated combining marks is not implemented at all consistently across the plethora of rendering engines, operating systems, browsers and fonts. Some systems will automatically insert a placeholder glyph if no conforming base character is found prior to the combining character. Others don’t. What do I mean by a conforming base character? Well, that’s up to the implementation of the renderer. Some renderers will treat any base character that is not in the same script as the combining character as part of a separate run, and won’t attempt to combine. Others will combine with some outside-script characters but not others.

Things get even worse when working with multiple combining characters without a base character. Unless this has been specifically handled in the renderer, you will typically see poor layout or rendering of the characters that may not be representative of how they form above normal base characters.

The image below shows examples of the marks U+0EB5 LAO VOWEL SIGN II, U+0EB5 LAO VOWEL SIGN II + U+0EC8 LAO TONE MAI EK and U+0EB3 LAO VOWEL SIGN AM, rendered with and without U+25CC DOTTED CIRCLE, on various systems. A more comprehensive set of examples is included later in this blog.

Sample Lao combining character renders
Sample Lao combining characters rendered

You can see, in the image above, examples of:

  • Repeated dotted circles
  • Diacritics that don’t align over their base character correctly
  • Rendering errors leading to tofu
  • Diacritics that don’t stack
  • Marks that overlap incorrectly

None of the basic text-only solutions work consistently across all systems.

It’s like stuffing a live octopus into a box

Attempting to address the problem is not fun. Fixing the problem on one browser/renderer/OS/font system invariably creates a new problem on another system. It’s like stuffing a live octopus into a box: each time you manage to get one of those pesky tentacles into the box, two others have snuck out the other side.

A Solution

We worked hard to find a consistent cross-browser solution. We started with Lao, but we believe that the same principles can be applied to other scripts. The solution relies on web fonts, but falls back to an acceptable solution in the shrinking subset of systems where web fonts are not supported.

We started by creating a font that is designed for use specifically with the On Screen Keyboard — it would never be used for presentation or editing. While at first glance this meant we could easily solve the problem by placing all the required marks into the Private Use Area, doing so would mean that on systems where the font was not available, the keyboard would be unusable.

Our final solutionhack uses the OpenType GSUB feature to replace the  U+0E81 LAO LETTER KO with a placeholder mark when, and only when, it is used as a base character for a combining mark. The keyboard presentation code stores the Ko Kai letter as a base character for each combining character (or characters), and at render time the font substitutes its placeholder mark. When a key with this pair of characters is clicked, only the combining mark is displayed.

Lao keyboard with substituted base
Lao keyboard with substituted base

Doing this means we can use the Ko Kai letter by itself on the keyboard (on the D key) without a problem (for example on its own key), and in the case where the special font is not available, the user will see an acceptable fallback with the Ko Kai letter as the base character for combining characters.

It’s worth noting that many renderers require that the placeholder character be the same font and style as the combining characters. Today, this sadly precludes using a different colour for the placeholder character.

Examples

I have collected screenshots from a variety of browsers that illustrate some of the inconsistencies and frustrations, as well as the happily correct rendering. You can also visit the sample page to see how well the rendering works in your system.

The screenshots include three combinations:

  • A simple single U+0EB5 LAO VOWEL SIGN II
  • A pair of diacritic marks U+0EB5 LAO VOWEL SIGN II + U+0EC8 LAO TONE MAI EK
  • A third combining character U+0EB3 LAO VOWEL SIGN AM

The sample shows these three combinations with the default font for the system, a Lao OpenType font “Saysettha Web”, and the special font “SaysetthaX Web” which has our hack in it, along with three different base characters: a hard-coded U+25CC DOTTED CIRCLE, a U+00A0 NO-BREAK SPACE, and a U+0E81 LAO LETTER KO.

Initially, the dotted circle with the Saysettha Web font looks hopeful.  But in the end you will see that only one solution works on all browsers tested: the bottom right cell with the custom font and the Lao Letter Ko. The “hollow x” glyph that we use in this font is similar to the placeholder used in Lao language primers. However, any shape can of course be used here if you are creating your own font.

Windows 8.1 – Internet Explorer

Windows 8.1 - Internet Explorer 11

 

Internet Explorer 11 on Windows 8.1 does surprisingly badly here — worse than earlier versions of Windows.  Notice how the default Lao font (DokChampa) does not combine with dotted circle, nor does it combine the pair of diacritics correctly with no-break space.

Windows 8.1 – Chrome

Windows 8.1 - Chrome

 

While Chrome stacks the pair of diacritics correctly in all cases, the positioning for Dok Champa relative to the base character is incorrect for both dotted circle and no-break space. You can also see how the Saysettha Web and SaysetthaX Web fonts position their diacritics to the left of the cell in the case of the no-break space.

Windows 8.1 – Firefox

Windows 8.1 - Firefox

 

Firefox renderers similarly to Chrome. Note that it appears to be using Lao UI instead of DokChampa for the default font.

Windows 7 – Internet Explorer

Windows 7 - Internet Explorer 9

Windows 7 – Chrome

Windows 7 - Chrome

 

(Font smoothing was not enabled on the remote connection when I captured the screenshot)

Windows 7 – Firefox

Windows 7 - Firefox

(Font smoothing was not enabled on the remote connection when I captured the screenshot)

Windows XP – Firefox

Windows XP - Firefox

 

(Font smoothing was not enabled on the remote connection when I captured the screenshot)

Mac OS X – Safari

Mac OS X - Safari

 

Here we see the Saysettha Web solution with dotted circle, which worked consistently in Windows, fails in Mac OS X. Cross-platform support comes back to haunt us. Get back into that box, O tentacle!

Mac OS X – Chrome

Mac OS X - Chrome

Mac OS X – Firefox

Mac OS X - Firefox

It appears that Firefox is having trouble loading the Saysettha Web font, although the SaysetthaX Web font is loading, so this problem is probably resolvable.

iOS 8.1 – Safari

iOS 8.1 - Safari

 

This is at least consistent with the situation in Safari on Mac OS X 10.8!

 

Windows Phone 8.1 – Internet Explorer

Windows Phone 8.1 - Internet Explorer 11

 

This is similar, but not identical to Internet Explorer 9 on Windows 7. That would be too easy!

 

Android 4.3 – Android Browser

Android - Android Browser

 

Clearly, no Lao font was available on the system, so the web fonts were the only ones that worked.

Android 4.3 – Chrome

Android - Chrome

Clearly, no Lao font was available on the system, so the web fonts were the only ones that worked.

Android 4.3 – Firefox

Android - Firefox

Clearly, no Lao font was available on the system. It also appears that Firefox is having trouble loading the Saysettha Web font, although the SaysetthaX Web font is loading, so this problem is probably resolvable.

Conclusion

In summary, the only solution today for on screen keyboards (or character pickers) that works consistently across all browsers is to create a custom font with what can only be classed as a hack, and supply that font for use within those pickers / on screen keyboards.

I certainly agree with Richard that a more formalised solution based on U+00A0 no-break space or U+25CC dotted circle would be fantastic.

My Disordered Rebuttal to “AngularJs Performance in Large Applications”

After reading Abraham Polishchuk’s article AngularJS Performance in Large Applications, I wanted to dig a bit deeper on his suggestions before I took any of them on board. My basic concern was many of his suggestions seemed focused around technology-specific performance micro-optimisations. In today’s browser world, optimising around specific technologies (i.e. browsers and browser versions) is an expensive task with a limited life and usefulness: both improvements in the Javascript engines and differences in browser implementations mean that these fine-grained optimisations should really be limited to very specific performance critical situations.

Conversely, I don’t think we can expect Angular’s performance to improve in the same way as browser engines have. While Angular has undergone significant iterative performance improvements in the 1.x series, it is clear from the massive redesign of version 2.0 that most future application performance gains will only be won by rewriting significant chunks of your own code. Be aware!

Understanding computational complexity is crucial for web developers, even though with Angular it is all Javascript and browser and cloud and magic. You still need that software engineering education.

Onto the article.

1-2 Introduction, Tools of the Trade

No arguments here. Helpful.

3 Software Performance

Ok.

4.1 Loops

This advice, to move function calls out of loops, really depends on what the function you are calling is. Now, clearly, it’s sensible to move calls out of a loop if you can – that’s a normal O(n) to O(1) optimisation. And Objects.keys, as shown in the original article, is very heavy function, so that’d be a good one to avoid.

But don’t take this advice too far. The actual cost of the function call itself is minimal (http://jsperf.com/for-loop-perf-demo-basic/2). In general, inlining code to avoid a function call is not advisable: it results in less maintainable code for a minimal performance gain.

4.2 DOM access

While modifications to the DOM should be approached carefully, the reality is a web application is all about modifying the DOM, whether directly, or indirectly. It cannot be avoided! Applying inline styles (as opposed to using CSS selectors predefined in stylesheets?) vary significantly depending on both the styles being set and the browser. It’s certainly not a hard-and-fast rule, and not really a useful one. In the example, I compare background color (which should not force a reflow), and font-size (which would): (http://jsperf.com/inline-style-vs-class). The results vary dramatically across browsers.

4.3 Variable Scope and Garbage Collection

No arguments here. In fact I would go further and say that this is the single biggest performance concern, both in terms of CPU and resource usage, in Angular. In my limited experience, anyway. It’s very easy to leak closures without realising.

4.4 Arrays and Objects

Yes. Well, somewhat. As Gregory Jacobs commented, there is no significant difference between single-type and multi-type arrays. The differences come in what you are doing with the members of that array, and again browsers vary. http://jsperf.com/object-tostring-vs-number-tostring

I would again urge you to avoid implementation-specific optimisations. Designing an application to minimize the number of properties in an object just because today’s implementation of V8 has a special internal representation of small objects is not good advice. Design your objects according to your application’s requirements! This is a good example of a premature optimization antipattern.

Finally, as Wills Bithrey commented, Array.delete isn’t really slow.

5.1 Scopes and the Digest Cycle

This is a clear description of Angular scopes and the digest cycle. Good stuff. Abraham does a good job of unpacking the ‘magic’ of Angular.

6.1. Large Objects and Server Calls

Ok, so in my app I return full rows of data, and there are a few columns I don’t *currently* use in that data. I measured the performance impact on my application, and the cost of keeping those columns was insignificant: I was actually unable measure any difference. Again, this is a premature optimization antipattern. Your wins here will ultimately not be from excluding columns from your data but from making sure your data is normalized appropriately. This is about good design, rather than performance-constrained design.

Of course, there are plenty of other reasons to minimize your data across the wire. But remember that a custom serializer for each database object is another function to test, and another failure point. Measure the cost and the benefit.

6.2 Watching Functions

Yes. Never is a little too strong for me, but yeah, in general, avoid binding to functions.

6.3 Watching Objects

Yes, in general, deep watching of objects does introduce performance concerns. But I do not agree that this is a terrible idea. The alternative he gives (“relying on services and object references to propagate object changes between scopes”) gets pretty complicated and introduces maintenance and reliability concerns. I would approach this point as potential low hanging fruit for optimisation.

7.1 Long Lists

Yes.

7.2 Filters

As DJ commented, filters don’t work by hiding elements by CSS. Filters, as used in ng-repeat, return a transform of the array, which may be what Abraham meant?

7.3 Updating an ng-repeat

Pretty much. I would add that for custom syncing to work, you still need a unique key, just as you do for track by. If you are using Angular 1.3, there’s no benefit to the custom sync approach. Abraham has detailed the maintenance cost of custom syncing.

8. Rendering Problems

ng-if and ng-show are both useful tools. Choose your weapons wisely.

9.1 Bindings

Yes.

9.2 $digest() and $apply()

Yes, if you must. This is a great way to introduce inconsistencies in your application which can be very hard to debug!

9.3 $watch()

I really disagree with this advice. scope.$watch is a very practical way of separating concerns. I don’t think it’s indicative of bad architecture, quite the opposite in fact! While trying to find the discussions that Abraham refers to (but sadly doesn’t link to), it seemed that most of the alternatives given were in fact in themselves bad architecture. For example, Ben Lesh suggested (although he has since relaxed his opinion) using an ng-change on an input field instead of a watcher on the target variable. Why is that bad? From an architectural point of view, that means each place in your code that you change that variable, you have to remember to make that function call there as well. This is antithetical to the MVC pattern and seriously WET.

9.4 $on, $broadcast, and $emit

Apart from the advice to unbind $on(‘$destroy’) which I don’t think is correct, yes, being aware of the performance impact of events is important in an event-driven system. But that doesn’t mean don’t use them: events are a weapon to use wisely. Seeing a pattern here?

9.5 $destroy

Yes. Apart from the advice to “explicitly call your $on(‘$destroy’)” — I’m not sure what this means.

10.1 Isolate Scope and Transclusion

Only one quibble here: isolate scopes or transclusions can be faster than occupying the parent scope, because a tidy directive can use $apply to just update its own isolated scope, and avoid the digest of the entire scope tree. Choose your tools wisely.

10.2 The compile cycle

Basically, yes.

11 DOM Event Problems

Yes, addEventListener is more efficient, but so is not using Angular in the first place! This results in a significant code increase and again you the benefit of Angular in doing so. Going back to raw DOM is always a trade-off. In the majority of cases, the raw DOM event model is a whole lot of additional work for little gain.

12 Summary

In summary, I really think this post is about micro-optimisations that in many cases will not bring you the performance benefits you are looking for. Abraham’s summary suggests avoiding many of the most useful tools in Angular. If you have to roll your own framework code to avoid ng-repeat, or ng-click, you have already lost.

Have I got the wrong end of the stick?

IE11, Windbg, JavaScript = joy

I was trying to debug a web application today, which is running in an Internet Explorer MSHTML window embedded in a thick client application. Weirdly, the app would fail, silently, without any script errors, on the sixth refresh — pressing F5 six times in a row would crash it each and every time.  Ctrl+F5 or F5, either would do it.

I was going slightly mad trying to trace this, with no obvious solutions. Finally I loaded the process up in Windbg, not expecting much, and found that three (handled) C++ exceptions were thrown for each of the first 6 loads (initial load, + 5 refreshes):

(958.17ac): C++ EH exception - code e06d7363 (first chance)
(958.17ac): C++ EH exception - code e06d7363 (first chance)
(958.17ac): C++ EH exception - code e06d7363 (first chance)

However, on the sixth refresh, this exception was happening four times:

(958.17ac): C++ EH exception - code e06d7363 (first chance)
(958.17ac): C++ EH exception - code e06d7363 (first chance)
(958.17ac): C++ EH exception - code e06d7363 (first chance)
(958.17ac): C++ EH exception - code e06d7363 (first chance)

This gave me something to look at! At the very least there was an observable difference between the refreshes within the debugger. A little more spelunking revealed that the very last exception thrown was the relevant one, and it returned the following trace (snipped):

(958.17ac): C++ EH exception - code e06d7363 (first chance)
ChildEBP RetAddr  
001875e8 75d3359c KERNELBASE!RaiseException+0x58
00187620 72ba1df2 msvcrt!_CxxThrowException+0x48
00187664 72d659cb jscript9!Js::JavascriptExceptionOperators::ThrowExceptionObjectInternal+0xb4
0018767c 72ba1cda jscript9!Js::JavascriptExceptionOperators::ThrowExceptionObject+0x15
001876a8 72bb5175 jscript9!Js::JavascriptExceptionOperators::Throw+0x6e
001876f4 6e9acd03 jscript9!CJavascriptOperations::ThrowException+0x9c
0018771c 6f3029f5 MSHTML!CFastDOM::ThrowDOMError+0x70
00187750 72b68e9d MSHTML!CFastDOM::CWebSocket::DefaultEntryPoint+0xe2
001877b8 72cff6a2 jscript9!Js::JavascriptExternalFunction::ExternalFunctionThunk+0x165
00187810 72bb849c jscript9!Js::JavascriptFunction::CallAsConstructor+0xc7
00187830 72bb8537 jscript9!Js::InterpreterStackFrame::NewScObject_Helper+0x39
0018784c 72bb858c jscript9!Js::InterpreterStackFrame::ProfiledNewScObject_Helper+0x53
0018786c 72bb855e jscript9!Js::InterpreterStackFrame::OP_NewScObject_Impl<Js::OpLayoutCallI_OneByte,1>+0x24
00187ba8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x3c44
00187dbc 132f1ae1 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
WARNING: Frame IP not in any known module. Following frames may be wrong.
00187dc8 72c372dd 0x132f1ae1
00187ed0 138be456 jscript9!Js::JavascriptFunction::EntryApply+0x265
00187f38 72b6669e 0x138be456
00188288 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
001883bc 132f0681 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
001883c8 72b6669e 0x132f0681
00188718 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018887c 132f1b41 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00188888 72b6669e 0x132f1b41
00188bd8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
00188d1c 132f1c21 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00188d28 72bb7642 0x132f1c21
00189080 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x2175
001890b8 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
001893f8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x4e84
00189594 132f0081 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00189624 72d0134c 0x132f0081
00189650 72bf3de5 jscript9!Js::JavascriptOperators::GetProperty_Internal<0>+0x48
00189678 72b677a7 jscript9!Js::InterpreterStackFrame::OP_ProfiledLoopBodyStart<0,1>+0xa2
001899b8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x24f7
00189b34 132f0089 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00189b40 72b6669e 0x132f0089
00189e98 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018a004 132f0099 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018a010 72b6669e 0x132f0099
0018a368 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018a4bc 132f1c69 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018a4c8 72b6669e 0x132f1c69
0018a818 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018a95c 132f1c71 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018a968 72b6669e 0x132f1c71
0018acb8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018ade4 132f1fb1 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018adf0 72c372dd 0x132f1fb1
0018ae90 72b60685 jscript9!Js::JavascriptFunction::EntryApply+0x265
0018aee0 72bd4fcc jscript9!Js::JavascriptFunction::CallFunction<1>+0x88
0018b220 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x442e
0018b258 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
0018b598 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x4e84
0018b71c 132f1ed9 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018b728 72b6669e 0x132f1ed9
0018ba78 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018bbac 132f1ca1 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018bbb8 72bb7642 0x132f1ca1
0018bf00 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x2175
0018bf38 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
0018c278 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x4e84
0018c3ac 132f1e21 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018c3b8 72bb7642 0x132f1e21
0018c700 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x2175
0018c738 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
0018ca78 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x4e84
0018cbb4 132f1e21 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018cbc0 72b6669e 0x132f1e21
0018cf08 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018d03c 132f1e51 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018d048 72b6669e 0x132f1e51
0018d398 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018d4bc 132f0341 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018d4c8 72bb7642 0x132f0341
0018d810 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x2175
0018d848 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
0018db88 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x4e84
0018dd44 132f1f99 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018dd50 72b60685 0x132f1f99
0018dd98 72bd4fcc jscript9!Js::JavascriptFunction::CallFunction<1>+0x88
0018e0d8 72bd501f jscript9!Js::InterpreterStackFrame::Process+0x442e
0018e110 72bd507e jscript9!Js::InterpreterStackFrame::OP_TryCatch+0x49
0018e450 72c5bfce jscript9!Js::InterpreterStackFrame::Process+0x4e84
0018e488 72c5bf0f jscript9!Js::InterpreterStackFrame::OP_TryFinally+0xb1
0018e7c8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x68eb
0018e8f4 132f0351 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018e900 72b6669e 0x132f0351
0018ec48 72b66548 jscript9!Js::InterpreterStackFrame::Process+0xbd7
0018ed94 132f1cf9 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018eda0 72b66ce9 0x132f1cf9
0018f0f8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x1cd7
0018f264 132f1d01 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018f270 72b66ce9 0x132f1d01
0018f5c8 72b66548 jscript9!Js::InterpreterStackFrame::Process+0x1cd7
0018f70c 132f1d49 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
0018f718 72b60685 0x132f1d49
0018f760 72b6100e jscript9!Js::JavascriptFunction::CallFunction<1>+0x88
0018f7cc 72b60f60 jscript9!Js::JavascriptFunction::CallRootFunction+0x93
0018f814 72b60ee7 jscript9!ScriptSite::CallRootFunction+0x42
0018f83c 72b6993c jscript9!ScriptSite::Execute+0x6c
0018f898 72b69878 jscript9!ScriptEngineBase::ExecuteInternal<0>+0xbb
0018f8b0 6eaab78f jscript9!ScriptEngineBase::Execute+0x1c
0018f964 6eaab67c MSHTML!CListenerDispatch::InvokeVar+0x102
0018f990 6eaab21e MSHTML!CListenerDispatch::Invoke+0x61
0018fa28 6eaab385 MSHTML!CEventMgr::_InvokeListeners+0x1a2
0018fb98 6e8a98bb MSHTML!CEventMgr::Dispatch+0x35a
0018fbc0 6e92d0c0 MSHTML!CEventMgr::DispatchEvent+0x8c
0018fd18 6e92da30 MSHTML!CXMLHttpRequest::Fire_onreadystatechange+0x8b
0018fd2c 6e9343ea MSHTML!CXMLHttpRequest::DeferredFire_onreadystatechange+0x52
0018fd5c 6e8a9472 MSHTML!CXMLHttpRequest::DeferredFire_progressEvent+0x10
0018fdac 773f62fa MSHTML!GlobalWndProc+0x1cd
0018fdd8 773f6d3a USER32!InternalCallWinProc+0x23
0018fe50 773f77c4 USER32!UserCallWinProcCheckWow+0x109
0018feb0 773f788a USER32!DispatchMessageWorker+0x3bc

So now I knew that somewhere deep in my Javascript code there was something happening that was bad. Okay… I guess that helps, maybe?

But then, after some more googling, I discovered the following blog post, published a mere 9 days ago: Finally… JavaScript source line info in a dump. Magic! (In fact, I’m just writing this blog post for my future self, so I remember where to find this info in the future.)

After following the specific incantations outlined within that post, I was able to get the exact line of Javascript that was causing my weird error. All of a sudden, things made sense.

0:000> k
ChildEBP RetAddr  
00186520 76bd22b4 KERNELBASE!RaiseException+0x48
00186558 5900775d msvcrt!_CxxThrowException+0x59
00186588 590076a3 jscript9!Js::JavascriptExceptionOperators::ThrowExceptionObjectInternal+0xb7
001865b8 5901503d jscript9!Js::JavascriptExceptionOperators::Throw+0x77
00186604 6365e79b jscript9!CJavascriptOperations::ThrowException+0x9c
0018662c 63e609aa mshtml!CFastDOM::ThrowDOMError+0x70
00186660 58f1f9a3 mshtml!CFastDOM::CWebSocket::DefaultEntryPoint+0xe2
001866c8 58f29994 jscript9!Js::JavascriptExternalFunction::ExternalFunctionThunk+0x165
00186724 58f2988c jscript9!Js::JavascriptFunction::CallAsConstructor+0xc7
00186744 58f29a4a jscript9!Js::InterpreterStackFrame::NewScObject_Helper+0x39
00186760 58f29a7a jscript9!Js::InterpreterStackFrame::ProfiledNewScObject_Helper+0x53
00186780 58f29aa9 jscript9!Js::InterpreterStackFrame::OP_NewScObject_Impl<Js::OpLayoutCallI_OneByte,1>+0x24
00186b58 58f26510 jscript9!Js::InterpreterStackFrame::Process+0x402b
00186d6c 14ea1ae1 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
WARNING: Frame IP not in any known module. Following frames may be wrong.
00186d78 58faa407 js!Anonymous function [http://marcdev2010:4131/app/main/main-controller.js @ 251,7]
00186e70 15511455 jscript9!Js::JavascriptFunction::EntryApply+0x267
00186ed8 58f268e6 js!d [http://marcdev2010:4131/lib/angular/angular.min.js @ 34,479]
001872c8 58f26510 jscript9!Js::InterpreterStackFrame::Process+0x899
001873f4 14ea0681 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00187400 58f268e6 js!instantiate [http://marcdev2010:4131/lib/angular/angular.min.js @ 35,101]
001877e8 58f26510 jscript9!Js::InterpreterStackFrame::Process+0x899
00187944 14ea1b41 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
00187950 58f268e6 js!Anonymous function [http://marcdev2010:4131/lib/angular/angular.min.js @ 67,280]
00187d38 58f26510 jscript9!Js::InterpreterStackFrame::Process+0x899
00187e74 14ea1c21 jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x1e8
...

When I looked at the line in question, line 251 of main-controller.js, I found:

// TODO: We need to handle the web socket connection going down -- it really should be wrappered
//  try {
      $scope.watcher = new WebSocket("ws://"+location.host+"/");
//  } catch(e) {
//    $scope.watcher = null;
//  }
    
    $scope.$on('$destroy', function() {
      if($scope.watcher) {
        $scope.watcher.close();
        $scope.watcher = null;
      }
    });

It turns out that if I press F5, or call browser.navigate from the container app, the $destroy event was never called, and what’s worse, MSHTML wasn’t shutting down the web socket, either.  And by default, Internet Explorer has a maximum of six concurrent WebSocket connections. So six page loads and we’re done. The connection error is not triggering a script error in the MSHTML host, sadly, which is why it was so hard to track down.

Yeah, if I had already written the error handling for the WebSocket connection in the JavaScript code, this issue wouldn’t have been so hard to trace. But at least I’ve learned a useful new debugging technique!

Speculation and further investigation:

  • I’m still working on why the web socket is staying open between page loads when embedded.  It’s not straightforward, and while onbeforeunload is a workaround that, well, works, I hate workarounds when I don’t have a clear answer as to why they are needed.
  • It is possible that there is a circular reference leak or similar, but I did kinda think MS had sorted most of those with recent releases of IE.
  • This problem is only reproducible within the MSHTML embedded browser, and not within Internet Explorer itself. Changing $scope.watcher to window.watcher made no difference.
  • TWebBrowser (Delphi) and TEmbeddedWB (Delphi) both exhibited the same symptoms.
  • A far simpler document with a web socket call did not cause the problem.

Updated a few minutes later:

Yes, I have found the cause. It’s a classic Internet Explorer leak: a function within a closure that references the DOM. Simplified, it looks something like:

  window.addEventListener( "load", function(event) {
    var aa = document.getElementById('aa');
    var watcher = new WebSocket("ws://"+location.host+"/");
    watcher.onmessage = function() {
      aa.innerHTML = 'message received';
    }
    aa.innerHTML = 'WebSocket started';
  }, false);

This will persist the connection between  page loads, and quickly eat up your available WebSocket connections.  But it still only happens in the embedded web browser. Why that is, I am still exploring.

Updated 30 Nov 2015:

The reason this happens only in the embedded web browser is that by default, the embedded web browser runs in an emulation mode for IE7. Even with the X-UA-Compatible meta tag, you still need to add your executable to the FEATURE_BROWSER_EMULATION registry key.

Extending $resource in AngularJS

I’ve recently dived into the brave new world (for me) of AngularJS, for a development project for a client. I always enjoy learning new tools and frameworks, especially when there are good design principles and practices that I can apply to both the new project and filter back into existing code.

In this project, we have an existing backend that is delivering data through a RESTful JSON interface. And this is what $resource was designed for. The front end is a HTML document embedded in an existing thick-client application window. Yes, this is the real world.

The data returned by $resource can be either a single item, or an array of items — a collection. $resource automatically wraps each item in the array with the “class” of the single item, which is nice. This makes it trivial to extend items with helper functions, such as, in my case, a time conversion function for a specific field in the JSON data (pseudocode):

angular.module('appServices').factory('Widget', ['$resource',
  function($resource) {
    var Widget= $resource('/data/widgets/:widgetId.json', {}, {
      query: {method:'GET', params:{widgetId:''}, isArray:true}
    });

    Widget.prototype.createTimeInMinutes = function() {
      var m = moment(this.createDateTime);
      return m.hours()*60 + m.minutes();
    };
    
...

However, finding a way to extend the collection was also of interest to me. For example, to add an itemById function which would return a single item from the array identified by a unique identifier field. This is of course me applying my existing object-oriented brain to a Angular (FWIW, this post was the best intro to Angular that I have found, even though it’s about coming from jQuery and not from an OO world).

It seemed nice to me to be able to write something like collection.itemById(), or item.createTimeInMinutes(), associating these functions with the data that they manipulate. Object orientation doing what it does best.  While I was aware of advice around the dangers of extending built-in object prototypes — monkey-patching, I really wasn’t sure that the same concerns applied to extending an ‘instance’ of Array.

There were several answers on Stack Overflow that related to this, in some way, and helped me think through the problem. I (and others) came up with several possible solutions, none of which were completely beautiful to me:

  1. Extend the array returned from $resource.  This is actually hard to do, but in theory possible with transformResponse. Unfortunately, because AngularJS does not preserve extensions to Array objects, you lose those extensions very easily. I won’t add the code here because it is ultimately unhelpful.
  2. Wrap the array in a helper object, when loading in the controller:
    Resource.query().$promise.then(function(collection) {
      $scope.collection = new CollectionWrapper(collection);
    });

    This worked, again, but added a layer of muck to every collection which was unpalatable to me, and pushed implementation into the controller, which just felt like the wrong plce.

  3. Add a helper object:
    var CollectionHelper = {
      itemById = function(collection, id) {
        ...
      }
    };
    
    ...
    
    var item = CollectionHelper.itemById(collection, id);

    Again, this didn’t feel clean, or quite right, although it worked well enough.

  4. Finally, James suggested using a filter.
    angular.module('myapp').filter('byId', function() {
        return function(collection, id) {
          ...
        }
      });
    
    ...
    
    var item = $filter('byId')(collection, id);
    // or you can go directly if injected:
    var item = byIdFilter(collection,id);
    // and within the template you can use:
    {{collection | byId:id }}
    

This last is certainly the most Angular way of doing it.  I’m still not 100% satisfied, because filters have global scope, which means that we need to give them ugly names like collectionDoWonk, instead of just doWonk.

Is this the best way to skin this cat?