An update for EncodeURIComponent

August 4, 2015Computing, Delphi, Development, UnicodeMarc Durdin

Way way back in the dark ages of Delphi XE2, I wrote a function to encode components of a URI. Now, this function has been updated for use on mobile platforms, by Nicolas Dusart, and I quote Nicolas:

I had to make some modifications on it to compile for the mobile platforms, as the strings are 0-based on these platforms.

I also modified it to escape non-ASCII characters using their UTF-8 encoding as the standards advices. For multi-bytes characters, each byte is percent-encoded as usual.

Here’s the code, maybe it could interests you and the future readers of that article 🙂

And here’s Nicolas’s updated function in all its glory:

function EncodeURIComponent(const ASrc: string): string;
const
  HexMap: string = '0123456789ABCDEF';

  function IsSafeChar(ch: Byte): Boolean;
  begin
    if (ch >= 48) and (ch <= 57) then Result := True    // 0-9
    else if (ch >= 65) and (ch <= 90) then Result := True  // A-Z
    else if (ch >= 97) and (ch <= 122) then Result := True  // a-z
    else if (ch = 33) then Result := True // !
    else if (ch >= 39) and (ch <= 42) then Result := True // '()*
    else if (ch >= 45) and (ch <= 46) then Result := True // -.
    else if (ch = 95) then Result := True // _
    else if (ch = 126) then Result := True // ~
    else Result := False;
  end;

var
  I, J: Integer;
  Bytes: TBytes;
begin
  Result := '';
    
  Bytes := TEncoding.UTF8.GetBytes(ASrc);
    
  I := 0;
  J := Low(Result);

  SetLength(Result, Length(Bytes) * 3); // space to %xx encode every byte

  while I < Length(Bytes) do
  begin
    if IsSafeChar(Bytes[I]) then
    begin
      Result[J] := Char(Bytes[I]);
      Inc(J);
    end
    else
    begin
      Result[J] := '%';
      Result[J+1] := HexMap[(Bytes[I] shr 4) + Low(ASrc)];
      Result[J+2] := HexMap[(Bytes[I] and 15) + Low(ASrc)];
      Inc(J,3);
    end;
    Inc(I);
  end;
    
  SetLength(Result, J-Low(ASrc));
end;

Many thanks, Nicolas 🙂

Fixing the incorrect client size for Delphi VCL Forms that use styles

July 24, 2015Bugs, Computing, Delphi, Development, WindowsMarc Durdin

Delphi XE2 and later versions have a robust theming system that has a frustrating flaw: the client width and height are not reliably preserved when the theme changes the border widths for dialog boxes.

For forms that are sizeable this is not typically a problem, but for dialogs laid out statically this can look really ugly, as shown in this Stack Overflow question.

The problem in pictures

Here’s a little form, shown in the Delphi form designer. I’ve placed 4 buttons right in the corners of the form. I’m going to populate the Memo with notes on the form size at runtime.

Design time form with four buttons at corners

When I have no custom style set to the project (i.e. “Windows” style), I can run on a variety of platforms and see the buttons are where they should be. Shown here on Windows 10, Windows 7 and Windows XP (just because):
Windows theme form on Windows 10

Windows theme form on Windows XP

But when I apply a custom style to the project — I chose “Glossy” — then my dialog appears like so, instead:

Glow theme form on Windows 7

You’ll note that the vertical is adjusted but the horizontal is not: Button2 and Button4 are now chopped off on the right. Because we are using themes, the form looks identical on all platforms.

This problem has not been addressed as of Delphi XE8.

The workaround

For my needs, I found a workaround using a class helper, which can be applied to the forms which need to maintain their design-time ClientWidth and ClientHeight. This is typically the case for dialog boxes.

This workaround should be used with care as it has been designed to address a single issue and may have side effects.

It will trigger resize events at load time
Setting AutoScroll = true means that ClientWidth and ClientHeight are not stored in the form .dfm, and so this does not work.
It may not address other layout issues such as scaled elements scaling wrongly (I haven’t tested this).

type
  TFormHelper = class helper for Vcl.Forms.TCustomForm
  private
    procedure RestoreDesignClientSize;
  end;

procedure TfrmTestSize.FormCreate(Sender: TObject);
begin
  RestoreDesignClientSize;
end;

{ TFormHelper }

procedure TFormHelper.RestoreDesignClientSize;
begin
  if BorderStyle in [bsSingle, bsDialog] then
  begin
    if Self.FClientWidth > 0 then ClientWidth := Self.FClientWidth;
    if Self.FClientHeight > 0 then ClientHeight := Self.FClientHeight;
  end;
end;

After adding in this little snippet, the form is now restored to its design-time size, like thus:

Fixed glow theme form on Windows 7

Success 🙂

Concatenating strings in SQL Server, or undefined behaviour by design

July 15, 2015Bugs, Computing, Development, SQL, Suggestions, UncategorizedMarc Durdin

We just ran into a funny problem here, using a “tried and true” technique in SQL Server to concatenate strings. I use the quotes advisedly. This technique is often suggested on blogs and sites such as Stack Overflow, but we found out (by painful experience) that it is not to be relied on.

Update, 9 Mar 2016: Bruce Gordon from Webucator has turned this into a great little 5 minute video. Thanks Bruce! I don’t know anything much about Webucator, but they are doing some good stuff with creating well-attributed videos about blog posts such as this one and apparently they do SQL Server training.

The problem

So, given the following setup:

CREATE TABLE BadConcat (
  BadConcatID INT NOT NULL,
  Description NVARCHAR(100) NOT NULL,
  SortIndex INT NOT NULL
  CONSTRAINT PK_BadConcat PRIMARY KEY CLUSTERED (BadConcatID)
)
GO

INSERT BadConcat 
  SELECT 1, 'First Item', 1 union all
  SELECT 2, 'Second Item', 2 union all
  SELECT 3, 'Third Item', 3
GO

We need to concatenate those Descriptions. I have avoided fine tuning such as dropping the final comma or handling NULLs for the purpose of this example. This example shows one of the most commonly given answers to the problem:

DECLARE @Summary NVARCHAR(100) = ''

SELECT @Summary = @Summary + ec.Description + ', '
FROM BadConcat ec
ORDER BY ec.SortIndex 

PRINT @Summary

And we get the following:

First Item, Second Item, Third Item,

And that works fine. However, if we want to include a WHERE clause, even if that clause still selects everything, then we suddenly get something weird:

SET @Summary = ''

SELECT @Summary = @Summary + ec.Description + ', '
FROM BadConcat ec
WHERE ec.BadConcatID in (1,2,3)
ORDER BY ec.SortIndex 

PRINT @Summary

Now we get the following:

Third Item,

What? What has SQL Server done? What’s happened to the first two items?

You’ll probably do what we did, which is to go through and make sure that you are selecting everything properly, which we are, and eventually come to the conclusion that “there must be a bug in SQL Server”.

The answer

It turns out that this iterative concatenation is unsupported functionality. Microsoft Knowledge Base article 287515 states:

You may encounter unexpected results when you apply any operators or expressions to the ORDER BY clause of aggregate concatenation queries.

Now, at first glance that does not directly apply. But we can extrapolate from that, as Microsoft developer support have done, in response to a bug report on SQL Server, to learn that:

The variable assignment with SELECT statement is a proprietary syntax (T-SQL only) where the behavior is undefined or plan dependent if multiple rows are produced

And again, in response to another bug report:

Using assignment operations (concatenation in this example) in queries with ORDER BY clause has undefined behavior. This can change from release to release or even within a particular server version due to changes in the query plan. You cannot rely on this behavior even if there are workarounds.

Some alternative solutions are given, also, in that second report:

The ONLY guaranteed mechanism are the following:

1. Use cursor to loop through the rows in specific order and concatenate the values
2. Use for xml query with ORDER BY to generate the concatenated values
3. Use CLR aggregate (this will not work with ORDER BY clause)

And the article “Concatenating Row Values in Transact-SQL” by Anith Sen goes through some of those solutions in detail. Sadly, none of them are as clean or as easy to understand as that original example.

Another example is given on Stack Overflow, which details how to safely use XML PATH to concatenate, without breaking on the XML special characters &, < and >. Applying that example into my problem code given above, we should use the following:

SELECT @Summary = (
  SELECT ec.Description + ', ' 
  FROM BadConcat ec 
  WHERE ec.BadConcatID in (1,2,3)
  ORDER BY ec.SortIndex 
  FOR XML PATH(''), TYPE
).value('.','varchar(max)')

PRINT @Summary

Voilà.

First Item, Second Item, Third Item,

The case of the UAC that Just Wouldn’t

July 1, 2015Bugs, Computing, Tools, WindowsMarc Durdin

One of my dev machines has long had a weird anomaly where file operations in Explorer that should prompt for UAC, such as copying a file into C:\Program Files, would instead silently fail.

This led to all sorts of issues, from being unable to delete certain files — they’d just obstinately sit there, no matter how much I pressed that Del key — to trying to move folders containing a hidden Thumbs.db file and being unable to move the folder.

My UAC settings were the Windows defaults. Nothing weird these. So I’d always treated put this issue into my “too busy to solve this now” basket. The classic basket case. But today I finally got fed up.

After a quick search for the symptoms on Dr Google returned no results of significance, I decided I needed to trace the cause myself.

Process Monitor to the Rescue

It was time to pull out Process Monitor out of my toolbox again! Process Monitor is a tool from the SysInternals Suite by Microsoft that monitors and logs details on a bunch of different operations on your computer. I use Process Monitor, Procmon for short, all the time to solve problems big and little. But for some reason, it hadn’t crossed my mind until today that I could apply Procmon to this problem.

First, I configured Procmon to filter all events except for those generated by Explorer.exe and Consent.exe. I wasn’t sure if Consent.exe was involved in the problem (Consent.exe being the UAC elevation prompter), but it wouldn’t hurt to include it to start with. Note that all those Exclude filters are default filters setup by Procmon to exclude itself and its friends, removing that confusion from the logs.

Then I went ahead and tried to copy a file into C:\Program Files (x86). It was just an innocent little text file, but Explorer of course acted like a Buckingham Palace Guard and silently and stolidly ignored its existence.

➔

I used the clipboard Ctrl+C and Ctrl+V to copy and paste (or attempt to paste) the file. I didn’t think the clipboard was at fault because all other UAC-required file operations also failed silently. I could have dragged and dropped, it would have had the same effect.

But now, with procmon, I had captured the communication that went on behind the scenes. All those secret coded winks and nose scratches that told Explorer to fob off any attempts to trigger a UAC prompt. Here’s what I was presented with in the Procmon log.

I searched for the name of my text file (test.txt), and used Procmon’s Highlight tool to highlight every reference to it in the Path column. This made it easy to spot nearby interactions that may have been related, even if they didn’t directly reference the test.txt file itself. You can see below two of the highlighted test.txt lines.

Because there was a lot going on, I filtered out a lot of Operations that I thought were not relevant, such as CloseFile, RegCloseKey, RegQueryKey, ReadFile and WriteFile, among others. This reduced the log considerably and made it easier to spot differences (my screen capture below shows the filtering after it was reset, however — I forgot to capture the filtered trace, sorry).

I decided to also capture a trace on a machine where UAC prompting worked. I then compared the two logs. After scrolling back and forth around the many references to test.txt, I saw that on my dev machine, there was an additional interaction, right before the point where the prompt dialog was presented:

That’s right, I had a program called TortoiseCVS installed on this machine which hooked into Explorer in a variety of ways. After the FileOperationPrompt references on my second machine, there was no reference to TortoiseCVS. Here’s what it looked like on the other machine:

That was the only visible difference of significance in the logs.

Now for those of you who just knee-jerked into “why on earth are you using CVS?!?”, calm down! This is a story, and I’m telling the story.

I decided that I didn’t really need TortoiseCVS installed and decided to try uninstalling it.

Sadly, uninstalling it required a reboot, no doubt to remove its old fashioned hooks into Explorer.

After the reboot, I tried to copy my innocent little text file again.

Success! I was now presented with the prompt I wanted!

Another case closed thanks to SysInternals Suite and Mark Russinovich!

A deep dive into Periscope – and how to save the stream from the website

June 25, 2015Computing, ToolsMarc Durdin

So, yesterday I gave a presentation (the title isn’t important, but it was called “Internationalization Done Wrong”) at WebDev42°, our local web developer meetup (42°S is the latitude of Tasmania). Before the presentation, I tweeted out to see if anyone was interested in doing a Periscope of the talk.

Anyone want to experiment with periscoping my talk tonight? @webdev42

— Marc Durdin (@MarcDurdin) June 23, 2015

What is Periscope? It’s Twitter’s live video streaming platform, which makes it easy for anyone to live stream events, walks on the beach, or pretty much anything.

After the usual Twitteral conversation, Casey happily agreed to take on the job of cameraman.

https://twitter.com/kcidau/status/613510102415376384

And we were set to go. I told my friends on Twitter that I would be presenting at 6pm and would be experimenting with Periscope. A couple of them decided to try and watch.

Here’s what we learned.

The basic experience is really smooth

The premise of Periscope is anyone can host a live stream. That’s definitely true: setup was trivial and getting started was a matter of pressing the Broadcast button. And following the stream on other devices also worked pretty well. At least for mobile devices; not so much for desktops…

People will watch immediately

All sorts of strange people started watching, making helpful comments like “that guy has no hair” and “how come so many of you have beards?” and weird questions about Tasmania.

That strangers came to watch frankly surprised me, because we were a bunch of computer geeks talking about computer geekery, and that’s not really that interesting.

Keep your periscopes focused

I tweeted a link to the Periscope and a couple of my friends jumped on to watch almost immediately. This was unfortunate, because they ended up being put through 20 minutes of talk about beer and hair before the presentation started.

So start a periscope for general chat before the main event if you like, but start a new one for the presentation.

Turn off notifications and all the jazz on your phone

Notifications are seriously loud for the poor viewers of the video!

Talk loudly

I apparently talked too quietly for some of the periscopees (aka viewers).

Take your finger off the microphone

Yep. That’s an effective way to mute the stream. Speaking from experience here. (Suggested by @johndalton).

Name your periscopes

It wasn’t clear to people jumping into the stream what the event was all about. The name should probably have been “Web Dev 42 South Meetup – General Chat” for the first three periscopes (more about that in a moment). Then a new periscope should have been started for each presentation given.

As it turned out, a new periscope was started for each presentation, kinda, but not on purpose (learn why soon). And I only ever shared the link to the first periscope, so my friends were left trying to find the new streams (some of which ended up coming from different users as well).

And locate your periscopes

Updated 10:50pm: As Masni wrote on Twitter, turn on location so people can find your streams.

The Android client is kinda buggy

Yeah, that’s why we had three periscopes even before the event officially started. It’s because the Android client kept crashing. We finally switched to an iPad, but not before losing the last few minutes of my talk. (Not a big loss, to be sure!)

The iPad was much more stable.

The cameraperson probably needs to be an extrovert

So Casey is really not an introvert, which is good, because I would be too shy to run around filming people. He did a good job of that, even through all the chaos.

Share your periscopes

I got kinda busy after tweeting out the first link, so I didn’t really realise that the app had crashed and that everyone was having to find a new stream to follow. It definitely made things more complicated. If you are going to be doing a presentation, try and get someone else to look after tweeting out live things like that on your behalf. They probably won’t steal your phone or tweet anything too embarrassing.

Quality is variable

Because it’s a live stream, little network hiccups do sometimes happen. Audio disappeared for something like 30 seconds on one stream (no idea why). The video quality is not awesome but it’s certainly watchable. It’s like Youtube 2003 (was Youtube around in 2003?)

So all those little things aside, the general experience was still pretty cool. Definitely an easy way to share a video!

After the event

After complaints from my friends and rude comments from the audience, I realised I wanted to review my presentation efforts (i.e. the video) online. This ability to watch a saved stream later was one of the key reasons I chose to try Periscope over Meerkat (a very slightly older competitor).

How did that go?

The post-event experience is pretty minimal

Even though streams are saved so you can watch them asynchronously, the experience is pretty minimal. There is precisely one control: the play/pause button. That’s right, no fast-forward, no rewind, no skip, no ability to move to a specific point in the stream.

I was faced with watching 20 minutes of discussion about beer and hair just so I could review my presentation!

The website is a bit buggy

I frequently had trouble starting streams, and more than once a stream would fall over after a few minutes – and the only way to resolve this is to start again with a refresh (ouch!)

Looking at the situation with the Web Developer console, it appears that stream data requests were sometimes being denied with 403 errors. I didn’t dig into why, but that’s a bit of a deal-breaker.

You can only watch for 24 hours

After 24 hours, the public link is gone.

Now, you can save the stream onto your device, if you (a) remember to do that, and (b) have enough storage left. You can choose to have all streams saved by default, which does mean that point (b) would quickly become a truism!

Saving a Periscope Stream from the website

I still hadn’t managed to get further than 4 minutes into my presentation without the site throwing an error. So, after refreshing, and realising I’d have to sit through three minutes of pre-talk setup yet again, because no fast forward, remember (this was like the 4th periscope start after the Android crashes, so I could at least skip the first 17 minutes), I gave up on watching online. Instead, I decided to try and find a way to save the stream.

And I found my way through it! 🙂 And if I can do it, I’m sure you can.

Caveats apply here. You may not have permission to copy a stream because it is copyright and all that. Be good. They may change the website back-end and you’ll have to adapt with the changes. Remember this will only work in the first 24 hours after the event.

So here’s how you do it. I haven’t automated this (much) because that’s for someone else to do, later.

Visit the Periscope URL with Web Developer open.

I used Chrome but you can use pretty much any browser. We want to capture the network traffic to find the access token for the stream, so we can grab it with a tool.

Once the page loads, look for the getAccessPublic request, as shown below.

We’re going to want two different things out of that.

The replay_url, shown most easily in the Preview pane. Copy it to the clipboard and paste it into a document for later.

And the cookies. These are easier to copy from the Headers pane (Sorry, this is a later screenshot, but the principle still applies.).

You’ll need to copy the value of each Set-Cookie header, up until the first semicolon (;). Don’t include the “Set-Cookie” text itself. Paste these into a text document, separating the strings with semi-colons. Don’t add line breaks.

When you are done, you should have something like this (no line breaks, just automatic wordwrapping showing here):

The replay url will point to a path on a web server that contains a m3u8 format playlist file, and a set of MPEG-2 TS video files, which represent your video stream broken down into chunks.

Create a batch file to download the video

Now, we want to pass those variables you collected into a batch file, for simplicity. I used the tool wget to download the files from the command line.

Here’s the batch file code.

@echo off

set cookie="Cookie: <your-cookie-text-here>"

set url=<your-replay-url-here>

wget --no-cookies --header %cookie% --no-check-certificate %url%/playlist.m3u8

findstr "chunk" playlist.m3u8 > downloadlist.txt

for /f %%i in (downloadlist.txt) do wget --no-cookies --header %cookie% --no-check-certificate %url%/%%i

Replace the <your-cookie-text-here> and <your-replay-url-here> placeholders with your variables collected earlier. This script will download the playlist.m3u8 file, using the correct access permissions, and parse out the chunks of your video from that (pretty simple) file format into a download list. Then, it goes through the download list and downloads each chunk. Pretty straightforward.

(Why –no-check-certificate? Because the default root certificate list that comes with wget is out of date!)

Save and run the batch file.

It may take a little while to run, but after a bit, you’ll have collected all the different bits of your video for posterity onto your machine. The whole download took about 15 minutes on my 100mbit NBN link, which suggests that Periscope may be limiting the bandwidth of each user. No biggie, I went to lunch anyway.

By the time you get back from lunch, the video will be on your computer in hundreds of chunks, each between roughly 100KB and 300KB.

This is no fun. Some video players will load the playlist file and work through the chunks (e.g. VLC), but most of them stutter between the different chunks, which is pretty unwatchable. And it’s a pain to manage.

Combine the chunks into an mp4 video

So I wanted to combine those chunks into a single file for a smooth video experience. I used ffmpeg, which is the most powerful way to do conversions of weird and wonderful video file formats.

To convert these files, this is the command I ended up using (YMMV with that audio conversion parameter, which I didn’t really dig into):

ffmpeg.exe -i playlist.m3u8 -bsf:a aac_adtstoasc -vcodec copy -c copy -crf 50 test.mp4

And here’s what I saw:

The good news is that test.mp4 was generated in 2 seconds flat, and plays without stutters!

Fix the audio sync

But I found that the audio was almost exactly 2 seconds ahead of the video, which made things seem pretty weird. ffmpeg to the rescue again!

ffmpeg.exe -i test.mp4 -itsoffset 2 -i test.mp4 -map 0:0 -map 1:1 -acodec copy -vcodec copy test2.mp4

This command breaks apart the audio and video streams, then takes the audio stream and pauses it for two seconds, before recombining it with the video stream. This blog post explains the details of this trick.

Trim the video

Finally, I wanted to save my viewers the agony of watching the camera setup and preparation for the presentation. It was amusing at the event but not so much when sitting at a computer waiting for the real thing to start! So again, ffmpeg made this easy:

ffmpeg.exe -ss 00:02:45 -i test2.mp4 -acodec copy -vcodec copy final.mp4

And now we have a final.mp4 video, which works beautifully.

Shame about the guy in the video.

If you do actually want to watch my presentation, you can see all but the last two minutes on Youtube!

http://youtu.be/77pMT3Ogw7g

Doing a Google aka closing down/handing over some stuff

May 13, 2015Computing, Cycling, Strava, UncategorizedMarc Durdin

As I am taking on some new challenges at work, I realised I needed to clear the decks a bit with my side projects. So with some sadness, I’ve decided to close down or hand over the following cycling-related projects and activities to other people.

mesmeride.com

Mesmeride was a holiday project I put together to teach myself Ruby on Rails. Mesmeride renders the elevation profile from a Strava ride in a variety of ways, imitating the Giro d’Italia profiles among others. I did want to put some more time into this, having ideas for maps and more flexible profiles, but it isn’t going to happen.

The Mesmeride website was at http://mesmeride.com/, on Twitter @mesmeride, and Facebook at https://facebook.com/mesmeride.

The site will shut down in a couple of weeks — time to get your graphics off or to tell me you want to take over running it (currently costing $9/month for the Heroku instance).

Source code for the site will remain at https://github.com/mcdurdin/mesmeride

The Hobart 10,000

The Hobart 10,000 is an annual ride over 3 days where a bunch of keen cyclists tackle 10,000 metres of climbing on some of Hobart’s iconic hills and mountains. Barry Jones and Mark Breen (@clunkersrule) are already doing a great job of running this event — I’m just handing over the digital reins.

The Hobart 10,000 can be followed on Facebook at https://facebook.com/hobart10000, on Twitter at @Hobart10000, and on the Strava club. The website http://hobart10000.com/ will be closing down.

@TassieCup

The TassieCup Twitter account reports on the progress of Tasmanian cyclists in the major pro races. I’ve handed control of the account over to the inimitable Daniel Wood (@danielwood1).

A workaround for a record.property getter bug in the Delphi XE2 compiler

April 1, 2015Bugs, Computing, Delphi, Development, WindowsMarc Durdin

We recently ran into a nasty little bug in the Delphi XE2 compiler, that arises with a complex set of conditions:

A record with a string property that has a get function;
A call to this getter that has a constant string appended to the result;
This result passed directly to another arbitrary function.

When these conditions are met (see code below for examples), the compiler generates code that crashes.

Reproducing the bug

The following program is enough to reproduce the bug.

program ustrcatbug;

type
  TWrappedString = record
  private
    FValue: string;
    function GetValue: string;
  public
    property Value: string read GetValue;
  end;

{ TWrappedString }

function TWrappedString.GetValue: string;
begin
  Result := FValue;
end;

function GetWrappedString: TWrappedString;
begin
  Result.FValue := 'Something';
end;

begin
  writeln(GetWrappedString.Value + ' Else');
end.

Looking at the last line of code from that sample in the disassembler, we see the following code:

ustrcatbug.dpr.28: writeln(GetWrappedString.Value + ' Else');
0040712A 8D45EC           lea eax,[ebp-$14]
0040712D E816E9FFFF       call GetWrappedString
00407132 8D55EC           lea edx,[ebp-$14]
00407135 B8C0BB4000       mov eax,$0040bbc0
0040713A 8B0DE8594000     mov ecx,[$004059e8]
00407140 E883D8FFFF       call @CopyRecord
00407145 8D55E8           lea edx,[ebp-$18]
00407148 B8C0BB4000       mov eax,$0040bbc0
0040714D E8D6E8FFFF       call TWrappedString.GetValue
00407152 8D45E8           lea eax,[ebp-$18]
00407155 BAB4714000       mov edx,$004071b4
0040715A E899D6FFFF       call @UStrCat
0040715F 8B10             mov edx,[eax]
00407161 A12C884000       mov eax,[$0040882c]
00407166 E86DC6FFFF       call @Write0UString
0040716B E868C7FFFF       call @WriteLn
00407170 E867BCFFFF       call @_IOTest
ustrcatbug.dpr.29: end.
00407175 33C0             xor eax,eax

The problem arises with these two lines:

0040715A E899D6FFFF       call @UStrCat
0040715F 8B10             mov edx,[eax]

The problem is that eax is not preserved through function calls with Delphi’s default register calling convention. And, as _UStrCat is a procedure, we can make no assumptions about the return value (which is passed in eax):

procedure _UStrCat(var Dest: UnicodeString; const Source: UnicodeString);

The issue does not arise if we avoid using the property:

  writeln(GetWrappedString.GetValue + ' Else');

From this, the compiler generates:

ustrcatbug.dpr.28: writeln(GetWrappedString.GetValue + ' Else');
0040712A 8D45E8           lea eax,[ebp-$18]
0040712D E816E9FFFF       call GetWrappedString
00407132 8D55E8           lea edx,[ebp-$18]
00407135 B8C0BB4000       mov eax,$0040bbc0
0040713A 8B0DE8594000     mov ecx,[$004059e8]
00407140 E883D8FFFF       call @CopyRecord
00407145 B8C0BB4000       mov eax,$0040bbc0
0040714A 8D55EC           lea edx,[ebp-$14]
0040714D E8D6E8FFFF       call TWrappedString.GetValue
00407152 8D45EC           lea eax,[ebp-$14]
00407155 BAB4714000       mov edx,$004071b4
0040715A E899D6FFFF       call @UStrCat
0040715F 8B55EC           mov edx,[ebp-$14]
00407162 A12C884000       mov eax,[$0040882c]
00407167 E86CC6FFFF       call @Write0UString
0040716C E867C7FFFF       call @WriteLn
00407171 E866BCFFFF       call @_IOTest
ustrcatbug.dpr.29: end.
00407176 33C0             xor eax,eax

Where we now see that edx is reloaded from the stack, as it should be:

0040715A E899D6FFFF       call @UStrCat
0040715F 8B55EC           mov edx,[ebp-$14]

Workarounds

There are a number of possible workarounds:

Upgrade to Delphi XE7 – this problem appears to be resolved, though I could not find a QC or RSP report relating to the bug when I searched. Moving to XE7 is not an option for us in the short term: too many code changes, too many unknowns.
Don’t use properties in records. This means changing code to call functions instead: no big deal for the Get, but annoying for the Set half of the function pair.
Patch around the problem by modifying UStrCat to return the address of the Dest parameter.

In the end, I wrote option (3) but we went with option (2) in our code base.

Because UStrCat is implemented in System.pas, it’s difficult to build your own version of the unit. One way to skin the option 3 UStrCat is to copy the implementation of UStrCat and its dependencies (a bunch of memory and string manipulation functions), and monkeypatch at runtime.

In the copied functions, we need to preserve eax through the function calls. This results in the addition of 4 lines of assembly to the UStrCat and UStrAsg functions, pushing the eax register onto the stack and popping it before exit. I haven’t reproduced the code here because the original is copyrighted to Embarcadero, but here are the changes required:

In UStrCat, Add push eax just after the conditional jump to UStrAsg, and pop eax just before the ret instruction.
In UStrAsg, wrap the call to FreeMem with a push eax and pop eax.

The patch function is also pretty straightforward:

procedure MonkeyPatch(OldProc, NewProc: PBYTE);
var
  pBase, p: PBYTE;
  oldProtect: Cardinal;
begin
  p := OldProc;
  pBase := p;

  // Allow writes to this small bit of the code section
  VirtualProtect(pBase, 5, PAGE_EXECUTE_WRITECOPY, oldProtect);

  // First write the long jmp instruction.
  p := pBase;
  p^ := $E9;  // long jmp opcode
  Inc(p);
  PDWord(p)^ := DWORD(NewProc) - DWORD(p) - 4;  // address to jump to, relative to EIP

  // Finally, protect that memory again now that we are finished with it
  VirtualProtect(pBase, 5, oldProtect, oldProtect);
end;

function GetUStrCatAddr: Pointer; assembler;
asm
  lea  eax,System.@UStrCat
end;

initialization
  MonkeyPatch(GetUStrCatAddr, @_UStrCatMonkey);
end.

This solution does give me the heebie jeebies, because we are patching the symptoms of the problem as we’ve seen them arise, without being able to either understand or address the root cause within the compiler. It’s not really possible to guarantee that there won’t be some other code path that causes this solution to come unstuck without really digging deep into the compiler’s code generation.

As noted, this problem appears to be resolved in Delphi XE5 or possibly earlier; however it is unclear if the root cause of the problem has been addressed, or we just got lucky. The issue has been reported as RSP-10255.

How to render combining marks consistently across platforms: a long story

January 7, 2015Computing, Development, UnicodeMarc Durdin

I read today Richard Ishida’s notes on his experiences displaying combining characters in a consistent way across browsers, and recalled with fondness the pain we have experienced with this, both in browsers and in apps, over many years.

When we design a keyboard layout for Keyman, we will often want to show a diacritic mark or combining character from a complex script by itself on a key cap, or show the mark with a substituted placeholder symbol that is distinct from the script itself, to show how the combining character fits with the base letter.

A commonly used placeholder symbol is ◌ U+25CC DOTTED CIRCLE, but there are others that are preferred for some languages.

I show below three attempts at presenting the combining marks on our Open Source KeymanWeb platform keyboards, running in Chrome on Windows 8.1. The first shows a Lao keyboard, with each of the combining characters presented around a base ກ U+0E81 LAO LETTER KO on the keyboard. As you can see, this is not wonderful because the Ko Kai letter adds noise and makes it hard to pick out the combining characters.

The second keyboard is Thai. This relies on the system renderer and does not include a base character for the combining characters. In this case, that means that no base letter is shown and the marks show reasonably clearly. Unfortunately, this is not reliable across platforms, browsers or languages — it happens to work okay for Thai in Chrome on Windows 8.1. On some other systems, the combining marks are shifted to the far left of the key cap, or even off the key entirely.

The third keyboard example is for Gujarati, and again relies on the system renderer and does not include a base character for the combining characters. Now this time you see that the Uniscribe renderer on Windows 8.1 automatically inserts a dotted circle glyph — this may or may not be the same as the U+25CC character. Again, this behaviour differs across platforms.

The Story Today

The story today is pretty unfortunate. Display of isolated combining marks is not implemented at all consistently across the plethora of rendering engines, operating systems, browsers and fonts. Some systems will automatically insert a placeholder glyph if no conforming base character is found prior to the combining character. Others don’t. What do I mean by a conforming base character? Well, that’s up to the implementation of the renderer. Some renderers will treat any base character that is not in the same script as the combining character as part of a separate run, and won’t attempt to combine. Others will combine with some outside-script characters but not others.

Things get even worse when working with multiple combining characters without a base character. Unless this has been specifically handled in the renderer, you will typically see poor layout or rendering of the characters that may not be representative of how they form above normal base characters.

The image below shows examples of the marks U+0EB5 LAO VOWEL SIGN II, U+0EB5 LAO VOWEL SIGN II + U+0EC8 LAO TONE MAI EK and U+0EB3 LAO VOWEL SIGN AM, rendered with and without U+25CC DOTTED CIRCLE, on various systems. A more comprehensive set of examples is included later in this blog.

Sample Lao combining character renders — Sample Lao combining characters rendered

You can see, in the image above, examples of:

Repeated dotted circles
Diacritics that don’t align over their base character correctly
Rendering errors leading to tofu
Diacritics that don’t stack
Marks that overlap incorrectly

None of the basic text-only solutions work consistently across all systems.

It’s like stuffing a live octopus into a box

Attempting to address the problem is not fun. Fixing the problem on one browser/renderer/OS/font system invariably creates a new problem on another system. It’s like stuffing a live octopus into a box: each time you manage to get one of those pesky tentacles into the box, two others have snuck out the other side.

A Solution

We worked hard to find a consistent cross-browser solution. We started with Lao, but we believe that the same principles can be applied to other scripts. The solution relies on web fonts, but falls back to an acceptable solution in the shrinking subset of systems where web fonts are not supported.

We started by creating a font that is designed for use specifically with the On Screen Keyboard — it would never be used for presentation or editing. While at first glance this meant we could easily solve the problem by placing all the required marks into the Private Use Area, doing so would mean that on systems where the font was not available, the keyboard would be unusable.

Our final solutionhack uses the OpenType GSUB feature to replace the ກ U+0E81 LAO LETTER KO with a placeholder mark when, and only when, it is used as a base character for a combining mark. The keyboard presentation code stores the Ko Kai letter as a base character for each combining character (or characters), and at render time the font substitutes its placeholder mark. When a key with this pair of characters is clicked, only the combining mark is displayed.

Doing this means we can use the Ko Kai letter by itself on the keyboard (on the D key) without a problem (for example on its own key), and in the case where the special font is not available, the user will see an acceptable fallback with the Ko Kai letter as the base character for combining characters.

It’s worth noting that many renderers require that the placeholder character be the same font and style as the combining characters. Today, this sadly precludes using a different colour for the placeholder character.

Examples

I have collected screenshots from a variety of browsers that illustrate some of the inconsistencies and frustrations, as well as the happily correct rendering. You can also visit the sample page to see how well the rendering works in your system.

The screenshots include three combinations:

A simple single U+0EB5 LAO VOWEL SIGN II
A pair of diacritic marks U+0EB5 LAO VOWEL SIGN II + U+0EC8 LAO TONE MAI EK
A third combining character U+0EB3 LAO VOWEL SIGN AM

The sample shows these three combinations with the default font for the system, a Lao OpenType font “Saysettha Web”, and the special font “SaysetthaX Web” which has our hack in it, along with three different base characters: a hard-coded U+25CC DOTTED CIRCLE, a U+00A0 NO-BREAK SPACE, and a U+0E81 LAO LETTER KO.

Initially, the dotted circle with the Saysettha Web font looks hopeful. But in the end you will see that only one solution works on all browsers tested: the bottom right cell with the custom font and the Lao Letter Ko. The “hollow x” glyph that we use in this font is similar to the placeholder used in Lao language primers. However, any shape can of course be used here if you are creating your own font.

Windows 8.1 – Internet Explorer

Internet Explorer 11 on Windows 8.1 does surprisingly badly here — worse than earlier versions of Windows. Notice how the default Lao font (DokChampa) does not combine with dotted circle, nor does it combine the pair of diacritics correctly with no-break space.

Windows 8.1 – Chrome

While Chrome stacks the pair of diacritics correctly in all cases, the positioning for Dok Champa relative to the base character is incorrect for both dotted circle and no-break space. You can also see how the Saysettha Web and SaysetthaX Web fonts position their diacritics to the left of the cell in the case of the no-break space.

Windows 8.1 – Firefox

Firefox renderers similarly to Chrome. Note that it appears to be using Lao UI instead of DokChampa for the default font.

Windows 7 – Internet Explorer

Windows 7 – Chrome

(Font smoothing was not enabled on the remote connection when I captured the screenshot)

Windows 7 – Firefox

(Font smoothing was not enabled on the remote connection when I captured the screenshot)

Windows XP – Firefox

(Font smoothing was not enabled on the remote connection when I captured the screenshot)

Mac OS X – Safari

Here we see the Saysettha Web solution with dotted circle, which worked consistently in Windows, fails in Mac OS X. Cross-platform support comes back to haunt us. Get back into that box, O tentacle!

Mac OS X – Chrome

Mac OS X – Firefox

It appears that Firefox is having trouble loading the Saysettha Web font, although the SaysetthaX Web font is loading, so this problem is probably resolvable.

iOS 8.1 – Safari

This is at least consistent with the situation in Safari on Mac OS X 10.8!

Windows Phone 8.1 – Internet Explorer

This is similar, but not identical to Internet Explorer 9 on Windows 7. That would be too easy!

Android 4.3 – Android Browser

Clearly, no Lao font was available on the system, so the web fonts were the only ones that worked.

Android 4.3 – Chrome

Clearly, no Lao font was available on the system, so the web fonts were the only ones that worked.

Android 4.3 – Firefox

Clearly, no Lao font was available on the system. It also appears that Firefox is having trouble loading the Saysettha Web font, although the SaysetthaX Web font is loading, so this problem is probably resolvable.

Conclusion

In summary, the only solution today for on screen keyboards (or character pickers) that works consistently across all browsers is to create a custom font with what can only be classed as a hack, and supply that font for use within those pickers / on screen keyboards.

I certainly agree with Richard that a more formalised solution based on U+00A0 no-break space or U+25CC dotted circle would be fantastic.

My Disordered Rebuttal to “AngularJs Performance in Large Applications”

December 15, 2014Computing, Development, Griping, JavaScript, PontificatingMarc Durdin

After reading Abraham Polishchuk’s article AngularJS Performance in Large Applications, I wanted to dig a bit deeper on his suggestions before I took any of them on board. My basic concern was many of his suggestions seemed focused around technology-specific performance micro-optimisations. In today’s browser world, optimising around specific technologies (i.e. browsers and browser versions) is an expensive task with a limited life and usefulness: both improvements in the Javascript engines and differences in browser implementations mean that these fine-grained optimisations should really be limited to very specific performance critical situations.

Conversely, I don’t think we can expect Angular’s performance to improve in the same way as browser engines have. While Angular has undergone significant iterative performance improvements in the 1.x series, it is clear from the massive redesign of version 2.0 that most future application performance gains will only be won by rewriting significant chunks of your own code. Be aware!

Understanding computational complexity is crucial for web developers, even though with Angular it is all Javascript and browser and cloud and magic. You still need that software engineering education.

Onto the article.

1-2 Introduction, Tools of the Trade

No arguments here. Helpful.

3 Software Performance

Ok.

4.1 Loops

This advice, to move function calls out of loops, really depends on what the function you are calling is. Now, clearly, it’s sensible to move calls out of a loop if you can – that’s a normal O(n) to O(1) optimisation. And Objects.keys, as shown in the original article, is very heavy function, so that’d be a good one to avoid.

But don’t take this advice too far. The actual cost of the function call itself is minimal (http://jsperf.com/for-loop-perf-demo-basic/2). In general, inlining code to avoid a function call is not advisable: it results in less maintainable code for a minimal performance gain.

4.2 DOM access

While modifications to the DOM should be approached carefully, the reality is a web application is all about modifying the DOM, whether directly, or indirectly. It cannot be avoided! Applying inline styles (as opposed to using CSS selectors predefined in stylesheets?) vary significantly depending on both the styles being set and the browser. It’s certainly not a hard-and-fast rule, and not really a useful one. In the example, I compare background color (which should not force a reflow), and font-size (which would): (http://jsperf.com/inline-style-vs-class). The results vary dramatically across browsers.

4.3 Variable Scope and Garbage Collection

No arguments here. In fact I would go further and say that this is the single biggest performance concern, both in terms of CPU and resource usage, in Angular. In my limited experience, anyway. It’s very easy to leak closures without realising.

4.4 Arrays and Objects

Yes. Well, somewhat. As Gregory Jacobs commented, there is no significant difference between single-type and multi-type arrays. The differences come in what you are doing with the members of that array, and again browsers vary. http://jsperf.com/object-tostring-vs-number-tostring

I would again urge you to avoid implementation-specific optimisations. Designing an application to minimize the number of properties in an object just because today’s implementation of V8 has a special internal representation of small objects is not good advice. Design your objects according to your application’s requirements! This is a good example of a premature optimization antipattern.

Finally, as Wills Bithrey commented, Array.delete isn’t really slow.

5.1 Scopes and the Digest Cycle

This is a clear description of Angular scopes and the digest cycle. Good stuff. Abraham does a good job of unpacking the ‘magic’ of Angular.

6.1. Large Objects and Server Calls

Ok, so in my app I return full rows of data, and there are a few columns I don’t *currently* use in that data. I measured the performance impact on my application, and the cost of keeping those columns was insignificant: I was actually unable measure any difference. Again, this is a premature optimization antipattern. Your wins here will ultimately not be from excluding columns from your data but from making sure your data is normalized appropriately. This is about good design, rather than performance-constrained design.

Of course, there are plenty of other reasons to minimize your data across the wire. But remember that a custom serializer for each database object is another function to test, and another failure point. Measure the cost and the benefit.

6.2 Watching Functions

Yes. Never is a little too strong for me, but yeah, in general, avoid binding to functions.

6.3 Watching Objects

Yes, in general, deep watching of objects does introduce performance concerns. But I do not agree that this is a terrible idea. The alternative he gives (“relying on services and object references to propagate object changes between scopes”) gets pretty complicated and introduces maintenance and reliability concerns. I would approach this point as potential low hanging fruit for optimisation.

7.1 Long Lists

Yes.

7.2 Filters

As DJ commented, filters don’t work by hiding elements by CSS. Filters, as used in ng-repeat, return a transform of the array, which may be what Abraham meant?

7.3 Updating an ng-repeat

Pretty much. I would add that for custom syncing to work, you still need a unique key, just as you do for track by. If you are using Angular 1.3, there’s no benefit to the custom sync approach. Abraham has detailed the maintenance cost of custom syncing.

8. Rendering Problems

ng-if and ng-show are both useful tools. Choose your weapons wisely.

9.1 Bindings

Yes.

9.2 $digest() and $apply()

Yes, if you must. This is a great way to introduce inconsistencies in your application which can be very hard to debug!

9.3 $watch()

I really disagree with this advice. scope.$watch is a very practical way of separating concerns. I don’t think it’s indicative of bad architecture, quite the opposite in fact! While trying to find the discussions that Abraham refers to (but sadly doesn’t link to), it seemed that most of the alternatives given were in fact in themselves bad architecture. For example, Ben Lesh suggested (although he has since relaxed his opinion) using an ng-change on an input field instead of a watcher on the target variable. Why is that bad? From an architectural point of view, that means each place in your code that you change that variable, you have to remember to make that function call there as well. This is antithetical to the MVC pattern and seriously WET.

9.4 $on, $broadcast, and $emit

Apart from the advice to unbind $on(‘$destroy’) which I don’t think is correct, yes, being aware of the performance impact of events is important in an event-driven system. But that doesn’t mean don’t use them: events are a weapon to use wisely. Seeing a pattern here?

9.5 $destroy

Yes. Apart from the advice to “explicitly call your $on(‘$destroy’)” — I’m not sure what this means.

10.1 Isolate Scope and Transclusion

Only one quibble here: isolate scopes or transclusions can be faster than occupying the parent scope, because a tidy directive can use $apply to just update its own isolated scope, and avoid the digest of the entire scope tree. Choose your tools wisely.

10.2 The compile cycle

Basically, yes.

11 DOM Event Problems

Yes, addEventListener is more efficient, but so is not using Angular in the first place! This results in a significant code increase and again you the benefit of Angular in doing so. Going back to raw DOM is always a trade-off. In the majority of cases, the raw DOM event model is a whole lot of additional work for little gain.

12 Summary

In summary, I really think this post is about micro-optimisations that in many cases will not bring you the performance benefits you are looking for. Abraham’s summary suggests avoiding many of the most useful tools in Angular. If you have to roll your own framework code to avoid ng-repeat, or ng-click, you have already lost.

Have I got the wrong end of the stick?

The Digger — by Hannah

October 25, 2014Family, Family|Bible|More, StoriesMarc Durdin

Once upon a time, the Durdin family were going for a little walk. Everything was lovely. The weather was beautiful and everyone was happy. It was a great time to go for a walk.

We saw a digger and went to investigate.

As we got close, all of a sudden, the sky went dark and the digger woke up!

It started to chase us! Peter and I ran away from the digger.

But the digger started chasing us all over town. We ran through a tunnel. Half way through, Peter discovered a puddle. This was a problem.

He could not resist splashing in the puddle with his Bright Red Boots!

The digger was gaining on us. Quickly we grabbed Peter and rushed out of the tunnel.

We hid in a big willow tree and hoped the digger would not spot us.

But the digger did. So we ran. We soon reached another tunnel.

This one was a little smaller and we hoped the digger would get stuck. But it didn’t. So Peter scrambled for his life up the steps to join his sister.

I scrambled after him, and Dad followed, the digger snapping at his heels.

The digger forced us to the entrance to a maze. Perhaps we’d lose the digger here! Peter went first.

Even though we were in trouble, I still laughed at his cuteness as we raced through the maze.

Eventually we got through the maze, with Peter’s amazing sense of direction helping us. But unfortunately I think Peter had passed his sense of direction onto the digger, because it was still just behind us.

I told Peter to climb up on a stump. Surely the digger wouldn’t be able to climb up here!

We all climbed up after Peter.

But Peter wanted to get down.

So we all had to hop down after him. We ran and ran, wondering where to go next. My sister and I scrambled up a tree, and dad and Peter went down to the river.

Splitting up was the answer! The digger didn’t know which way to go. So it exploded.

After that, we decided to have a look around. We saw some amazing views.

THE END