Mixing RTL and LTR: Plaintext vs HTML

Just a very short post today.  Software still has some way to go in handling mixed RTL and LTR text.  For example, the following image, snipped from Outlook 2010, shows the issue:

The subject and body both say the same thing, but the display order is different.  Do you know why?  It’s because the subject is plain text, which is assumed to be primarily right-to-left, whereas the body is HTML, which is assumed to be primarily left-to-right.

Note how the full stop in the subject appears to the left of the English text.  This is because the display renderer has assumed that the whole run of text is right-to-left, so the full stop is treated as right-to-left too and is displayed after the text in a right-to-left sense, that is, to its left.
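To make the full stop behaviour concrete, here is a deliberately tiny Python sketch. It is not the Unicode Bidirectional Algorithm and not what Outlook runs; it only models the one case described here, where every strong character is left-to-right (for example English) and the only neutrals that matter are trailing ones. The names visual_order_sketch and NEUTRAL_CLASSES are made up for this illustration, and bracket mirroring is ignored.

```python
import unicodedata

# Bidi classes treated as neutral here. Separators and terminators only
# count as neutral when no digits are next to them, which this sketch assumes.
NEUTRAL_CLASSES = {"WS", "ON", "CS", "ES", "ET", "B", "S"}


def visual_order_sketch(text, base_direction):
    """Return the left-to-right visual order of a single line of text.

    Assumption: all strong characters in `text` are left-to-right.
    Trailing neutrals take the base direction of the paragraph, so in a
    right-to-left paragraph they end up on the left of the LTR run.
    """
    if base_direction not in ("ltr", "rtl"):
        raise ValueError("base_direction must be 'ltr' or 'rtl'")
    if base_direction == "ltr":
        return text  # nothing moves in a left-to-right paragraph

    # Find where the run of trailing neutral characters begins.
    end = len(text)
    while end > 0 and unicodedata.bidirectional(text[end - 1]) in NEUTRAL_CLASSES:
        end -= 1
    core, trailing = text[:end], text[end:]

    # The trailing neutrals are laid out right-to-left, so they appear
    # reversed and to the left of the left-to-right core.
    return trailing[::-1] + core


print(visual_order_sketch("Hello world.", "ltr"))  # Hello world.
print(visual_order_sketch("Hello world.", "rtl"))  # .Hello world
```

The second call shows the same result as the subject line: the text itself is unchanged, but the renderer's choice of base direction moves the full stop.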

The question is, of course, how do you determine directionality given an arbitrary plain text string?  It’s not really possible to do so reliably in the absence of other metadata.  The W3C article on directionality is helpful here: http://www.w3.org/TR/html4/struct/dirlang.html
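One common heuristic, and the idea behind the default paragraph-direction rules in the Unicode Bidirectional Algorithm, is to take the direction of the first strongly directional character. The Python sketch below illustrates that heuristic only; it ignores explicit embedding and isolate characters, the function name first_strong_direction is made up for this post, and nothing here is a claim about how Outlook chooses a direction.

```python
import unicodedata


def first_strong_direction(text, default="ltr"):
    """Guess the base direction of plain text from its first strong character.

    Returns "ltr" or "rtl". If the text contains no strong character
    (only digits, spaces, punctuation and so on), `default` is returned.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right (e.g. Latin)
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic)
            return "rtl"
    return default


# Hypothetical sample strings, not the text from the screenshot.
print(first_strong_direction("Hello world."))                  # ltr
print(first_strong_direction("\u05e9\u05dc\u05d5\u05dd world."))  # rtl
print(first_strong_direction("123 ...", default="rtl"))         # rtl
```

The heuristic is cheap, but it shows why plain text is hard: a right-to-left sentence that happens to begin with an English product name is classified as left-to-right, and a string with no strong characters falls back to whatever default the application picked.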

Another view of the message:

Interestingly, Outlook Web Access does not do this, because its UI takes its directionality from the base HTML document:

2 thoughts on “Mixing RTL and LTR: Plaintext vs HTML”

  1. Hi Marc, a few new things in HTML5 and CSS3 will help once they are finalised and browsers support them: the bdi element, the dir="auto" attribute value, and the CSS3 property/value pairs
    unicode-bidi: isolate, unicode-bidi: plaintext and unicode-bidi: isolate-override.

    Assuming the specs don’t undergo major revision between now and then.

    Also, more additions and changes to the UBA are slated for Unicode 7, probably relating to isolation characters.

  2. Thanks Andrew 🙂 Unfortunately not sure if the updates to the Unicode Bidi Algorithm will make any difference as to how the message subject is displayed in Outlook in a case like this! What do you think? As far as I can tell, if Outlook is neutral about text direction in the subject line, then situations like this will go wrong?
