Comparing TStringStream vs TStringList for writing Unicode strings to streams

Two methods are widely used in Delphi code for reading and writing strings to and from streams, and at first glance they behave much the same.  These are TStrings.SaveToStream and TStringStream.SaveToStream (or SaveToFile in either case):

procedure TForm1.SaveToFile;
const
  AString = 'This is some Unicode text'#13+
            'Test Unicode © Δ א';
begin
  with TStringList.Create do
  try
    Text := AString;
    SaveToFile('TStringList UTF8.txt', TEncoding.UTF8);
  finally
    Free;
  end;

  with TStringStream.Create(AString, TEncoding.UTF8) do
  try
    SaveToFile('TStringStream UTF8.txt');
  finally
    Free;
  end;
end;

But the two methods differ in three crucial ways in what they write to the stream:

  1. TStringList prepends the preamble bytes for the encoding (in this case, #$EF#$BB#$BF).
  2. TStringList appends a line break (#$0D#$0A) to the file if your text does not already end in one.
  3. TStringList converts any single line-breaking character in the text (e.g. #$0D or #$0A) into #$0D#$0A.

The following hex dumps may show this more clearly:

EF BB BF 54 68 69 73 20 69 73 20 73 6F 6D 65 20
55 6E 69 63 6F 64 65 20 74 65 78 74 0D 0A 54 65
73 74 20 55 6E 69 63 6F 64 65 20 C2 A9 20 CE 94
20 D7 90 0D 0A 

TStringList UTF8.txt

54 68 69 73 20 69 73 20 73 6F 6D 65 20 55 6E 69
63 6F 64 65 20 74 65 78 74 0D 54 65 73 74 20 55
6E 69 63 6F 64 65 20 C2 A9 20 CE 94 20 D7 90

TStringStream UTF8.txt
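
To check what a given save method actually wrote, a small hex dump routine is handy. The following is a minimal sketch, not the tool used to produce the dumps above: the helper name HexDump is hypothetical, and it assumes a Delphi version that ships System.IOUtils (for TFile.ReadAllBytes) and that both files from the earlier example exist in the current directory.

```pascal
program HexDumpSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils;

// Hypothetical helper: returns the bytes of a file as hex text,
// 16 bytes per line, in the same layout as the dumps in this post.
function HexDump(const FileName: string): string;
var
  Bytes: TBytes;
  I: Integer;
begin
  Result := '';
  Bytes := TFile.ReadAllBytes(FileName);
  for I := 0 to High(Bytes) do
  begin
    Result := Result + IntToHex(Bytes[I], 2);
    // Break the line after every 16th byte and after the last byte.
    if (I mod 16 = 15) or (I = High(Bytes)) then
      Result := Result + sLineBreak
    else
      Result := Result + ' ';
  end;
end;

begin
  // File names assumed from the SaveToFile example earlier in the post.
  Writeln(HexDump('TStringList UTF8.txt'));
  Writeln(HexDump('TStringStream UTF8.txt'));
end.
```

Comparing the first three bytes and the last two bytes of each dump is usually enough to tell whether a preamble or a trailing line break was added.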

Make sure you know how your files will be read and whether these differences are important to the target application.

In short, TStringList is usually not appropriate for streaming strings without modification.  TStringStream is your friend here.  But if you need the preamble, and just the preamble, then you’ll have to do a little more work; you won’t be able to use TStringStream.SaveToFile.
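
One possible shape for that extra work is to write the preamble and the encoded bytes yourself. This is only a sketch under stated assumptions: the procedure name SaveWithPreamble and the output file name are hypothetical, and it relies on TEncoding.GetPreamble and TEncoding.GetBytes as found in Unicode-era Delphi versions.

```pascal
program PreambleSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: writes the encoding's preamble followed by the
// encoded text. No line breaks are converted and none are appended.
procedure SaveWithPreamble(const FileName, Text: string; Encoding: TEncoding);
var
  Stream: TFileStream;
  Preamble, Data: TBytes;
begin
  Preamble := Encoding.GetPreamble;   // EF BB BF for TEncoding.UTF8
  Data := Encoding.GetBytes(Text);    // the text, encoded as-is
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    // Guard against empty arrays before indexing element 0.
    if Length(Preamble) > 0 then
      Stream.WriteBuffer(Preamble[0], Length(Preamble));
    if Length(Data) > 0 then
      Stream.WriteBuffer(Data[0], Length(Data));
  finally
    Stream.Free;
  end;
end;

begin
  // Same text as the example at the top of the post; file name is made up.
  SaveWithPreamble('Preamble UTF8.txt',
    'This is some Unicode text'#13 + 'Test Unicode © Δ א', TEncoding.UTF8);
end.
```

With TEncoding.UTF8, the resulting file should match the TStringStream dump above with EF BB BF in front: the lone #$0D stays as it is and no trailing line break is added.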

2 thoughts on “Comparing TStringStream vs TStringList for writing Unicode strings to streams”

  1. “But if you need the preamble, and just the preamble, then you’ll have to do a little more work; you won’t be able to use TStringStream.SaveToFile.”

    Sure you can, you just need to add the encoding to the start of the stream.

    E.g. #$EF#$BB#$BF

  2. Thanks Anonymous, you are correct, you can of course modify your TStringStream before saving it with SaveToFile.

    But #$EF#$BB#$BF (the constant sUTF8BOMString in System.WideStrUtils.pas) is incorrect. You need to prepend the UTF-16 equivalent, #$FEFF (the somewhat less obvious constant BOM_LSB_FIRST in System.Types.pas), because TStringStream is passed a UnicodeString in its constructor and then encodes it as UTF-8, which turns #$FEFF into the bytes EF BB BF:

    with TStringStream.Create(BOM_LSB_FIRST + MyString, TEncoding.UTF8) do
    try
      SaveToFile('myfile.txt');
    finally
      Free;
    end;
