Delphi’s documentation on IXMLDocument.SaveToStream has the following important caveat:
Regardless of the encoding system of the original XML document, SaveToStream always saves the stream in UTF-16.
It’s helpful to have notes like this. Mind you, forcing UTF-16 output is definitely horrible; what if we need our document in UTF-8 or (God forbid) some non-Unicode encoding?
Now Kris and I were looking at a Unicode corruption issue with an XML document in a Delphi application and struggling to understand what was going wrong given this statement in the documentation. Our results didn’t add up, so we wrote a little test app to test that statement:
procedure TForm1.OutputEncodingIsUTF8;
const
  UTF8XMLDoc: string =
    '<?xml version="1.0" encoding="utf-8"?>'#13#10+
    '';
var
  XMLDocument: IXMLDocument;
  InStream: TStringStream;
  OutStream: TFileStream;
begin
  // stream format is UTF8, input string is converted to UTF8
  // and saved to the stream
  InStream := TStringStream.Create(UTF8XMLDoc, TEncoding.UTF8);

  // we'll write to this output file
  OutStream := TFileStream.Create('file_should_be_utf16_but_is_utf8.xml', fmCreate);
  try
    XMLDocument := TXMLDocument.Create(nil);
    XMLDocument.LoadFromStream(InStream);
    XMLDocument.SaveToStream(OutStream);
    // IXMLDocument.SaveToStream docs state will always be UTF-16
  finally
    FreeAndNil(InStream);
    FreeAndNil(OutStream);
  end;

  with TStringList.Create do
  try
    // we want to load it as a UTF16 doc given the documentation
    LoadFromFile('file_should_be_utf16_but_is_utf8.xml', TEncoding.Unicode);
    ShowMessage('This should be displayed as an XML document '+
      'but instead is corrupted: '+#13#10+Text);
  finally
    Free;
  end;
end;
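When I run this, I’m expecting a dialog that shows the XML document as readable text. But instead I get a dialog full of corrupted characters, because the file was actually written as UTF-8 and is being decoded as UTF-16.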
Note: this was run on Delphi 2010. I haven’t run this test on Delphi XE2, but the documentation hasn’t changed.
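If you'd rather not trust a decoded dialog at all, you can look at the first few bytes of the output file instead. The following is a minimal sketch, not part of the original test app: GuessFileEncoding is a hypothetical helper name, and the check is only a heuristic based on byte order marks and the leading '<' of an XML document. Unit names are the unscoped Delphi 2010 ones.

```delphi
program SniffXmlEncoding;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: guesses a file's encoding from its first three bytes.
// This is a heuristic only; it does not parse the XML declaration.
function GuessFileEncoding(const FileName: string): string;
var
  Stream: TFileStream;
  Buffer: array[0..2] of Byte;
  Count: Integer;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    // Count may be less than 3 for a very short file
    Count := Stream.Read(Buffer, SizeOf(Buffer));
  finally
    Stream.Free;
  end;

  if (Count >= 2) and (Buffer[0] = $FF) and (Buffer[1] = $FE) then
    Result := 'UTF-16 little-endian (BOM FF FE)'
  else if (Count >= 2) and (Buffer[0] = $FE) and (Buffer[1] = $FF) then
    Result := 'UTF-16 big-endian (BOM FE FF)'
  else if (Count >= 3) and (Buffer[0] = $EF) and (Buffer[1] = $BB) and
    (Buffer[2] = $BF) then
    Result := 'UTF-8 (BOM EF BB BF)'
  else if (Count >= 2) and (Buffer[0] = Ord('<')) and (Buffer[1] = 0) then
    // '<' followed by a zero byte looks like UTF-16 LE without a BOM
    Result := 'UTF-16 little-endian (no BOM)'
  else if (Count >= 1) and (Buffer[0] = Ord('<')) then
    // '<' followed by a non-zero byte: UTF-8 or some other 8-bit encoding
    Result := 'UTF-8 or another 8-bit encoding (no BOM)'
  else
    Result := 'unknown';
end;

begin
  // Assumes the test above has already created this file
  Writeln(GuessFileEncoding('file_should_be_utf16_but_is_utf8.xml'));
end.
```

Run it from the folder where the test app wrote its output. Anything other than one of the UTF-16 answers means the file does not match what the documentation promises.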
The moral of the story: the output encoding is the same as the input encoding, unless you change it with the Encoding property. For example, adding the XMLDocument.Encoding line below fixes the code sample:
XMLDocument := TXMLDocument.Create(nil);
XMLDocument.LoadFromStream(InStream);
XMLDocument.Encoding := 'UTF-16';
XMLDocument.SaveToStream(OutStream);
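Since the encoding you get depends on what was loaded, it may be safer to state the encoding every time you save. Here is a rough sketch of that idea as a console program. SaveXMLAs, the sample document with its made-up root element, and the output file names are all hypothetical; the unit names are the unscoped Delphi 2010 ones, and it assumes the default MSXML DOM vendor is available.

```delphi
program SaveWithExplicitEncoding;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, ActiveX, XMLIntf, XMLDoc, msxmldom;

const
  // Hypothetical sample document; the <root/> element is made up for this sketch
  SampleXML = '<?xml version="1.0" encoding="utf-8"?>'#13#10 + '<root/>';

// Hypothetical helper: set the output encoding explicitly, then save.
procedure SaveXMLAs(const Doc: IXMLDocument;
  const FileName, EncodingName: string);
var
  OutStream: TFileStream;
begin
  OutStream := TFileStream.Create(FileName, fmCreate);
  try
    Doc.Encoding := EncodingName; // don't rely on the input encoding
    Doc.SaveToStream(OutStream);
  finally
    OutStream.Free;
  end;
end;

procedure Run;
var
  Doc: IXMLDocument;
  InStream: TStringStream;
begin
  // Same loading approach as the test above: a UTF-8 encoded input stream
  InStream := TStringStream.Create(SampleXML, TEncoding.UTF8);
  try
    Doc := TXMLDocument.Create(nil);
    Doc.LoadFromStream(InStream);
  finally
    InStream.Free;
  end;

  // Save the same document twice, naming the encoding each time
  SaveXMLAs(Doc, 'sample_utf16.xml', 'UTF-16');
  SaveXMLAs(Doc, 'sample_utf8.xml', 'UTF-8');
end;

begin
  // MSXML is COM-based, so a console app must initialise COM itself
  CoInitialize(nil);
  try
    Run; // Doc goes out of scope inside Run, before CoUninitialize
  finally
    CoUninitialize;
  end;
end.
```

The point of wrapping this in a helper is simply that no call site can forget the Encoding assignment. I haven't verified this beyond Delphi 2010 behaviour described above, so treat it as a starting point.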
The same documentation issue exists for TXMLDocument.SaveToStream. I’ve reported the issue in QualityCentral.