Demystifying printing with the Microsoft WebBrowser control and ShowHTMLDialogEx

I’m writing up these notes in order to document what has been a long and painful process, involving much spelunking through MSHTML.DLL and IEFRAME.DLL to try and understand what Internet Explorer (or more accurately, the WebBrowser control) is doing and how to correctly use the semi-documented interfaces to provide full control over a print job.

The original requirement for this mini-project was to provide tray, collation, and duplex control for an HTML print job using IHTMLDocument2.execCommand(IDM_PRINT), with a custom print template.  These functions had been supported through a 3rd party ActiveX component, but this component proved to be incompatible with Internet Explorer 9 (causing a blue screen, would you believe!), and the company providing the component was defunct, so it fell to me to re-engineer the solution.

After considerable research, I found some sparse documentation on MSDN suggesting that one could pass a HTMLDLG_PRINT_TEMPLATE flag to ShowHTMLDialogEx and thereby duplicate and extend the functionality of the print template.  In particular, the __IE_PrinterCmd_DevMode property that could somehow be passed into this function would give us the ability to control anything we liked in terms of the printer settings.

Too easy.  Much too easy.  The first stumbling block was trying to discover the type of the pvarArgIn parameter to ShowHTMLDialogEx. A variant array seemed sensible but did not work.  It turns out that this needs to be an IHTMLEventObj, which can be created with IHTMLDocument4.CreateEventObject.  You can then use IHTMLEventObj2.setAttribute to set the various attributes for the object.

Then there were questions about what IMoniker magic was needed for the pMk parameter.  And more questions about the most appropriate set of flags.  Diving into the debugger to examine what Microsoft did answered both of these questions — it was a simple CreateURLMonikerEx call, no need to bind the moniker or other magic, and the flags that Microsoft used were HTMLDLG_ALLOW_UNKNOWN_THREAD or HTMLDLG_NOUI or HTMLDLG_MODELESS or HTMLDLG_PRINT_TEMPLATE for a print job, or HTMLDLG_ALLOW_UNKNOWN_THREAD or HTMLDLG_MODAL or HTMLDLG_MODELESS or HTMLDLG_PRINT_TEMPLATE for a print preview job.  Yes, that is both HTMLDLG_MODAL and HTMLDLG_MODELESS!
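The two flag combinations above can be written out as a small sketch. It is in Python purely to show the bit arithmetic. The numeric values are the ones I recall from the Windows SDK header mshtmhst.h, so treat them as assumptions and check them against your own copy of the header. The helper name dialog_flags is hypothetical.

```python
# Flag values as recalled from mshtmhst.h (assumption: verify against your SDK).
HTMLDLG_NOUI = 0x0010
HTMLDLG_MODAL = 0x0020
HTMLDLG_MODELESS = 0x0040
HTMLDLG_PRINT_TEMPLATE = 0x0080
HTMLDLG_ALLOW_UNKNOWN_THREAD = 0x0200


def dialog_flags(preview: bool) -> int:
    """Return the flag combination observed for a print or a print preview job."""
    # Both kinds of job share these three flags.
    flags = HTMLDLG_ALLOW_UNKNOWN_THREAD | HTMLDLG_MODELESS | HTMLDLG_PRINT_TEMPLATE
    # A print job adds NOUI; a preview adds MODAL (alongside MODELESS).
    flags |= HTMLDLG_MODAL if preview else HTMLDLG_NOUI
    return flags


print(hex(dialog_flags(preview=False)))
print(hex(dialog_flags(preview=True)))
```

The only difference between the two jobs is a single flag: HTMLDLG_NOUI for printing versus HTMLDLG_MODAL for preview.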

Next, what variant type should the __IE_BrowseDocument attribute be?  VT_DISPATCH or VT_UNKNOWN?  The answer is VT_UNKNOWN: things just won’t work if you pass a VT_DISPATCH.  I also came unstuck on the __IE_PrinterCmd_DevMode and __IE_PrinterCmd_DevNames attributes.  Each of these needs to be a VT_I4 containing an unlocked HGLOBAL.  For __IE_PrinterCmd_DevMode the HGLOBAL references a DEVMODEW structure (and, presumably, a DEVNAMES structure for __IE_PrinterCmd_DevNames).  I’ll leave the setup of the DEVMODEW structure to you: there are a lot of examples of that online.

However, even after overcoming these hurdles (with copious debugging to understand what MSHTML.DLL and IEFRAME.DLL were doing), there were other issues.  First, the print template was unable to access the dialogArguments.__IE_BrowseDocument property, with an Access Denied error thrown.  Also, HTC behaviors would fail to load as the WebBrowser component believed that they were being referenced in an insecure, cross-domain manner.  And finally, JavaScript in the page being printed was failing to run — and this JavaScript was required to render some of the details of the page.

I knew that Microsoft actually pass a reference to a temporary file for printing in the __IE_ContentDocumentUrl attribute.  So I saved the document to a temporary file, which also required adding a BASE element to the header so that relative URLs in the document would resolve.  But the problems had not gone away.

All three of these problems in reality stemmed from the same root cause.  The security IDs for the various elements — the print template, the document being printed, and the HTC components — were not matching.  So I embarked on an attempt to find out why.  At first I wondered if we needed to bind the moniker to a bind context or storage.  That was a no-go.  Then I looked at the IInternetSecurityManager interface, which a developer can implement to provide custom security IDs, zones and more.  Sounds logical, right?  Only problem is that the ShowHTMLDialogEx function provides its own IInternetSecurityManager implementation, which you cannot override (and its GetSecurityID just returns INET_E_DEFAULTACTION for the relevant URLs).  Yikes.

I was starting to run out of options.  As far as I could tell, we were duplicating Microsoft’s functionality essentially identically, and I could not see any calls which changed the security for the document so that it would match security contexts.

Finally I noticed an undocumented attribute had been added to the HTML element in the temporary copy of the page: __IE_DisplayURL.  And as soon as I added that to my file, referencing the original URL of the document, everything worked!
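To make the two changes to the temporary copy concrete, here is a minimal sketch in Python of the string manipulation only. It is not the Delphi code from this post. The function name prepare_print_copy is hypothetical, pointing the BASE element at the original URL is an assumption, and the regular expressions are deliberately naive (they assume a single, ordinary html and head tag).

```python
import html
import re


def prepare_print_copy(source_html: str, original_url: str) -> str:
    """Return a copy of the page marked up for printing from a temporary file."""
    url = html.escape(original_url, quote=True)

    # Add the __IE_DisplayURL attribute to the opening <html> tag.
    marked, count = re.subn(
        r"<html\b",
        lambda m: f'{m.group(0)} __IE_DisplayURL="{url}"',
        source_html,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("no <html> element found")

    # Insert a <base> element straight after the opening <head> tag so that
    # relative URLs still resolve against the original location.
    base = f'<base href="{url}">'
    marked, count = re.subn(
        r"<head\b[^>]*>",
        lambda m: m.group(0) + base,
        marked,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("no <head> element found")
    return marked


page = "<html><head><title>Report</title></head><body>Hello</body></html>"
print(prepare_print_copy(page, "http://localhost/report"))
```

A real page may already contain a BASE element or have no explicit head at all, so a production version would want a proper HTML parser or the DOM itself rather than regular expressions.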

Now, this is all fun (and sounds straightforward in hindsight), but without some code it’s probably not terribly helpful.  So here’s some code (in Delphi, translate to your favourite language as required).  It all looks pretty straightforward now(!), but nearly every line involved blood, sweat and tears!  This is really not a complete example and hence does not compile but just covers the bits necessary to complement the better documented aspects of custom printing with MSHTML.  Please note that this example uses the TEmbeddedWB component for Delphi, and that temporary file cleanup has been excluded.

Update 14 July: This code is not our production code: I’ve stripped out bits and pieces and tried to keep the bits that are somewhat relevant. Don’t worry too much about the ConfigurePrinter details — the takeaway is the HGLOBAL. I must also apologise for the atrocity that is the SaveToFile function. That’s what you get when working with legacy versions of software. Internet Explorer also won’t reliably work with non-ASCII content there unless you toss a BOM into the start of the stream.
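The byte order mark point is easy to get wrong, so here is a small illustrative sketch in Python rather than the Delphi SaveToFile. It writes the page as UTF-8 with a leading BOM, which is the encoding the follow-up comment below settles on. The function name save_print_copy and the .htm suffix are assumptions, and cleanup of the file is left to the caller.

```python
import codecs
import os
import tempfile


def save_print_copy(html_text: str) -> str:
    """Write html_text to a new temporary file as UTF-8 with a BOM; return its path."""
    fd, path = tempfile.mkstemp(suffix=".htm")
    with os.fdopen(fd, "wb") as f:
        # The BOM goes first so the encoding is detected before any markup is read.
        f.write(codecs.BOM_UTF8)
        f.write(html_text.encode("utf-8"))
    return path


path = save_print_copy("<html><body>caf\u00e9</body></html>")
with open(path, "rb") as f:
    print(f.read(3) == codecs.BOM_UTF8)
os.remove(path)
```

Writing in binary mode keeps the bytes exactly as encoded, with no newline translation on Windows.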

16 thoughts on “Demystifying printing with the Microsoft WebBrowser control and ShowHTMLDialogEx”

  1. This exploration you’ve made is awesome! Your insight and sample code are priceless. I’ve tripped over print templates too and I never got to set the default printer settings. I’ve found printer setup dialog hacks that depend on the underlying Windows version, but this post is surely the best I’ve found on the topic!

  2. Thanks Paulo :-)

    Since I published this post I have tweaked a couple of things:

    * It’s safer to use UTF-8 and put a UTF-8 byte order mark on the temporary file in the SaveToFile function. This resolved some issues we experienced with Unicode characters.

    * It does not appear to be necessary to use a short file name for the __IE_ContentDocumentUrl attribute.

    You may also find the following post helpful: Problems with Internet Explorer 8, print templates and standards compliance

    1. Thank you very much for this description. Do you know if it is possible to make it work without temporary html-file. I create the html document dynamically. So i have an IHTMLDocument2. If I pass this document as argument __IE_BrowseDocument without specifying a ContentDocumentUrl, Print preview has empty pages. I would expect that BrowseDocument specifies the document and there is no need for the URL.
      As print template i tested with the templates of the printtemplate.exe examples of msdn.
      I also tried with exec-command but it always creates a temporary copy of my html file.

      If you have a suggestion please let me know.
      Thank you very much.

      Daniela

  3. Daniela, I’ve not found any way to do this without a temporary html file: IE always creates a temp file when it is printing, so I followed its print method. Is there a reason why you can’t save the file to disk temporarily?

  4. Hi Marc,
    thanks for your reply. I would like to avoid saving to disk because actually there would be no need for that. I create the html source in my code and I don’t want to save this data in plain text to disk.
    I don’t understand why this is necessary because with the argument “__IE_BrowseDocument” you have the document. If I test with template7 of the printtemplate.exe-Example (you know what I mean?), I cannot find a reason why this temporary file is necessary. But if I don’t set the argument __IE_ContentDocumentUrl, there is a blank page.
    Thank you very much.

  5. The save-to-disk behaviour emulates what Internet Explorer does. That gave us what we needed, and I don’t know the justification or reasoning behind the use of both the __IE_ContentDocumentUrl and __IE_BrowseDocument arguments. Not sure I can help more than that.

  6. Hi Marc,

    Your DevMode setting analysis is very impressive. I tried to resolve this for years. The best I could do to change default printer before printing then change it back after printing. This is much simpler but not nearly as elegant as your solution.

    Now I spotted another problem. When printing with a custom template, IE creates a temporary file in the TEMP folder. The file is not deleted after printing has completed. This is easily demonstrated even with the printtemplates.exe MS Sample – Template 2. Yet when you print from the IE application with the default template, file cleanup is perfect.

    To make it worse, my pages include ActiveX controls. Printing creates a temporary .emf file for every ActiveX instance, which is also not cleaned up after printing. When printing a huge document it may leave thousands of .emf files in the TEMP directory. Again, standard IE with the default template cleans up the files perfectly.

    Can you please use your expertise to look at what is lost from default template functionality. Why custom template does not clean temporary files?

  7. Thank you Arkady — it was certainly quite a long process to understand how it all worked. My experience with temp files has been mixed, but most important is making sure all COM references to IE are properly released before exiting the app — so destroy the control and release any interfaces, as this does do the cleanup in my experience.

  8. Hi Marc,

    I am trying to implement your solution in C#. However I can not figure out what GetPrintTemplateURL is doing. Help?

    Thanks,

    Brad

  9. Brad, GetPrintTemplateURL is just returning the URL to our own print template — in our case, something like ‘http://localhost:port/printtemplate.html’

  10. Hi Marc,
    Do you have experiences with users which can’t print because print doesn’t start or it crashes completely. Other users with same IE-Version and same Windows version can print without any problems.
    You know any reason for this issue?

    Thanks,
    Dani
