Microsoft Web Archive format (*.mht) as output option

Aug 12, 2008 at 3:49 PM
I've found it very useful to convert the *.htm file and its accompanying directories into Microsoft Web Archive format (*.mht). The advantage of *.mht is that the resulting file contains all of the image resources, so you don't need to email someone both an HTML file and a directory, or store both artifacts on the network for posterity. This format can be viewed on Windows through Internet Explorer and on OS X by giving it a *.eml extension (I'm unaware of a Linux viewer, however). It would be nice to have *.mht as a third output option on the PAL wizard. Here is how to create a *.mht file in C# (which is likely of limited use since PAL output is performed by VBScript, but I'm sure VBScript could invoke CDO):

1. Add a reference to "Microsoft CDO for Windows 2000" to the project (COM interop).
2. The source code looks something like this:

using System;
using System.IO;
using CDO;

class MhtExport
{
    static void Main()
    {
        // Path to the PAL HTML report (placeholder).
        var htmlFileName = "path to PAL output.htm";
        // Write the archive next to the report, with the extension changed to .mht.
        var outputPath = Path.ChangeExtension(htmlFileName, ".mht");
        var message = new MessageClass();
        // CDO loads the page from a file:// URL and embeds the images and other resources it references.
        message.CreateMHTMLBody(new Uri(Path.GetFullPath(htmlFileName)).AbsoluteUri, CdoMHTMLFlags.cdoSuppressNone, "", "");
        var stream = message.GetStream();
        stream.SaveToFile(outputPath, ADODB.SaveOptionsEnum.adSaveCreateOverWrite);
        stream.Close();
    }
}
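
Since PAL itself would reach CDO through late binding (VBScript's CreateObject), a late-bound C# variant may be a closer model of what the script would do. The following is only a sketch under some assumptions: it needs Windows with CDO registered and a compiler/runtime that supports `dynamic` COM calls (.NET Framework 4 or later, or .NET 5 or later). The class name `MhtExportLateBound` and method name `SaveAsMht` are hypothetical, and the literals 0 and 2 are the numeric values of cdoSuppressNone and adSaveCreateOverWrite.

```csharp
using System;
using System.IO;

static class MhtExportLateBound
{
    // Hypothetical helper: converts an HTML report to .mht without an interop reference.
    public static string SaveAsMht(string htmlPath)
    {
        string fullPath = Path.GetFullPath(htmlPath);
        string outputPath = Path.ChangeExtension(fullPath, ".mht");

        // Same ProgID a script would pass to CreateObject.
        Type messageType = Type.GetTypeFromProgID("CDO.Message", true);
        dynamic message = Activator.CreateInstance(messageType);

        // 0 = cdoSuppressNone: keep every resource the page references.
        message.CreateMHTMLBody(new Uri(fullPath).AbsoluteUri, 0, "", "");

        // 2 = adSaveCreateOverWrite: replace an existing .mht file.
        dynamic stream = message.GetStream();
        stream.SaveToFile(outputPath, 2);
        stream.Close();

        return outputPath;
    }
}
```

Calling `MhtExportLateBound.SaveAsMht(@"C:\reports\PAL_output.htm")` (an example path, not one from PAL) should write the archive next to the report and return its path. The same four calls map almost one-to-one onto VBScript.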
Aug 30, 2008 at 4:05 PM
I completely agree that MHT files are the way to go. When I try to save the PAL HTML file as an MHT file in Internet Explorer, the links in the TOC still point to the original file location.

Thank you for the code. Yeah, I'll need to try to do the same in VBScript. I didn't know there was an object model for it. I'll try playing with this after I finish a much-needed rewrite of PAL for v1.4.

Thank you!