[thelist] batch conversion of word to html

Shirley Kaiser (SKDesigns) skaiser at skdesigns.com
Thu Dec 7 19:01:56 CST 2000


I've also done this with HomeSite and HTML Tidy (David mentioned HTML 
Tidy and BBEdit). HTML Tidy has a feature specifically designed to clean 
up Word documents, and it does a pretty nifty job of it, too.
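
For anyone who wants to run that over a whole folder, here is a minimal sketch in Python. It assumes a `tidy` executable is on the PATH and that the installed build accepts the `word-2000` option on the command line. The function names and the folder name in the usage note are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_tidy_command(html_file):
    """Return the argument list for cleaning one Word-exported file in place."""
    return [
        "tidy",
        "-m",                  # modify the file in place
        "-q",                  # suppress nonessential output
        "--word-2000", "yes",  # ask Tidy to strip Word 2000 specific markup
        str(html_file),
    ]


def clean_folder(folder):
    """Run Tidy over every .htm/.html file directly inside `folder`.

    Returns a dict mapping file name to Tidy's exit status.
    """
    results = {}
    for html_file in sorted(Path(folder).iterdir()):
        if html_file.suffix.lower() not in (".htm", ".html"):
            continue
        # Tidy exits with 0 (clean), 1 (warnings) or 2 (errors).
        completed = subprocess.run(build_tidy_command(html_file))
        results[html_file.name] = completed.returncode
    return results
```

Calling `clean_folder("word_export")` (a hypothetical folder name) would rewrite each file in place, so it is worth trying on a copy of the files first.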

I haven't worked with regular expressions yet, although I see them in my 
near future. I've done the above using just HTML Tidy and then 
HomeSite's search/replace if needed (using the Project feature, which 
can search/replace across an entire web site, for example).
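
The same site-wide search/replace idea can be roughed out as a script, too. This is only a sketch, not what HomeSite or BBEdit do internally: the two patterns are illustrative guesses at typical Word leftovers (`<o:p>` tags and `Mso...` class attributes), and the function names are hypothetical.

```python
import re
from pathlib import Path

# Illustrative patterns only; real Word exports vary, so adjust to taste.
WORD_PATTERNS = [
    (re.compile(r"</?o:p>", re.IGNORECASE), ""),              # Office paragraph tags
    (re.compile(r'\s+class="?Mso\w+"?', re.IGNORECASE), ""),  # Word style classes
]


def clean_markup(text):
    """Apply every saved pattern to one document's text."""
    for pattern, replacement in WORD_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


def clean_site(root):
    """Search/replace in every .html file under `root`; return the changed paths."""
    changed = []
    for path in sorted(Path(root).rglob("*.html")):
        # latin-1 maps each byte to one character, so everything the
        # patterns do not touch is written back byte for byte.
        original = path.read_bytes().decode("latin-1")
        cleaned = clean_markup(original)
        if cleaned != original:
            path.write_bytes(cleaned.encode("latin-1"))
            changed.append(path)
    return changed
```

For example, `clean_markup('<p class="MsoNormal">Hi<o:p></o:p></p>')` gives `'<p>Hi</p>'`. Keeping the patterns in one list is the script equivalent of saving your expressions for future use.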

There's a Dreamweaver discussion list that could undoubtedly answer this 
question more specifically, too. It's listed in Discussion Lists at 

Your situation is not at all unusual. I'll be curious to know 
what you find out with Dreamweaver, if you remember to let us know.

Shirley E. Kaiser, M.A.
SKDesigns mailto:skaiser at skdesigns.com
Website Development http://www.skdesigns.com/
Pianist, Composer http://www.shirleykaiser.com/

McCreath_David wrote:

> Hi, Zach --
> I've gone down this path before, and although I never tried AppleScript, I
> never found a super-duper automated way to do it.
> Two things that have been suggested on this topic on the list before are
> HTMLTidy and BBEdit. I ended up using BBEdit and its RegEx search and
> replace feature. It's not exactly automated, but you can save your
> expressions for future use and BBEdit will fairly rip through dozens of HTML
> files in a batch.
> It's worth looking at if for no other reason than it's a good intro to
> Regular Expressions if you've never used them before.
> Dave
>> I've dug up an AppleScript that will convert a batch of Word files to HTML,
>> using Word's "save as HTML" feature. What I'm looking for now is a way to
>> script Dreamweaver so a batch of HTML files can be cleaned up using the
>> "clean up Word HTML" feature.
>> Dreamweaver doesn't appear to have much of an AppleScript dictionary...does
>> anyone have any thoughts on this subject?
