[thelist] MS Word html stripper

Tony Crockford tonyc at boldfish.co.uk
Fri Jul 30 11:53:18 CDT 2004


At 17:31 on Friday, 30 Jul 2004, Robert Crawford wrote:

> Hey all,
>
> I'm looking for a program to take a word document saved as html and  
> strip all of word's extra code out of it.

I've had good success with html tidy.

http://tidy.sourceforge.net/

Much better results can be achieved by opening the Word document in
OpenOffice, saving it as HTML from there, then cleaning that up with
html tidy.
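
For the tidy step, something like the sketch below might be a starting point. It assumes a `tidy` binary is on your PATH and uses tidy's `word-2000`, `bare` and `clean` options; the function names and file names are made up for illustration, so adjust to taste.

```python
import shutil
import subprocess


def build_tidy_command(src, dest):
    """Return the argument list for cleaning Word-exported HTML with tidy.

    Hypothetical helper: the option choice is one reasonable guess, not
    the only way to configure tidy.
    """
    return [
        "tidy",
        "-quiet",               # suppress non-essential output
        "--word-2000", "yes",   # strip Word 2000 specific markup
        "--bare", "yes",        # strip more Microsoft-specific HTML
        "--clean", "yes",       # replace presentational tags with style rules
        "-output", dest,        # write the cleaned file here
        src,                    # the HTML file saved from Word or OpenOffice
    ]


def clean_word_html(src, dest):
    """Run tidy on src, writing dest. Returns tidy's exit status
    (0 = clean, 1 = warnings, 2 = errors)."""
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy was not found on PATH")
    return subprocess.run(build_tidy_command(src, dest)).returncode
```

Calling `clean_word_html("report.html", "report-clean.html")` would then write the cleaned copy next to the original; expect a status of 1 most of the time, since Word output nearly always triggers warnings.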

FWIW html tidy is integrated into topstyle pro 3.0, which makes the  
cleanup easier still.


Dreamweaver users also have options to clean Word junk (not sure of the
menu options ;o)  )


hth

