[thelist] strip html etc

Jorah Lavin madstone at madstone.net
Sat Aug 23 08:49:43 CDT 2003


At 22:18 08/21/03, george donnelly wrote:

>I'm implementing a zope solution for a client in which content and
>presentation will be separate. The client has an old website where each page
>has content and presentation.
>
>My current task is to separate the content from the presentation in about
>5000+ html pages so that it can be dumped into the new site.

Doesn't Tidy have an option for upgrading all the nasties (HTML 3.2, 4.01,
MS Word output, and so on) to XHTML? It wouldn't surprise me if you could
run it on a big batch of files somehow. It was command-line only until
recently, if I recall correctly.

http://www.w3.org/People/Raggett/tidy/
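
For what it's worth, here is a rough sketch of how the batch run might look in Python. It assumes a `tidy` executable is on the PATH and that your build accepts the `-asxhtml`, `-quiet`, and `-o` options. The function names are made up for illustration. This covers only the clean-up to XHTML; pulling the content out of the presentation would still be a separate step.

```python
import subprocess
from pathlib import Path


def build_tidy_command(src, dst):
    """Return the argv list for converting one HTML file to XHTML with Tidy."""
    # -asxhtml: emit XHTML; -quiet: suppress the banner; -o: write output to dst
    return ["tidy", "-asxhtml", "-quiet", "-o", str(dst), str(src)]


def convert_tree(src_root, dst_root, run=subprocess.run):
    """Run Tidy on every .html/.htm file under src_root.

    Output files mirror the source layout under dst_root.
    Returns a list of (source path, exit code) pairs.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    results = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in {".html", ".htm"}:
            continue
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # Tidy exits 0 when clean, 1 on warnings, 2 on errors
        proc = run(build_tidy_command(src, dst))
        results.append((src, proc.returncode))
    return results
```

Calling `convert_tree("old_site", "tidied")` (both directory names are placeholders) would leave the originals untouched and return the exit code for each page, so anything that came back with 2 can be checked by hand.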


-Jorah 


