[thelist] Validation crawler?

Tim Beadle tim.beadle at iop.org
Thu Jun 10 08:20:02 CDT 2004


On Thu, 2004-06-10 at 13:49, Tim Fountain wrote:
> Does anyone know of a tool that will crawl a site and identify pages
> that don't validate?

Tim,

I took a script written by Greg Osboinsky at the W3C and modified it.
The basic premise is that, instead of spidering per se, it greps your
httpd logs and collects the URIs to check from them, in order of
popularity; the theory is that the most popular pages are the ones you
want to fix first.

http://lists.w3.org/Archives/Public/www-qa/2001Sep/0031.html - links to:
http://lists.w3.org/Archives/Public/www-qa/2001Sep/att-0031/top-invalid-docs
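
To give a rough idea of the log-ranking step, here is a minimal Python
sketch. It is not the linked script, just an illustration that assumes
your server writes Common Log Format lines; the names REQUEST_RE and
popular_uris are made up for this example.

```python
import re
from collections import Counter

# Assumption: Common Log Format, e.g.
# 1.2.3.4 - - [10/Jun/2004:13:49:00 +0100] "GET /index.html HTTP/1.1" 200 5120
# Captures the requested path and the three-digit status code.
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+" (\d{3})')


def popular_uris(log_lines, limit=10):
    """Return up to `limit` (path, hits) pairs, most requested first.

    Only successful (status 200) GET requests are counted, so missing
    pages and redirects do not end up in the list of pages to validate.
    """
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2) == "200":
            hits[match.group(1)] += 1
    return hits.most_common(limit)


if __name__ == "__main__":
    sample = [
        '1.2.3.4 - - [10/Jun/2004:13:49:00 +0100] "GET /a.html HTTP/1.1" 200 512',
        '1.2.3.4 - - [10/Jun/2004:13:49:05 +0100] "GET /b.html HTTP/1.1" 200 300',
        '5.6.7.8 - - [10/Jun/2004:13:50:00 +0100] "GET /a.html HTTP/1.0" 200 512',
        '5.6.7.8 - - [10/Jun/2004:13:50:09 +0100] "GET /gone.html HTTP/1.0" 404 -',
    ]
    for path, count in popular_uris(sample):
        print(count, path)
```

Each path it returns would then be prefixed with your site's hostname
and handed to whichever validator you use, starting from the top of the
list.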

HTH,

Tim
-- 
Tim Beadle <tim.beadle at iop.org>




