[thelist] Crawling for headers

Peter Johansson peter at johansson.org
Tue Mar 19 07:15:01 CST 2002


Hi,

I need a tool that can crawl websites and look for specific meta-data in
the pages: for instance, to identify pages whose meta-data carries an
expiry date that has already passed, or to locate pages that lack
specific meta-data altogether. The more generic the better.

It's intended for a large intranet consisting of a significant number of
sites hosted at separate locations, so a simple find and grep won't do.
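
To make the requirement concrete, here is a rough sketch of the per-page
check I have in mind. It is only an illustration, not an existing tool:
the names MetaCollector and check_page are made up, and it assumes the
expiry date sits in a meta tag named "expires" in YYYY-MM-DD form, which
may not match how the pages are actually tagged. Fetching the pages and
following links across the sites is left out.

```python
from datetime import date
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from one page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        # Accept either name= or http-equiv= as the key (an assumption).
        name = attrs.get("name") or attrs.get("http-equiv")
        if name and attrs.get("content") is not None:
            self.meta[name.lower()] = attrs["content"]


def check_page(html, required=("expires",), today=None):
    """Return a list of problems found in one page's meta-data."""
    today = today or date.today()
    collector = MetaCollector()
    collector.feed(html)

    # Report every required meta name that the page does not carry.
    problems = [f"missing meta: {name}" for name in required
                if name not in collector.meta]

    # Hypothetical convention: "expires" holds an ISO date (YYYY-MM-DD).
    expires = collector.meta.get("expires")
    if expires is not None:
        try:
            if date.fromisoformat(expires.strip()) < today:
                problems.append(f"expired on {expires.strip()}")
        except ValueError:
            problems.append(f"unparseable expiry date: {expires!r}")
    return problems
```

A crawler would call check_page on each page it fetches and log any
non-empty result together with the page's URL.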

Does anyone know of such a tool?

Regards,
Peter



