[thelist] broken link checker software with restrict filters
Steve Axthelm
steveax at pobox.com
Thu Feb 19 12:23:40 CST 2009
On 2009-02-18 Stuart Young wrote:
>What do you use for checking for broken links ... except for Xenu, thanks.
>
>Xenu is great but it doesn't have filters to restrict which URLs to download
>... it does have a "Do not check any URLs beginning with this:" option, but
>the URLs I want to remove do not have a common start. I want to prevent the
>checking URLs with query parameters, e.g. to remove printable versions and
>email to a friend versions and so on. Specifically I want to run it on a
>mediawiki site which has literally hundreds of thousands of duplicate URLs
>containing &action=history, &action=edit, &oldid=, &diff= etc etc.
You didn't specify platform, but since you mention Xenu I
presume you're looking for Windows software? Thought I'd mention
Integrity[1] just in case OS X was a possibility. It has an
exclude field that takes a list of strings to exclude.
Also, the W3C LinkChecker Perl module[2] has --exclude and
--exclude-docs options which take regular expressions.
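
For the MediaWiki case, a regular expression along the following lines
might be a starting point. This is only a rough sketch in Python for
trying the pattern against sample URLs before handing it to a checker.
The example.org URLs and the should_check name are made up for
illustration, and whether a given checker matches the pattern against
the full URL in the same way is an assumption to verify in its docs.

```python
import re

# Hypothetical exclude pattern covering the query parameters named in the
# question: action=history, action=edit, oldid= and diff=.
# [?&] ensures we match a parameter name, not a substring of a page title.
EXCLUDE = re.compile(r"[?&](?:action=(?:history|edit)|oldid=|diff=)")


def should_check(url: str) -> bool:
    """Return True if the URL does NOT match the exclude pattern."""
    return EXCLUDE.search(url) is None


# Made-up sample URLs in the usual MediaWiki shapes.
urls = [
    "http://wiki.example.org/index.php?title=Main_Page",
    "http://wiki.example.org/index.php?title=Main_Page&action=history",
    "http://wiki.example.org/index.php?title=Main_Page&action=edit",
    "http://wiki.example.org/index.php?title=Main_Page&oldid=123",
    "http://wiki.example.org/index.php?title=Main_Page&diff=456",
    "http://wiki.example.org/wiki/Help:Contents",
]

# Keep only the URLs worth checking.
to_check = [u for u in urls if should_check(u)]

for u in to_check:
    print(u)
```

The pattern itself uses only syntax that is also valid in Perl regular
expressions, so it should carry over to a tool that accepts those, though
shell quoting of the ? and & characters will need care.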
HTH,
-S
[1] http://tinyurl.com/3pmk2t
[2] http://search.cpan.org/dist/W3C-LinkChecker/bin/checklink.pod
-Steve
--
Steve Axthelm
steveax at pobox.com