[thelist] Housecleaning tool.....

Sam-I-Am sam at sam-i-am.com
Tue May 8 10:24:07 CDT 2001

> Is there any type of tool that can tell me specifically which images and
> pages are actually being used?

This is something of a holy grail for me, but there are tools and
techniques that can help.
Tools like Linkbot (http://www.tetranetsoftware.com) will crawl your
site and, if you show them the root directory (over FTP or locally),
report on orphan files (the name for all the files that aren't being
used).
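To make the idea of an orphan report concrete, here is a minimal sketch
in Python. It is not how Linkbot works internally (I don't know that);
it just walks a local copy of a site, follows static href/src/background
attributes starting from an entry page, and lists every file it never
reached. The names find_orphans and LinkCollector, and the assumption
that the entry page is index.html, are made up for illustration.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urldefrag, urlparse


class LinkCollector(HTMLParser):
    """Collect href/src/background attribute values from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "background") and value:
                self.links.append(value)


def find_orphans(root, entry="index.html"):
    """Return files under root that static links from entry never reach."""
    root = os.path.abspath(root)
    all_files = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            all_files.add(os.path.join(dirpath, name))

    seen = set()
    queue = [os.path.join(root, entry)]
    while queue:
        path = queue.pop()
        if path in seen or path not in all_files:
            continue  # already visited, or a broken link
        seen.add(path)
        if not path.lower().endswith((".html", ".htm")):
            continue  # images etc. are "used" but contain no links
        parser = LinkCollector()
        with open(path, encoding="utf-8", errors="replace") as fh:
            parser.feed(fh.read())
        for link in parser.links:
            link, _fragment = urldefrag(link)
            parsed = urlparse(link)
            if parsed.scheme or parsed.netloc or not parsed.path:
                continue  # external link, mailto:, or fragment only
            if parsed.path.startswith("/"):
                target = os.path.join(root, parsed.path.lstrip("/"))
            else:
                target = os.path.join(os.path.dirname(path), parsed.path)
            queue.append(os.path.normpath(target))
    return sorted(os.path.relpath(p, root) for p in all_files - seen)
```

Calling find_orphans("/path/to/site") returns the relative paths of the
unreached files. It only sees links written directly in the HTML, so
anything referenced from script or CSS will be wrongly reported as an
orphan.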

But before you go and delete them, can you be sure the tool has done its
job right?

It all gets a lot easier if you have static content and pages.
If you have dependencies being built on the fly, like with JavaScript,
then I know of no tool that will help.
Other gotchas are things like CSS imports, background images, etc.
I usually end up going back through manually to double-check.
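For the CSS gotchas, part of the manual double-check can be scripted. A
rough sketch, assuming a regular-expression scan is good enough for the
stylesheets in question (it is not a real CSS parser, and the function
name css_references is my own):

```python
import re

# url(...) with optional quotes, as used by background-image and @import url(...)
URL_RE = re.compile(r"""url\(\s*['"]?([^'")\s]+)['"]?\s*\)""", re.IGNORECASE)
# @import "file.css" written without url(...)
IMPORT_RE = re.compile(r"""@import\s+['"]([^'"]+)['"]""", re.IGNORECASE)


def css_references(css_text):
    """Return the sorted, de-duplicated file references found in a stylesheet."""
    # Drop /* ... */ comments first so commented-out rules are not counted.
    text = re.sub(r"/\*.*?\*/", "", css_text, flags=re.DOTALL)
    return sorted(set(URL_RE.findall(text)) | set(IMPORT_RE.findall(text)))
```

The references come back exactly as written in the stylesheet, so they
still need to be resolved relative to the CSS file's own location before
comparing them with the files on disk.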

(I want to build this tool, though. It seems like you could have a proxy
server that logged each request and built a tree (a directory tree or
otherwise) of all the files requested: either the actual files, or just
the filenames. Then you could point a spider at it, and also point human
testers at it, to mouseover every mouseover and generally interact with
the site. Doing a directory compare between that tree and the real site
would show you the true orphans.
It would also be a good offline browser / site snagger.)
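The directory-compare step of that idea might look something like the
following sketch. The proxy itself does not exist, so this assumes it
writes one line per request in Common Log Format (an assumption of
mine), and that a request for a directory means its index.html. The
names requested_paths and true_orphans are hypothetical.

```python
import os
import re
from urllib.parse import unquote, urlsplit

# Matches the quoted request line and the status code that follows it,
# e.g.  "GET /img/logo.gif HTTP/1.0" 200
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+)[^"]*" (\d{3})')


def requested_paths(log_lines, index="index.html"):
    """Return the set of site-relative file paths that were actually served."""
    paths = set()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        url, status = match.groups()
        # Count successful responses and "304 Not Modified" as real use.
        if not (status.startswith("2") or status == "304"):
            continue
        # urlsplit copes with both "/a/b.gif" and "http://host/a/b.gif".
        path = unquote(urlsplit(url).path).lstrip("/")
        if path == "" or path.endswith("/"):
            path += index
        paths.add(os.path.normpath(path))
    return paths


def true_orphans(root, log_lines):
    """Return files under root that never appear in the request log."""
    on_disk = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            on_disk.add(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(on_disk - requested_paths(log_lines))
```

Because the log records what browsers really fetched, images pulled in
by script or by CSS count as used, which is the whole point of going
through a proxy. The result is only as complete as the spidering and
human testing that produced the log.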

(IE has some but not all of this functionality. It doesn't preserve the
original directory structure, though.)

