[thelist] cron job for crawling web pages

Hassan Schroeder hassan at webtuitive.com
Thu Apr 6 03:41:08 CDT 2006


> The current project I'm working on requires connecting to certain provider
> sites and storing the raw html from those sites for later processing.

> Given that we will be using some *nix distribution ...

wget should handle that, and it is probably already packaged with your
distro: <http://www.gnu.org/software/wget/>
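
A minimal sketch of how wget could be driven from cron, in case it helps.
The script name, the URL list file, the output directory and the retry and
timeout values are all assumptions for illustration, not anything from
your setup. It fetches every URL listed in a file and keeps the raw HTML
in a fresh directory per run:

```shell
#!/bin/sh
# Hypothetical fetch-pages.sh: all names and paths here are placeholders.

# Build the per-run output directory name from a base dir and a timestamp.
run_dir() {
    printf '%s/%s\n' "$1" "$2"
}

# Fetch every URL listed in file $1 (one per line) into directory $2.
fetch_all() {
    url_file=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    # -i: read URLs from a file      -P: save files under this directory
    # -t: retries per URL            -T: network timeout in seconds
    # -nv: terse messages            -o: write wget's log to a file
    wget -nv -t 3 -T 30 -i "$url_file" -P "$out_dir" -o "$out_dir/wget.log"
}

# Run only when called with both arguments: <url-list-file> <base-dir>
if [ "$#" -ge 2 ]; then
    fetch_all "$1" "$(run_dir "$2" "$(date +%Y%m%d-%H%M)")"
fi
```

A crontab entry along these lines (again, hypothetical paths) would run it
every night at 03:15:

    15 3 * * * /usr/local/bin/fetch-pages.sh /etc/crawl/urls.txt /var/spool/crawl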

> And finally, I will be running VMware on my local machine (cuz I do not want
> to mess my win2k os since I'm developing .net projects there). Have you used
> it to deploy a *nix distribution? Did it cause any problems?

I use VMware the other way around, to run instances of XP and W2K on my
SuSE machine, so that may not be exactly analogous, but it works great :-)

