[thelist] [SPAM] cron job for crawling web pages

VOLKAN ÖZÇELİK volkan.ozcelik at gmail.com
Thu Apr 6 02:41:54 CDT 2006


2006/4/6, Anthony Baratta <Anthony at baratta.com>:
>
> What are you going to process the pages for? If you use Swish-e, you can
> spider the pages and index them into an index file or mySQL.


I will be crawling three or four large, high-traffic sites.
The sites mainly post sector-related info, news, job ads, and other
valuable content, and they are updated at least hourly.

I cannot give much detail about the business model, though, because it is
confidential until the first beta of the site I'm working on launches.

> http://swish-e.org/


Thanks, I'll give that a try. But first I need to get my hands dirty with
Fedora.

Cheers,
--
Volkan Ozcelik
+>Yep! I'm blogging! : http://www.volkanozcelik.com/volkanozcelik/blog/
+> My projects/studies/trials/errors : http://www.sarmal.com/


