>----- Original Message -----
>From: "Marc Seyon" <seyon at delime.com>
>To: <thelist at lists.evolt.org>
>Sent: Tuesday, October 30, 2001 5:47 PM
>Subject: [thelist] site downloading tool
>
>
>G'day all,
>
>I know this has been discussed, but not sure what to search for
>in the archives. I'm looking for a tool to download an entire
>site from the net. Yes, I have only honourable intentions :-)

First, what are honorable intentions? That does not sound like much fun...

But yes, it's called writing some sockets code. May I suggest going to Yahoo and searching for the two words "beej socket". A guy named "beej" has a page on the web that tells you how to write sockets code. If you are a Windows coder, it is perhaps not all that different. By writing your own sockets code you will have complete control over your spidering efforts. Don't worry, it's not really that difficult. I do it, and I am only twelve years old. Good luck.

>//------------------------------------------------------------
>
>Hello all
>
>Is there a way to *prevent* users from downloading all files
>at once for unhonorable intentions?
>
>TIA, Mike
>
>
>

Again, what exactly are UNhonorable intentions? But then, that does stink like fun...

Why not serve up your pages via a CGI? Let's say that your home page is index.cgi, and that all of your pages get served up via index.cgi.

Each time index.cgi runs, it retrieves the incoming CGI data. In that data, look for a variable called hitctr. Maybe name it something else, so that visitors don't even know what it is, and maybe encode it. The first time index.cgi runs, hitctr will be empty. Treat that as a zero.

Create an encoding (or encryption) scheme that only you know. Let's say that a zero encodes to hitctr=@34sdjasdf. Increment that to a one, and now it encodes to hitctr=Wats;pas;df. If the crawler does not know what you are doing, it won't be any the wiser.
The best lock is a lock that nobody even sees. Can't break it if ya can't find it. Baboom.

Anyway, with each page call, increment hitctr, and whenever the user goes back to the home page, reset hitctr to zero. This will allow normal users to cruise your site without restriction. But anybody who displays, say, ten pages without going back to the home page gets redirected back there anyway.

Anyway, that's a possible solution. I really don't understand the specifics of your site and your needs, but maybe this will help to get you started.

rotsa ruck.

me.
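
P.S. Here is a rough Python sketch of the hit counter idea, in case it helps. It is only one way to do it, and the assumptions are mine: the names SECRET_KEY, PAGE_LIMIT, HOME_PAGE, encode_counter, decode_counter and handle_request are all made up for illustration, and instead of a home-grown encoding it signs the counter with an HMAC so that a crawler cannot forge a lower value. Wiring it into index.cgi (reading hitctr from the query string and writing the new value into every link) is left out.

```python
import base64
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical secret, known only to the server
PAGE_LIMIT = 10            # "say ten pages" before forcing a trip home
HOME_PAGE = "home"         # hypothetical name for the home page


def encode_counter(count):
    """Turn a hit count into an opaque, tamper-evident token."""
    payload = str(count).encode("ascii")
    sig = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()[:8]
    return base64.urlsafe_b64encode(payload + b"." + sig).decode("ascii")


def decode_counter(token):
    """Recover the hit count; an empty or forged token counts as zero."""
    if not token:
        return 0
    try:
        raw = base64.urlsafe_b64decode(token.encode("ascii"))
        # The payload is digits only, so the first "." is the separator.
        payload, sig = raw.split(b".", 1)
        expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()[:8]
        if not hmac.compare_digest(sig, expected):
            return 0
        return int(payload.decode("ascii"))
    except ValueError:
        # Bad base64, missing separator, or a non-numeric payload.
        return 0


def handle_request(page, token):
    """Return (page_to_serve, next_token) for one page call."""
    if page == HOME_PAGE:
        # Going back to the home page resets the counter.
        return HOME_PAGE, encode_counter(0)
    count = decode_counter(token) + 1
    if count > PAGE_LIMIT:
        # Too many pages in a row without visiting home: send them back.
        return HOME_PAGE, encode_counter(0)
    return page, encode_counter(count)
```

Every link on the served page would carry hitctr set to the returned token. A crawler that simply drops the parameter starts again at zero, so this only slows down tools that follow your links as written.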