Does anyone have a link to the exclusion syntax for robots.txt?
http://www.robotstxt.org is great, but it doesn't actually give much
syntax or anything.

Like, what does this mean? Does this work for all images, or is this a
folder solution?

User-agent: * # directed to all spiders, not just Scooter
Disallow: /source
Disallow:/photo
Disallow:/images
Disallow:/backontrack
Disallow: /cgi-bin

> From: Andy Warwick <mailing.lists at creed.co.uk>
> Reply-To: thelist at lists.evolt.org
> Date: Tue, 27 Nov 2001 04:14:35 -0500
> To: thelist at lists.evolt.org
> Subject: Re: [thelist] Stopping robots.txt being read
>
> On 2001-11-26 at 22:40, cache at dowebs.com (Keith) wrote:
>
>>> Just been reading the rather scary article on CNET, about how Google
>>> can be used to find passwords etc.
>>>
>>> http://news.cnet.com/news/0-1005-200-7946411.html?tag=tp_pr
>>
>> It must have been a slow news day for CNET to have published that
>> - any search engine can be used that way, Google just makes it
>> easy for someone to find sensitive information by mistake rather
>> than by trying.
>
> I agree. Slow news day. The fact that you could search for non-text
> files was news though.
>
>>> How does one go about stopping a robots.txt file being read in a
>>> browser. Given the file has to be accessible to a search engine,
>>> how do you protect it so that a human can't simply type in the
>>> robots.txt URL manually, read the file, and make some educated
>>> guesses about where stuff is on the server.
>>
>> Why bother? Anyone who wants a list of all the public files in a
>> domain can do that pretty easily, robots.txt or not.
>>
>> Your concern here is based on a flawed concept of security, hide it
>> and they can't find it.
>
> Security by obscurity is no security at all. I know all this. My
> concern is not because I'm hiding stuff like that, it's just one more
> thing to lock down if possible to deter the casual snooper. Kinda like
> having an empty alarm box on the wall of your house. A potential crook
> will go looking for easier pickings, even if the real security value
> of the box is worth squat.
>
>> That doesn't work, never has, never will. If you
>> want to secure a file, secure it, don't hide it. There are 3 basic
>> methods for securing a file, hiding it is not one of them.
>
> Care to elaborate on the 3 ways...
>
>>> wouldn't dream of putting sensitive files in a public area,
>>
>> That's a good step, but placing a file outside of the domain path or
>> placing the file behind Basic Authentication is not necessarily
>> secure, unless you're the only user on the machine.
>
> If this stuff was really secure, I'd be putting it on its own box,
> hosted on site, firewalled from the main server, also hosted on site.
> As I said, it's not. But every locked door you can put between someone
> and what they shouldn't be looking at, even if there is no real harm
> in looking, helps. Hiding the robots.txt file is an interesting
> exercise, and one more thing to tick off the list of potential
> backdoors for malicious 'script kiddies'.
>
>> If you want the
>> file unavailable to anyone but the owner of the file, simply do not
>> give it world or group permissions.
>
> I agree. And put it on an encrypted removable disk on a chain around
> their neck.
>
> It's all a matter of degree and tradeoffs, but anything that will
> help - like hiding robots.txt if possible, is worth doing IMHO.
>
> Andy W
>
> ---------------------------------------
> For unsubscribe and other options, including
> the Tip Harvester and archive of TheList go to:
> http://lists.evolt.org Workers of the Web, evolt !
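
For what it's worth, one way to see what the Disallow lines in the question at the top actually do is to feed them to a parser and ask about a few paths. Below is a minimal sketch in Python using the standard library's urllib.robotparser. The bot name "ExampleBot" and the test paths are made up for illustration; only the rules themselves come from the post. Under the original robots exclusion convention each Disallow value is a path prefix, so /images blocks that folder (and anything else whose path starts with /images), not image files in general.

```python
from urllib.robotparser import RobotFileParser

# The rules exactly as posted in the question.
RULES = """\
User-agent: * # directed to all spiders, not just Scooter
Disallow: /source
Disallow:/photo
Disallow:/images
Disallow:/backontrack
Disallow: /cgi-bin
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under RULES.

    "ExampleBot" is a hypothetical user-agent name; the "*" record
    applies to any agent that has no record of its own.
    """
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths, chosen to show the prefix matching.
    for path in ("/images/logo.gif",    # inside the /images folder
                 "/images2/logo.gif",   # different folder, same prefix
                 "/pictures/logo.gif",  # an image stored elsewhere
                 "/cgi-bin/form.pl",
                 "/index.html"):
        print(path, "allowed" if allowed(path) else "blocked")
```

So it is a folder (really, prefix) solution: an image stored outside the listed paths is still fair game for a well-behaved spider, and none of this stops a spider that chooses to ignore robots.txt.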