[thelist] robots and the whitehouse.gov site

Chris Beaumont chris at ncafe.com
Mon Feb 2 23:49:06 CST 2004

It's possible that an entity might want to prevent cached content, say 
on Google or Archive.org's "Wayback Machine", from being compared to 
current content, because the differences can reveal changes that might 
be embarrassing.

For example, the White House has been systematically trying to get 
Internet outlets to remove specific pieces of information that show 
past White House policies and speeches on various things, notably Iraq.

For example, see this URL:


There are more examples, many of them, unfortunately.
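
As a rough sketch of the mechanism being discussed: a robots.txt file can 
single out crawlers by name, so a site could shut out an archiving robot 
entirely while only fencing off certain directories for everyone else. 
The rules below are made up for illustration and are NOT the actual 
whitehouse.gov file; the "/speeches/" path, the example.com URLs, and the 
can_fetch() helper are all hypothetical, and "ia_archiver" is assumed here 
to be the user-agent name the Wayback Machine's crawler answers to. The 
check uses Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; not any real site's file.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /speeches/
"""


def can_fetch(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ia_archiver", "http://example.com/index.html"),
        ("Googlebot", "http://example.com/speeches/iraq.html"),
        ("Googlebot", "http://example.com/index.html"),
    ]:
        print(agent, url, can_fetch(agent, url))
```

Under these made-up rules the archiving robot is refused everywhere, and 
every other robot is refused only under /speeches/. Note that robots.txt 
is purely advisory: it only works for robots that choose to honor it.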

On Monday, February 2, 2004, at 08:14  PM, David Bindel wrote:

> On Mon, 2004-02-02 at 01:02, Brian Cummiskey wrote:
>> My question is, why would anyone ever want to disallow their content
>> from being indexed?  I can understand the want to disallow
>> yourdomain.com/personalstuff or something like that, but why pretty
>> much all your content?
> My guess is that they want to severely limit the number of entry points
> to their website so that most visitors will enter at the homepage.
> I can't really think of a practical reason that would impel me to do
> that to one of my own sites, but then again, I'm not a big, famous
> entity like the White House.

