[thelist] Re: htaccess & robots.txt - blocking accesses to images from outside

Paola Kathuria paola at limitless.co.uk
Tue Jun 26 10:06:00 CDT 2001


Following up on my own question about blocking images with
.htaccess: looking at my site's logs, I realised that the
referrer URL Google sends still matches
^http://.*limitless.co.*/.*paola/.*$ and so is not blocked.

(Here's the full URL from Google accessing an image on my
site: 
http://images.google.com/imgres?imgurl=www.limitless.co.uk/~paola/wallpapers/1024x768/fruit-frac.jpg&imgrefurl=http://www.limitless.co.uk/~paola/wallpapers/show.lml%3Fpic%3Dfruit-frac&h=768&w=1024&prev=/images%3Fq%3Dfruit%2Bwallpapers%26num%3D50%26hl%3Den%26safe%3Doff)

I've therefore changed the http://.* to http://[^?]* so that
requests for images are also blocked when the referrer URL
contains a ? before our domain, i.e. when our domain only
appears in the query string of someone else's page.
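
To see the difference between the two patterns, here is a rough
Python sketch that runs both against the Google URL quoted above.
The "new" pattern is my reading of the change described (only the
http://.* prefix swapped for http://[^?]*), the OWN URL is the
imgrefurl value decoded by hand, and the names OLD, NEW and
looks_local are made up for the example. Python's re module is
close to, but not the same as, the regex engine Apache uses.

```python
import re

# Referrer pattern before and after the change described above.
OLD = re.compile(r"^http://.*limitless.co.*/.*paola/.*$")
NEW = re.compile(r"^http://[^?]*limitless.co.*/.*paola/.*$")

# Referrer sent when the image is viewed through Google's image search.
GOOGLE = (
    "http://images.google.com/imgres?imgurl=www.limitless.co.uk/~paola/"
    "wallpapers/1024x768/fruit-frac.jpg&imgrefurl=http://www.limitless.co.uk/"
    "~paola/wallpapers/show.lml%3Fpic%3Dfruit-frac&h=768&w=1024"
    "&prev=/images%3Fq%3Dfruit%2Bwallpapers%26num%3D50%26hl%3Den%26safe%3Doff"
)

# Referrer sent when the image is viewed from a page on the site itself.
OWN = "http://www.limitless.co.uk/~paola/wallpapers/show.lml?pic=fruit-frac"


def looks_local(pattern, referer):
    """Return True if the referrer would be treated as one of our own pages."""
    return pattern.match(referer) is not None


print("old pattern, Google referrer:", looks_local(OLD, GOOGLE))
print("new pattern, Google referrer:", looks_local(NEW, GOOGLE))
print("new pattern, own referrer:   ", looks_local(NEW, OWN))
```

The old pattern treats the Google referrer as local because .*
happily runs past the ? into the query string; with [^?]* the
domain has to appear before the first ?, so only the site's own
pages pass.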

I also came across a 'more from this site' link on Google's
image search and so found that one can display images from a
single site in one page.  For example, searching on evolt for
pages/images (?) including the word 'evolt' finds 63 images:
http://images.google.com/images?q=+site%3Aevolt.org+evolt

I think that the forbid rule for evolt.org would be:

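RewriteEngine On
# Let requests with no referrer through (typed URLs, some proxies)
RewriteCond %{HTTP_REFERER} !^$
# Let referrers through if evolt.org appears before any ? in the URL
RewriteCond %{HTTP_REFERER} !^http://[^?]*evolt\.org/ [NC]
# Any other referrer asking for a .jpg or .gif gets 403 Forbidden
RewriteRule \.(jpg|gif)$          -                     [F]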
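
As a sanity check on the logic, the three lines can be mimicked in
a few lines of Python. This is only a sketch of how I understand
the conditions to combine (both RewriteConds must hold before the
RewriteRule forbids the request); is_forbidden and the sample
paths and referrers are invented for the example, and the dot in
evolt.org is escaped here on the assumption that a literal dot is
what is meant.

```python
import re

# RewriteCond %{HTTP_REFERER} !^http://[^?]*evolt.org/.*$ [NC]
SITE = re.compile(r"^http://[^?]*evolt\.org/.*$", re.IGNORECASE)

# RewriteRule .*\.(jpg|gif)$ - [F]   (case-sensitive, as no [NC] is given)
IMAGE = re.compile(r".*\.(jpg|gif)$")


def is_forbidden(path, referer):
    """Return True if the request would get a 403 under the rule above."""
    if referer == "":
        # First condition (!^$) fails: empty referrers are let through.
        return False
    if SITE.search(referer):
        # Second condition fails: the referrer is one of our own pages.
        return False
    # Both conditions hold, so the rule applies to image requests only.
    return IMAGE.search(path) is not None


google = "http://images.google.com/images?q=+site%3Aevolt.org+evolt"
print(is_forbidden("images/logo.gif", ""))
print(is_forbidden("images/logo.gif", "http://www.evolt.org/index.html"))
print(is_forbidden("images/logo.gif", google))
print(is_forbidden("index.html", google))
```

Only the third call comes back True: an image requested with the
Google search page as referrer. Empty referrers, evolt.org pages
and non-image requests all pass.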


Paola



