[thelist] Logging stats - excluding robots?

Ian Anderson ian at zstudio.co.uk
Mon Oct 31 03:08:18 CST 2005


Hello everyone,

I am writing a simple stats logging feature for a Classic ASP web 
application, and I would like to exclude robots from the results so as 
to produce a more accurate figure for visits by humans.

Sean Inman uses JavaScript to do this, making what I think is a 
reasonable assumption that the number of humans without JavaScript 
enabled is likely to be rather small.

However, I was pondering ways to validate this by other means than 
analysing UA strings, and am considering counting IIS sessions.

1. I was wondering if cookies would perform a similar function in 
filtering out robots - i.e. if the user agent is able to store cookies, 
are we able to assume that it is a human rather than a robot?

2. Does anyone know if IIS records a session on the first HTTP request 
made, if cookies are not enabled?

3. Following on from 1 and 2: Since I assume most robots (e.g. 
Googlebot) do not support cookies, and hence cannot maintain an ASP 
session through a site, does this mean that a count of sessions started 
is going to be massively inflated?

If a given robot does not support cookies and IIS starts a new session 
for every GET request, the figure returned for "Sessions" could be 
vastly inaccurate if, for example, half the visits are by robots and 
each file requested started a new session.
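
To put rough numbers on that worry, here is a toy model in Python (not ASP, and not a description of how IIS actually behaves). It assumes a server that starts a new session for any request arriving without a known session cookie; ToySessionServer and simulate are made-up names for illustration only.

```python
import itertools


class ToySessionServer:
    """Toy model: starts a new session for any request without a known session ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.sessions = set()

    def handle(self, session_cookie=None):
        """Return the session ID for this request, creating one if needed."""
        if session_cookie in self.sessions:
            return session_cookie
        new_id = next(self._ids)
        self.sessions.add(new_id)
        return new_id


def simulate(pages, keeps_cookies):
    """Request `pages` pages and return how many sessions the server started."""
    server = ToySessionServer()
    cookie = None
    for _ in range(pages):
        issued = server.handle(cookie)
        if keeps_cookies:
            # A cookie-aware client sends the issued ID back next time.
            cookie = issued
    return len(server.sessions)


browser_sessions = simulate(pages=10, keeps_cookies=True)
robot_sessions = simulate(pages=10, keeps_cookies=False)
print(browser_sessions, robot_sessions)
```

Under that assumption, a client that returns cookies accounts for one session over ten pages, while a client that drops them accounts for ten, which is the kind of inflation described above.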

I haven't worked out how to test this yet - any comments or suggestions 
are most welcome.
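
One possible test, sketched in Python: request a couple of pages with a client that discards cookies, capture the Set-Cookie headers from the responses, and see whether each response carries a different session cookie value. The sketch below only does the comparison step. It relies on Classic ASP session cookies having names that begin with ASPSESSIONID; the helper names and the sample header values are invented for illustration, not captured from a real server.

```python
from http.cookies import SimpleCookie

SESSION_PREFIX = "ASPSESSIONID"  # Classic ASP session cookie names start with this


def session_values(set_cookie_headers):
    """Collect session cookie values from a list of Set-Cookie header strings."""
    values = []
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)
        for name, morsel in jar.items():
            if name.upper().startswith(SESSION_PREFIX):
                values.append(morsel.value)
    return values


def new_session_per_request(set_cookie_headers):
    """True if every cookieless request was handed a different session cookie."""
    values = session_values(set_cookie_headers)
    return (
        bool(values)
        and len(values) == len(set_cookie_headers)
        and len(set(values)) == len(values)
    )


# Made-up header values standing in for two responses to cookieless requests.
captured = [
    "ASPSESSIONIDQQGGQQAB=AAAABBBBCCCCDDDD; path=/",
    "ASPSESSIONIDQQGGQQAB=EEEEFFFFGGGGHHHH; path=/",
]
print(new_session_per_request(captured))
```

If the values differ on every cookieless request, that would suggest each request is being given its own session; if no session cookie appears at all, the check returns False and the question stays open.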

Cheers

Ian


-- 
_________________________________________________
zStudio - Web development and accessibility
http://zStudio.co.uk

Snippetz.net BETA - Online code library
File, manage and re-use your code snippets online
http://snippetz.net



