[thelist] Stats

Bob Meetin bobm at dottedi.biz
Thu Oct 1 12:04:59 CDT 2009

Hassan Schroeder wrote:
> On Thu, Oct 1, 2009 at 6:43 AM, Robert Lee <rob at rob-n-steph.net> wrote:
>> As for the raw logs, yes you would have to use an app to analyze them for
>> you.
> Nonsense. You can analyze log files other ways -- import them into a
> spreadsheet or database, or just do ad hoc slicing with shell utilities.
The standard Apache log files seem to be space delimited, and an entry 
looks something like this:

- - [31/Mar/2009:05:48:59 -0500] "GET /scripts/accordion/demo.php HTTP/1.1" 200 6384 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; sk; rv:1.9) Gecko/2008052906 Firefox/3.0"

You just have to figure out how to break up the fields so that the bits 
enclosed in double quotes (and the bracketed timestamp, which also 
contains a space) are not treated as multiple fields.  Then you can 
filter out requests for image files, CSS, JavaScript, and miscellaneous 
config files until you get to the meat of the matter.  You can also 
identify bots and discount them, and eliminate noise such as your own IP 
hitting your own website.  As Hassan alluded to, UNIX commands such as 
sort, sort -u, sed, and awk are your command-line friends.
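
If you would rather script it than chain shell commands, here is a rough 
sketch in Python of the field splitting and filtering described above. 
It assumes the log is in Apache's combined format with the client host 
as the first field. The names LOG_PATTERN, STATIC_SUFFIXES, BOT_HINTS, 
OWN_IPS, and the three functions are made up for illustration, and the 
bot hints and the 203.0.113.7 address are placeholders to replace with 
your own.

```python
import re
from collections import Counter

# Combined format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '           # timestamp contains a space
    r'"(?P<request>[^"]*)" '           # quoted request stays one field
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical filter settings; adjust to taste.
STATIC_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".ico", ".css", ".js")
BOT_HINTS = ("bot", "crawler", "spider")
OWN_IPS = {"203.0.113.7"}  # placeholder for your own address


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_interesting(entry):
    """Drop static assets, likely bots, and hits from your own IP."""
    parts = entry["request"].split()
    if len(parts) < 2:
        return False
    path = parts[1].split("?", 1)[0].lower()
    if path.endswith(STATIC_SUFFIXES):
        return False
    agent = entry["agent"].lower()
    if any(hint in agent for hint in BOT_HINTS):
        return False
    return entry["host"] not in OWN_IPS


def count_pages(lines):
    """Count hits per requested path for the lines that survive filtering."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and is_interesting(entry):
            counts[entry["request"].split()[1]] += 1
    return counts
```

You would feed it an open log file, e.g. count_pages(open("access.log")), 
where access.log stands in for wherever your host puts the raw log. A 
pattern this simple will not cope with escaped double quotes inside a 
user-agent string, so treat unmatched lines as something to inspect 
rather than silently ignore.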

Your hosting provider may not have raw logs enabled by default, since 
they take up ample disk space.  With some cPanel setups you just need to 
find the option to enable them.

Bob Meetin

Standards - you gotta love em with so many to choose from!
Rocket Science - the Art of Managing Distractions
