[thelist] extracting site "visits" from a table tracking page views (mysql)

Jewel Mlnarik jewel at mlnarik.com
Tue Aug 24 17:22:39 CDT 2004


Can you create a separate "visits" table?  In the past, I've created a cron
script that runs once a day that parses a page view table and updates a
separate "visits" table.

For example, if your page view table looks like:
UserID | Timestamp           | Country
-------+---------------------+------
  24   | 2004-08-24 01:00:00 | 1
  24   | 2004-08-24 01:05:00 | 1
  24   | 2004-08-24 06:30:00 | 1
  24   | 2004-08-24 06:35:00 | 1
  24   | 2004-08-24 06:40:00 | 1
  24   | 2004-08-24 10:35:00 | 1
  24   | 2004-08-24 12:50:00 | 1
  24   | 2004-08-24 12:55:00 | 1
  24   | 2004-08-24 18:00:00 | 1

Then your visit table for the same day could look like:
UserID | Date       | Country | Visits
-------+------------+---------+--------
  24   | 2004-08-24 |   1     |  3

The upside is that it's easy to pull quick stats w/out having to wait or
interfere while the site is writing to the page view table on every page
load.  (Becomes a problem on high traffic sites.)  I'd be interested to see
if there's a way to pull the stat you're looking for accurately in one
query.

~ jewel

>-----Original Message-----
>From: Sarah Sweeney [mailto:mr.sanders at designshift.com] 
>Sent: Tuesday, August 24, 2004 1:00 PM
>To: thelist
>Subject: [thelist] extracting site "visits" from a table 
>tracking page views (mysql)
>
>I'm totally stumped by this query I'm trying to write. I 
>currently have a table in my database that tracks the page 
>views by registered users of the site. This is used to compile 
>stats for the client such as frequency of site use by user 
>country, user organization, etc. I've been asked now to 
>compiled similar reports based on user "visits" rather than 
>individual page views. I'm not exactly sure how to classify a 
>"visit", but I'm thinking that I will say a new visit begins 
>when a user accesses a page at least 4 hours after the last 
>page view (this is debatable, but not the most important 
>aspect of my question).
>
>My question is, how do I select only those records from my 
>stats table that have a timestamp of at least 4 hours more 
>than the previous record for the client. I'm thinking 
>something along these lines...
>
>SELECT    COUNT(spcurr.UserID) AS UserCount, spcurr.Country
>FROM      Stats_Pages spcurr, Stats_Pages spprev
>WHERE     spcurr.UserID = spprev.UserID
>AND       MAX(spprev.Timestamp) < DATE_SUB(spcurr.Timestamp, 
>INTERVAL 4 
>HOUR)
>GROUP BY  sp.Country
>ORDER BY  UserCount DESC
>
>...but I know that the MAX() in the WHERE statement is not 
>quite right. 
>Any ideas on how this can be accomplished?
>
>TIA
>
>--
>Sarah Sweeney  ::  Web Developer & Programmer Portfolio  ::  
>http://sarah.designshift.com Blog  ::  http://hardedge.ca 
>Family  ::  http://geekjock.ca
>



More information about the thelist mailing list