[thelist] MySQL as CMS

Mike Migurski mike at saturn5.com
Wed Nov 6 15:16:00 CST 2002


>>Taxing the database might not be as bad a problem as you think - we've
>>built a similar system, and all the database query functions (as well as a
>>few others) have built-in caching mechanisms that store the results, in
>>php's serialized form, in temporary files.
>
>I have never heard of this before, and have worked with php/MySQL for
>the past year and a half. What is involved in caching database queries
>"in php's serialized form, in temporary files" ? Just a very brief
>outline would be helpful :)

A lot of blog-type CMS systems work by generating gobs of regular HTML
files every time the content is edited. Whenever a change is made,
archives are moved, updated, and synchronized. This is pretty much the
same thing, but it can be turned off and on at will, and the cached
versions are generated at request time, rather than edit time.

PHP has two tremendously useful, symmetrical functions:
http://www.php.net/manual/en/function.serialize.php
http://www.php.net/manual/en/function.unserialize.php

The first can take data, arrays, strings, numbers, whatever, and return
them in a string representation. The second takes these string
representations, and returns the original data.

So, in a database accessor method, you might do something like the
following:

1. Read the text of the SQL query

2. Determine whether it has been requested previously, and whether a
   cached version exists.

   a. If No:

      i. query the database and get the results as you normally would
      ii. serialize those results, and save them into a temporary file
      iii. return them

   b. If Yes:

      i. read the cached results from the appropriate temporary file
      ii. unserialize and return them.

This works wonderfully for complex or otherwise taxing SQL queries, since
all the logic that generates the result is skipped if an identical request
has been made. It works even better when you are doing the caching at a
higher level, and caching the results of entire HTML chunks by using
output buffering. <http://www.php.net/manual/en/ref.outcontrol.php>

Some caveats:

You need a way to uniquely map queries to temporary filenames

You need to pay attention to possible "trick" queries, like date-dependent
requests whose results might be different at different times, even though
the text of the query is the same.

You need a way to determine what cached data is stale, and allow your CMS
to flush that data when the corresponding tables have been changed.

This method is most beneficial for sites with a high request/edit ratio,
since there is a bit of file i/o overhead if you are writing new data to
the cache on a lot of requests. For sites that get hit hard, our
performance tests have shown that the differences between sites with
caching turned on and those with caching turned off are dramatic.

---------------------------------------------------------------------
michal migurski- contact info and pgp key:
                 http://www.saturn5.com/mike/contact.html

                "Freedom! Horrible, horrible freedom!"








More information about the thelist mailing list