[thelist] HTML Output Compression

J.J. SOLARI jjsolari at pobox.com
Sun Mar 6 10:03:28 CST 2005


Hello all,

I have a question about serving gzip-compressed HTML output. The
site mostly provides translations of some W3C Recommendations (see
<http://www.yoyodesign.org/doc/w3c/> for an idea).

These are generally heavyweight documents, some nearing 1 MB in
size, and so are good candidates for compression.

Unfortunately, my current provider offers only basic hosting
services (i.e. almost no, or very limited, customization via an
htaccess file), which is a good reason to look elsewhere for better
hosting. The one I am considering now allows (tentatively) a great
deal of customization, in particular with respect to Apache htaccess
directives.

Compression is only available through PHP (surprisingly, the
mod_gzip module is not offered).

Borrowing from the Web, here is what I came up with. Let the site
structure be:

/base_dir
/base_dir/translation1/
/base_dir/translation2/
etc.

An htaccess file (/base_dir/.htaccess) contains:

Options +Indexes
AddHandler compress .html
Action compress /base_dir/gzip.php


The associated PHP script (/base_dir/gzip.php) is:

<?php
ob_start( 'ob_gzhandler' );
$file = basename( $_SERVER['REQUEST_URI'] );
readfile( $file );
?>

A complete URL must be provided, for example
<http://www.example.org/base_dir/index.html>, or
<http://www.example.org/base_dir/translation1/some_file.html>;
otherwise, with a directory URL such as
<http://www.example.org/base_dir/>, a warning like this one is
displayed:

readfile(test): failed to open stream: No such file or directory in
/realpath_to/base_dir/gzip.php on line 4


Question 1: what can be done in htaccess to account for these latter
URLs?
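
Failing an htaccess-only answer, one possible workaround on the PHP
side might be to let gzip.php treat a URL ending in a slash as a
request for that directory's index page. What follows is only an
untested sketch: the resolve_target() name and the index.html default
are assumptions of mine, and it resolves files against the document
root instead of relying on basename().

```php
<?php
// Hypothetical helper (not part of the setup above): maps a request URI
// to a file below $docRoot. A URI ending in "/" is taken to mean the
// directory's index page, whose name ($indexName) is an assumption.
// Returns the absolute file path, or null when nothing suitable exists.
function resolve_target($requestUri, $docRoot, $indexName = 'index.html')
{
    // Keep only the path part of the URI (drop any query string).
    list($path) = explode('?', $requestUri, 2);
    $path = '/' . ltrim(rawurldecode($path), '/');

    // A trailing slash means a directory was requested.
    if (substr($path, -1) === '/') {
        $path .= $indexName;
    }

    // realpath() returns false when the file does not exist.
    $root = realpath($docRoot);
    $full = realpath($docRoot . $path);
    if ($root === false || $full === false) {
        return null;
    }

    // Refuse anything that resolves outside the document root,
    // and anything that is not a regular file.
    if (strpos($full, $root . DIRECTORY_SEPARATOR) !== 0 || !is_file($full)) {
        return null;
    }
    return $full;
}
```

gzip.php would then pass $_SERVER['REQUEST_URI'] and
$_SERVER['DOCUMENT_ROOT'] to this function and hand the result to
readfile(), answering with a 404 when it returns null. The realpath()
comparison is there so that a URL containing ../ cannot reach files
outside the document root.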

Getting greedy now :-)
I would like to keep this compression scheme transparent, that is, with no
need to incorporate any code in the HTML pages for compression to occur, and
also to optimize performance somehow. According to the php.net site, the
compression level defaults to 6 on a scale from 0 to 9, and according to user
comments the optimal value lies between 1 and 6, considering the
bandwidth/CPU usage trade-off.

Question 2: any suggestions on how to implement this?
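
To get a feel for that trade-off on my own documents, something like
the following could be run offline before settling on a level. It is a
rough sketch only: compare_gzip_levels() is a made-up name, the sample
text is a stand-in for a real translation, and it calls gzencode()
directly instead of going through ob_gzhandler, so the figures would be
indicative at best.

```php
<?php
// Hypothetical helper: compresses $data once per requested level and
// reports the compressed size, the size ratio and the time taken.
function compare_gzip_levels($data, $levels)
{
    $report = array();
    foreach ($levels as $level) {
        $start = microtime(true);
        $packed = gzencode($data, $level);
        $elapsed = microtime(true) - $start;

        $report[$level] = array(
            'bytes'   => strlen($packed),
            'ratio'   => strlen($packed) / strlen($data),
            'seconds' => $elapsed,
        );
    }
    return $report;
}

// Stand-in for a large document of roughly 1 MB. Repeated text like this
// compresses far better than real prose, so a real file should be used
// for any serious comparison.
$sample = str_repeat("<p>Some <em>markup</em> heavy paragraph of a W3C translation.</p>\n", 15000);

foreach (compare_gzip_levels($sample, array(1, 3, 6, 9)) as $level => $row) {
    printf(
        "level %d: %8d bytes (%.2f%% of original), %.4f s\n",
        $level,
        $row['bytes'],
        100 * $row['ratio'],
        $row['seconds']
    );
}
```

The php.net manual also documents a zlib.output_compression_level
setting; whether ob_gzhandler honours it on a given PHP build is
something I would still have to check.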


Thanks,

JJS.

