[thelist] XML vs HTML for Content Management

martin.p.burns at uk.pwcglobal.com
Thu Nov 30 05:33:24 CST 2000


Memo from Martin P Burns of PricewaterhouseCoopers

-------------------- Start of message text --------------------

Norman

Apologies for the delay in answering, but here's my take on your
situation.

In many ways, a CMS which writes out to static files is a viable solution.
You certainly get performance benefits, as an application server doesn't
have to parse each request, open a db connection, build a page, serve
it and then tear down the db connection at the end.

In fact, the home page of the evolt site is built statically (we've used both
a cron job and CFSCHEDULE at various times) from a scheduled db
call, rather than on each user request. That's why the very newest
articles sometimes don't show up (moving from a scheduled build to a
build every time the home page changes is on the to-do list).
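
As a rough illustration of the scheduled-build idea (in Python rather than
the ColdFusion mentioned above), the sketch below queries the newest
articles once and writes a static page. The `articles` table, its columns
and the function name are assumptions made up for the example, not the
evolt schema.

```python
import html
import os
import sqlite3


def build_home_page(conn, out_path, limit=10):
    """Query the newest articles once and write them out as a static page."""
    # Hypothetical schema: articles(title, url, published).
    rows = conn.execute(
        "SELECT title, url FROM articles ORDER BY published DESC LIMIT ?",
        (limit,),
    ).fetchall()
    items = "\n".join(
        '<li><a href="{}">{}</a></li>'.format(html.escape(url), html.escape(title))
        for title, url in rows
    )
    page = "<html><body>\n<ul>\n{}\n</ul>\n</body></html>\n".format(items)
    # Write beside the target, then rename over it, so a request arriving
    # mid-build still gets the previous complete page.
    tmp_path = out_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(page)
    os.replace(tmp_path, out_path)
    return len(rows)
```

A cron entry could call this every few minutes; calling it from the
article-save code instead would give the build-on-change behaviour.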

Many of the large, robust CMS tools will allow just this methodology -
Interwoven Teamsite in fact shouts about it from the rooftops as one option
(others are HTML with SSI, and interfacing to a range of app servers).

What Teamsite does is use XML throughout the workflow, and only
render to HTML right at the end of the process - effectively using
an XML content store.
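
To make "XML store, HTML only at the end" concrete, here is a minimal
sketch. The element names (`article`, `title`, `body`, `para`) and the
function are invented for illustration; this is not Teamsite's format or
API, just the general shape of the final rendering step.

```python
import html
import xml.etree.ElementTree as ET


def render_article(xml_text):
    """Turn a stored XML article into an HTML fragment."""
    root = ET.fromstring(xml_text)
    # Hypothetical structure: <article><title/><body><para/>...</body></article>
    title = root.findtext("title", default="")
    paras = [p.text or "" for p in root.findall("body/para")]
    parts = ["<h1>{}</h1>".format(html.escape(title))]
    parts.extend("<p>{}</p>".format(html.escape(p)) for p in paras)
    return "\n".join(parts)
```

The content stays presentation-neutral until this step, so a second
renderer (WML, say) can read the same stored XML.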

On the other hand, there are many applications where you might *want*
your site to build each page on demand, particularly where there is
interaction between a user session and the site (say where you have
personalisation happening, or on the fly UA tuning).

If this is the case, then yes, there are performance implications, and
you'll need to perform load/volume tests to ensure that your site
can handle the predicted traffic (which will include some headroom
for growth). For some sites, this may be as high as 1500 simultaneous
users, and the kit required to handle that is fearsome. If you remember,
http://www.if.com/ postponed their web launch 5 months ago because
they didn't sign off on the volume test (they're rumoured to be finally
launching this week).

Finally, the issue to be aware of if you have a file-based CMS is
one of file-locking. A sensible RDBMS will handle your concurrency
issues for you, but life isn't so simple if you're really writing to real
files.
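
One common way round this, sketched here under assumptions, is a lock
file created with an exclusive-create flag, plus a write-then-rename so
readers never see a half-written page. The names `file_lock` and
`save_page` are hypothetical, and a writer that crashes would leave a
stale `.lock` file behind, which a real system would have to clean up.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def file_lock(path, timeout=5.0, poll=0.05):
    """Hold an exclusive lock on `path` by creating `path + '.lock'`."""
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if another writer holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError("could not lock " + path)
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def save_page(path, text):
    """Replace the page at `path` while holding the lock."""
    with file_lock(path):
        tmp_path = path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            f.write(text)
        # Rename over the old file so readers see old or new, never a mix.
        os.replace(tmp_path, path)
```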

There's more on this in Phil Greenspun's book.

Cheers
Martin



Please respond to thelist at lists.evolt.org
To:   thelist at lists.evolt.org
cc:


Subject:  [thelist] XML vs HTML for Content Management



Hi all

I've got a question which I've been pondering a lot recently.  I've
built a site featuring a lightweight CMS.  This allows non-technical authors
to add/edit/delete features and news on the site without having to know
anything about HTML coding.  The system adds each new article to a database,
and at the same time writes a page of HTML to disk, so that any browser
hitting the site reads a static page.  If any editing is carried out on the
page it simply overwrites the existing HTML page.  The system also
automatically updates things like the front page and various archive pages.

Now I've been reading a lot about XML, how it's the future of website
management/production etc, and I've been wondering how useful it would be
for this (and similar sites).  I know most of the big name CMS are built
around XML, but I just can't see the advantage.

The main advantage cited for it is that it allows you to serve up different
flavours of a page according to the browser hitting the site.  For instance
a WAP phone gets a heavily cut down version, whilst an IE user gets the full
treatment.  However, this requires server-side processing: first of all
sniffing the browser, then processing the XSL/XML to generate the
HTML/WML/whatever.  And this has to occur every time that a browser hits the
page.

Using my system I can carry out this server side processing once, when the
article is generated, and instead of just spitting out one version of the
article I can produce a number of variants.  Then I browser sniff once, when
the user hits the site, and they get redirected into an appropriate
directory, containing static pages tailored to their browser type.
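
A minimal sketch of that publish-time approach, under stated assumptions:
the two templates, the directory-per-variant layout and the function names
are invented for the example, and the WAP check is a crude heuristic (the
`text/vnd.wap.wml` media type in the Accept header, or "WAP" in the user
agent string).

```python
import os

# Hypothetical variants; each gets its own directory of static pages.
# Article fields are assumed to be already escaped for the target markup.
TEMPLATES = {
    "html": "<html><body><h1>{title}</h1><p>{body}</p></body></html>",
    "wml": '<wml><card title="{title}"><p>{summary}</p></card></wml>',
}


def publish(article, root):
    """Write one static file per variant when the article is saved."""
    for variant, template in TEMPLATES.items():
        folder = os.path.join(root, variant)
        os.makedirs(folder, exist_ok=True)
        name = article["slug"] + "." + variant
        with open(os.path.join(folder, name), "w", encoding="utf-8") as f:
            f.write(template.format(**article))


def variant_for(user_agent, accept=""):
    """Pick the directory for a client; done once, at the site entry point."""
    if "text/vnd.wap.wml" in accept or "WAP" in user_agent.upper():
        return "wml"
    return "html"
```

The rendering cost is paid once per edit; each request costs only the
initial redirect into the chosen directory.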

Am I somehow really flawed in my thinking?  I've got a data repository to
enable me to change the site appearance quickly (I just alter a template and
then reprocess the whole db) and browsers are served static pages
tailored to their client with the minimum of server side processing (and
I've always thought that the golden rule of web production was to get
content to the client as quickly as possible).  How could using XML, with
its additional server-side load, possibly improve my situation?


--------------------- End of message text --------------------






