[thelist] Storing serialized/XML data into the database.

Shashank Tripathi shanx at shanx.com
Thu May 31 01:35:29 CDT 2001


Hi people,

The captioned topic is troubling me. I know that serialization of data
is/was a hot issue and I was recently presented by a suggestion from one of
my colleagues that we could use a similar approach for storing all our
content in the database.

To start clearly, the idea is that our database will have ONE table for
storing all kinds of content, with the following items:

    content_id
    serialized_value (could contain something similar to serialized value,
e.g. a whole XML file)
    modify_user
    modify_date

Any kind of content, regardless of whether it is product info or news or
some advertising brochureware material etc. can be stored into this ONE
table in the database.

It sounded like an elegant and simple idea at first, but soon the discomfort
I had became more real. For instance, since all your data is just stored
like that - as a lump - you basically need to do all processing, everything,
at run time in your application to figure out what to do with the data. To
present some of my problems:

1. How to sort data? There are no fields/columns to order by - just lumps of
text in your resultset. So the steps are - we need to go to the database,
get all records, unserialize all the SERIALIZED_VALUE columns, then do the
sorting at the app level (maybe in an array?), and then display the values
of the array! Doesnt sound like a reasonable way to do things if the
application is going to be used by even 1000 concurrent users. Yes, we could
do some kind of caching, but what if the data is fairly frequently updated?

2. How to maintain relationships? Almost all your data is a lump. How do
you, for instance relate a NEWS article to a product, which you can easily
do with a more traditional relational design (by creating a simple linkage
table)?

3. What about the performance hits of  massaging all the records, every
time?


I would like to hear experiences from people who have actually used
serialization or storage of a whole XML thingie in the database like the
above example.

Correct me if I am wrong, but it sounds like this approach might be great
for archiving and syndication of content because it breaks the dependence of
the content on the content_type, but it is NOT a good idea for storing all
the usual information that is needed every hour, every minute by your
application.

Would also love to have any references/links where I can find a similar
discussion and its pros/cons.

Thanks in advance,
Shanx


--
Ships are safe inside the harbour
but is that what ships are really for?

Shashank Tripathi
www.shanx.com







More information about the thelist mailing list