[thelist] Local Storage of RSS

Seth Fitzsimmons seth at note.amherst.edu
Wed Mar 19 16:06:23 CST 2003


>> How would you store large numbers (500,000+) of RSS documents for
>> periodic local parsing or serving?
>>
>> Option 1: Keep them as individual files, preferably in an organized
>>           directory tree.
>> Option 2: As BLOBs in a relational database (despite the content of
>>           the RSS files not being relational).
>> Option 3: In an XML database, e.g. Xindice.
>
> Wouldn't Option 4 be to store in an RDBMS as individual fields?
> like: channelID, itemId, title, link, description

That's essentially the method being used at the moment.

The driving reason for keeping them static is to allow them to be 
"syndicated" (quoted because it's not traditional RSS syndication) 
without large numbers of database queries.
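
For the static files themselves, here is a minimal sketch of how an
"organized directory tree" might be laid out so that no single directory
ends up holding hundreds of thousands of entries. The function name
feed_path, the root directory, and the choice of a SHA-1 hash with two
directory levels are all assumptions made for illustration, not something
the current setup is known to use.

```python
import hashlib
from pathlib import Path


def feed_path(root: Path, channel_id: str) -> Path:
    """Map a channel identifier to a file path in a two-level hashed tree.

    Hypothetical layout: <root>/<first 2 hex chars>/<next 2 hex chars>/<hash>.xml
    """
    # Hash the identifier so files spread evenly regardless of how IDs are assigned.
    digest = hashlib.sha1(channel_id.encode("utf-8")).hexdigest()
    # Two levels of 256 directories each give 65,536 leaf directories.
    return root / digest[:2] / digest[2:4] / f"{digest}.xml"


if __name__ == "__main__":
    print(feed_path(Path("/var/rss"), "42"))
```

With 65,536 leaf directories, 500,000 files work out to roughly eight per
directory on average, and a web server can hand the files out directly
without touching the database.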

My idea is that generating the RSS files when the content is initially 
created (as well as creating RDBMS entries) will save time in the long run.
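
A rough sketch of that write-once idea, assuming SQLite as a stand-in for
the RDBMS and the field names suggested in the quoted Option 4. The
function name publish_item, the 15-item limit, the placeholder channel
title, and the RSS 2.0 version attribute are assumptions for illustration
only.

```python
import sqlite3
import xml.etree.ElementTree as ET
from pathlib import Path


def publish_item(db, out_dir, channel_id, title, link, description):
    """Store one item in the database, then regenerate the channel's static RSS file."""
    # Hypothetical schema using the field names from the quoted suggestion.
    db.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "itemId INTEGER PRIMARY KEY, channelID INTEGER, "
        "title TEXT, link TEXT, description TEXT)"
    )
    db.execute(
        "INSERT INTO items (channelID, title, link, description) VALUES (?, ?, ?, ?)",
        (channel_id, title, link, description),
    )
    db.commit()

    # One query at write time, so later reads need no queries at all.
    rows = db.execute(
        "SELECT title, link, description FROM items "
        "WHERE channelID = ? ORDER BY itemId DESC LIMIT 15",
        (channel_id,),
    ).fetchall()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # Placeholder title; real channel metadata would come from a channel table.
    ET.SubElement(channel, "title").text = f"Channel {channel_id}"
    for item_title, item_link, item_description in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
        ET.SubElement(item, "description").text = item_description

    # Write to a temporary name and rename, so readers never see a partial file.
    path = Path(out_dir) / f"{channel_id}.xml"
    tmp = path.with_suffix(".tmp")
    ET.ElementTree(rss).write(str(tmp), encoding="utf-8", xml_declaration=True)
    tmp.replace(path)
    return path
```

The cost is one extra query and one file write each time content is
created; every later request for the feed is then a plain file read.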

> Possibly using a separate table for channel attributes (channelID,
> title, language, etc.).
> 
> I'd think the RDBMS overhead would be less than going through that
> much content with an XML parser, but that's a guess not based on
> experience with that volume of data :-)

Both would exist, so bits that needed parsing would be in the RDBMS and 
the bits that are exported would already be in XML (leaving the 
processing to be done on the remote end).

seth


