Bill Moseley moseley at hank.org
Fri Feb 9 11:42:29 CST 2007

I'm currently using hypermail to archive an email list.
It works ok, but it does have some problems with generating invalid

Is anyone aware of any (preferably Open Source) tools I could use to
parse the mail and create a threaded archive?  It's a small collection
of email messages -- about 10,000.  I'd rather just get the data than
have the tool generate HTML pages, so that I can use my existing site
tools for formatting the output.

Frankly, I wouldn't mind writing it from scratch and using MySQL to
store the data.  There are plenty of tools for parsing email and
extracting the content.
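
As a rough illustration of the parsing step, here is a minimal sketch
using Python's standard "email" module.  The function name
extract_fields and the dictionary keys are hypothetical, chosen only for
this example; the choice of "In-Reply-To, else the last References
entry" as the parent is an assumption about how one might pick a parent,
not something hypermail is known to do.

```python
from email import message_from_string
from email.utils import parseaddr


def extract_fields(raw):
    """Pull out the headers needed for threading from one raw message.

    Hypothetical helper: the returned keys are illustrative only.
    """
    msg = message_from_string(raw)
    in_reply_to = (msg.get("In-Reply-To") or "").strip()
    references = (msg.get("References") or "").split()
    # Assumption: prefer In-Reply-To; otherwise fall back to the last
    # entry in References, which is normally the direct parent.
    parent = in_reply_to or (references[-1] if references else None)
    return {
        "message_id": (msg.get("Message-ID") or "").strip(),
        "parent_id": parent or None,
        "subject": msg.get("Subject", ""),
        "sender": parseaddr(msg.get("From", ""))[1],
    }
```

Each returned dictionary could then be stored as one row, with
parent_id left NULL for messages that start a thread.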

But it would take me a while to figure out the threaded database
design (do I use a "nested set" or an "adjacency list"? etc.) --
especially since not all mail clients (and users) always handle
threading correctly.  For example, I notice that on this list it's not
uncommon for people to reply to an existing message when starting a
new thread.  I suspect that's not easy to detect.
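
To make the "adjacency list" option concrete, here is a small sketch in
Python: each message records only its parent's id, and the tree is
rebuilt from those links.  It also shows one possible heuristic for the
"reply used to start a new thread" case: if a reply's subject, with any
Re:/Fwd: prefixes stripped, differs from its parent's, treat it as a new
thread root.  The names build_threads and normalize_subject are
hypothetical, and the heuristic is only an assumption; it will
misclassify legitimate subject changes such as "New topic (was: Old
topic)".

```python
import re
from collections import defaultdict

# Leading "Re:", "Fw:" or "Fwd:" prefixes, possibly repeated.
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Strip reply/forward prefixes and case for comparison."""
    return _PREFIX.sub("", subject or "").strip().lower()


def build_threads(messages):
    """Build an adjacency list from dicts with message_id, parent_id
    and subject keys.  Returns (children, roots)."""
    by_id = {m["message_id"]: m for m in messages}
    children = defaultdict(list)
    roots = []
    for m in messages:
        parent = by_id.get(m["parent_id"])
        # Heuristic (assumption): a changed subject means the sender
        # replied to an old message just to start a new thread.
        hijacked = parent is not None and (
            normalize_subject(m["subject"])
            != normalize_subject(parent["subject"])
        )
        if parent is None or hijacked:
            roots.append(m["message_id"])
        else:
            children[parent["message_id"]].append(m["message_id"])
    return dict(children), roots
```

In a database this would correspond to a single table with a nullable
parent_id column.  Messages whose parent is missing from the archive
also end up as roots here.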

Bill Moseley
moseley at hank.org

