[thesite] recently on thelist

Seth Bienek seth at sethbienek.com
Mon Jan 15 14:40:31 CST 2001


Hey Dan,

> ATM i'm checking out MBOX2Database.cfm
>   in the /xml directory and have a couple questions :)
>
> would it be possible to walk through like 100 msgs at a time rather than
> try to parse the whole thing at once?

Yes, it's possible.  The only drawback I see is that we'd have to save the
loop count in an application variable. (You're talking about running it
every x minutes, right?)
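
To make the "x at a time" idea concrete, here's a rough sketch in Python (not the ColdFusion in MBOX2Database.cfm). It assumes the .mbox can be read with the standard `mailbox` module; the `state` dict stands in for the application variable holding the loop count, and `process_batch`, `next_index`, and `handle` are made-up names for illustration only.

```python
import mailbox


def process_batch(mbox_path, state, batch_size=100, handle=print):
    """Process up to batch_size messages, resuming where the last run stopped.

    `state` is a plain dict standing in for an application-scoped variable;
    `handle` is a hypothetical callback that would do the DB insert.
    Returns the number of messages handled in this run.
    """
    box = mailbox.mbox(mbox_path, create=False)
    try:
        keys = box.keys()                    # message keys, in file order
        start = state.get("next_index", 0)   # saved loop count from last run
        batch = keys[start:start + batch_size]
        for key in batch:
            handle(box[key])
        # Remember where to pick up on the next scheduled run.
        state["next_index"] = start + len(batch)
        return len(batch)
    finally:
        box.close()
```

Each scheduled run would call `process_batch(path, state)` once; a return value of 0 means there was nothing new to do. The counter only stays valid as long as the .mbox isn't truncated between runs.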

> did you mention before that you were running into performance issues
> with this? if so, what were they(i think it had to do with the indexes i
> had on, these can be taken off during the initial inserts since they
> slow the fuck outa inserts)

Yes, big time.  I was processing at a rate of about 8-10 messages/second on
my end.. When I copied the files up there and ran a couple of tests using
the database, it was running about 10 seconds PER message (basically taking
100x longer).  That's one reason I was thinking that parsing it into XML and
then tossing that into the DB might be better..  I dunno though, might work
just peachy with those indexes gone.
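
For what it's worth, the "take the indexes off for the initial load" approach looks roughly like the sketch below. Loud assumptions: SQLite stands in for the Oracle DB purely so the example runs anywhere, and the `messages` table, its columns, and `idx_messages_subject` are invented names, not the real schema. Nothing in this thread confirms the indexes are what caused the slowdown.

```python
import sqlite3


def bulk_load(conn, rows):
    """Drop the index, insert all rows, then rebuild the index once.

    Table, column, and index names here are hypothetical.
    Returns the number of rows inserted.
    """
    rows = list(rows)
    cur = conn.cursor()
    # Without the index, each insert skips the index maintenance work.
    cur.execute("DROP INDEX IF EXISTS idx_messages_subject")
    cur.executemany(
        "INSERT INTO messages (sender, subject, body) VALUES (?, ?, ?)",
        rows,
    )
    # Rebuild the index in one pass after the data is in.
    cur.execute("CREATE INDEX idx_messages_subject ON messages (subject)")
    conn.commit()
    return len(rows)
```

The trade-off is that lookups on that column are slow until the index is rebuilt, so this only makes sense for the initial load, not for the regular incremental runs.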
>
> on line 207 you write back a blank .mbox file right? whats up with
> that?(majorgumbo needs that file to work, although i've never tested out
> deleting it.. we can talk about this one)

After the .mbox parsing is complete, its contents are appended to an
"archive" .mbox file in another directory and a new, empty .mbox file is
created in the "active" directory, so that the next time the template runs,
only the new messages are processed.
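
In sketch form (Python, not the actual template code), that archive-then-empty step is something like the following. `rotate_mbox` and both path arguments are hypothetical names. The active file is truncated rather than deleted, so anything that expects the file to exist still finds it.

```python
import shutil


def rotate_mbox(active_path, archive_path):
    """Append the active .mbox to the archive, then empty the active file."""
    # Append mode creates the archive on first use and otherwise adds to it.
    with open(active_path, "rb") as src, open(archive_path, "ab") as dst:
        shutil.copyfileobj(src, dst)
    # Truncate in place: the file stays on disk but is now zero bytes.
    open(active_path, "wb").close()
```

One caveat with this sketch: a message delivered between the copy and the truncate would be lost, so a real version would need to lock the file or pause delivery while it runs.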

>
> lemme know :)
>
> .djc.
>

Also, the DB functions should work the same as the original template.  I
think the functionality of the template could be greatly increased by making
a few minor changes in the db (per our thread a few weeks back with Adrian).

Lemme know if you want me to change that file to do "x" at a time..

Take Care,

Seth

------------------------------
Seth Bienek
Solutions Development Manager
Stonebridge Technologies, Inc.
972.455.7294 tel
972.404.9754 fax
------------------------------

> -----Original Message-----
> From: thesite-admin at lists.evolt.org
> [mailto:thesite-admin at lists.evolt.org]On Behalf Of Daniel J. Cody
> Sent: Monday, January 15, 2001 12:01 PM
> To: thesite at lists.evolt.org
> Subject: Re: [thesite] recently on thelist
>
>
> seth  -
>
> ATM i'm checking out MBOX2Database.cfm
>   in the /xml directory and have a couple questions :)
>
> would it be possible to walk through like 100 msgs at a time rather than
> try to parse the whole thing at once?
>
> did you mention before that you were running into performance issues
> with this? if so, what were they(i think it had to do with the indexes i
> had on, these can be taken off during the initial inserts since they
> slow the fuck outa inserts)
>
> on line 207 you write back a blank .mbox file right? whats up with
> that?(majorgumbo needs that file to work, although i've never tested out
> deleting it.. we can talk about this one)
>
> lemme know :)
>
> .djc.
>
> Seth Bienek wrote:
>
> > Hey Dan,
> >
> > Gimme a holla if you need me to pitch in on this..
> >
> > My idea of moving the parsing routine to PL/SQL was a bomb.  Turns out I
> > don't have the time/resources to learn PL/SQL right now..
> >
> > If you can make the XML file work, or massage that into the Oracle DB, I can
> > write a lil' deal that will process the .mbox into xml incrementally, so the
> > hit isn't so bad.
>
>




