[thesite] Tip Harvester question (was: [***] Formatting tips )

Seth Bienek seth at sethbienek.com
Fri Mar 30 15:07:02 CST 2001


> as far as the harvesting logic is concerned, i dunno if scanning for
> "\n<tip" is flexible enough
>
> why not just scan for "<tip"?

'Cause that would snag every tip in quoted replies, where the cheap bastards
did not trim their emails properly or wanted to include the tip as part of
their reply. A quoted tip starts its line with "> ", so scanning for "\n<tip"
skips it. Right?
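
A rough sketch of the difference between the two scans, plus the name/value
parsing rudy describes. The tip markup and the attribute names ("type",
"author") are made up for illustration; nothing in this thread fixes the
real format, and this is not the actual harvester code.

```python
import re

# Hypothetical tip markup: <tip name="value" ...>body</tip>
# Anchored: the tag must start a line, so "> <tip" in a quoted reply is skipped.
TIP_ANCHORED = re.compile(r'^<tip\b([^>]*)>(.*?)</tip>',
                          re.MULTILINE | re.DOTALL | re.IGNORECASE)
# Unanchored: the tag may appear anywhere, including inside quoted text.
TIP_ANYWHERE = re.compile(r'<tip\b([^>]*)>(.*?)</tip>',
                          re.DOTALL | re.IGNORECASE)
# name="value" pairs inside the opening tag.
ATTR = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def harvest(body, pattern):
    """Return a list of (attributes, tip_text) pairs found in an email body."""
    tips = []
    for match in pattern.finditer(body):
        attrs = dict(ATTR.findall(match.group(1)))
        tips.append((attrs, match.group(2).strip()))
    return tips


email_body = (
    'Nice one, I use that trick too.\n'
    '\n'
    '> <tip type="css" author="rudy">\n'
    '> Use relative font sizes.\n'
    '> </tip>\n'
    '\n'
    '<tip type="html" author="seth">\n'
    'Always close your tags.\n'
    '</tip>\n'
)

anchored = harvest(email_body, TIP_ANCHORED)   # only the new tip
anywhere = harvest(email_body, TIP_ANYWHERE)   # also the quoted one
```

The sketch uses "^" with multiline matching rather than a literal "\n<tip",
so a tip on the very first line of the body is still picked up. The
unanchored scan returns the quoted tip too, with the "> " markers still
stuck in its body.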


Seth

------------------------------
Seth Bienek
Solutions Development Manager
Stonebridge Technologies, Inc.
972.455.7294 tel
972.404.9754 fax
ICQ #7673959
------------------------------

> -----Original Message-----
> From: thesite-admin at lists.evolt.org
> [mailto:thesite-admin at lists.evolt.org]On Behalf Of rudy
> Sent: Friday, March 30, 2001 2:43 PM
> To: thesite at lists.evolt.org
> Subject: Re: [thesite] Tip Harvester question (was: [***] Formatting
> tips )
>
>
> > Sometimes thinking "outside the box" is a challenge for me.
>
> i dunno, i think you've been doing fine so far
>
> as far as the harvesting logic is concerned, i dunno if scanning for
> "\n<tip" is flexible enough
>
> why not just scan for "<tip"?
>
> then look for the closing ">" and parse everything inside it as name/value
> pairs
>
> then scan ahead and find "</tip>" and everything in between is
> the tip body
>
> the name/value pairs get stored in the database under multiple categories
> or whatever(*)
>
> yes, this might result in duplications when tips are (re)posted in
> replies, etc.
>
> but so what?  look at how matt and michele have gone through tons of old
> articles cleaning them up
>
> besides, we are not taking advantage of a huge untapped labour force --
> the authors themselves
>
>
> (*) special note -- please let me know when you want to start testing
> harvest code and i'll be sure to have a document ready within a day
> explaining the evolt tables and relationships
>
> oh, and i just remembered, you must extract the date and author's email id
> from the post (headers?)
>
> rudy
>
> _______________________________________________
> http://lists.evolt.org/thesitearchive/
> and new & improved kentucky fried old archives:
> http://lists.evolt.org/thesitearchive/old/
