[thelist] Scanning for many strings in many texts
Hassan Schroeder
hassan at webtuitive.com
Thu Oct 13 14:54:55 CDT 2005
manuel.gonzalez.noriega at gmail.com wrote:
> So, every night you scan the N news stories of the day, seeking
> documents that match any of the M predefined 'alerts'. That's
> basically the problem.
I don't know if there's anything published about the optimizations,
but Verity's Topic was originally created to address exactly that
kind of issue for the CIA/NSA -- so we're talking about a *fairly*
high volume of incoming data :-)
> Now I'm inclined to think that there's no way around it but doing M
> fulltext searches :-)
The above aside, I'm inclined to believe you're right.
Practically speaking, it may just be a matter of throwing hardware
at it.
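
For what it's worth, the brute-force nightly pass might look roughly like
the sketch below. It is only an illustration under some assumptions: each
alert is a plain literal phrase, stories are already loaded as strings,
matching is case-insensitive, and every name in it (scan_stories, the
sample stories and alerts) is made up for the example.

```python
def scan_stories(stories, alerts):
    """Return {alert_name: [story_id, ...]} for every alert phrase found.

    Brute force: every alert is checked against every story, so the
    work grows as N stories times M alerts.
    """
    matches = {name: [] for name in alerts}
    for story_id, text in stories.items():
        lowered = text.lower()  # assumption: matching ignores case
        for name, phrase in alerts.items():
            # Plain substring test stands in for a real full-text search.
            if phrase.lower() in lowered:
                matches[name].append(story_id)
    return matches


# Hypothetical sample data, not taken from the thread.
stories = {
    "story-1": "Oil prices rose sharply after the pipeline outage.",
    "story-2": "The city council approved a new transit budget.",
}
alerts = {
    "energy": "oil prices",
    "transit": "transit budget",
    "sports": "world cup",
}
result = scan_stories(stories, alerts)
```

Since each story is checked independently of the others, the outer loop
is easy to split across machines, which is the sense in which more
hardware helps.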
Good luck!
--
Hassan Schroeder ----------------------------- hassan at webtuitive.com
Webtuitive Design === (+1) 408-938-0567 === http://webtuitive.com
dream. code.