[thelist] Scanning for many strings in many texts

Judah McAuley judah at wiredotter.com
Thu Oct 13 12:33:56 CDT 2005


Hassan Schroeder wrote:
<snip>
> Exactly. Real search engines create *indexes* of the content in a
> document repository. That happens only when content is added to the
> repository, *not* each time someone wants an article referencing
> "Joe Montana".
> 
> There are dedicated enterprise-level (and -price!) solutions from
> various vendors (Verity, Autonomy, etc.). MySQL offers a full-text
> search capability at a more popular pricepoint :-)

Full-text indexes (offered by MySQL and MSSQL, to name two I've used)
are quite good options, especially if the info is already stored in a
database. If the texts currently reside in separate files, you might
want to investigate a search engine like HtDig to index the content;
you can then just query HtDig and retrieve the file(s) you need.
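
To make the "index once, query many times" idea concrete, here is a
rough sketch of a tiny inverted index in Python. It is only an
illustration of the principle, not how MySQL, MSSQL, or HtDig actually
implement their indexes. The class name, method names, and sample file
names are all made up for the example, and it matches documents that
contain every query word (it does not do phrase matching or ranking).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Hypothetical toy index: maps each word to the documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of document ids

    def add(self, doc_id, text):
        # The expensive scan happens here, once, when content is added.
        for word in tokenize(text):
            self.postings[word].add(doc_id)

    def search(self, query):
        # Queries only look up words in the index; no text is rescanned.
        words = tokenize(query)
        if not words:
            return set()
        result = None
        for word in words:
            docs = self.postings.get(word, set())
            result = set(docs) if result is None else result & docs
        return result


index = InvertedIndex()
index.add("a.txt", "Joe Montana threw four touchdowns.")
index.add("b.txt", "Montana is a state in the US.")

print(sorted(index.search("Joe Montana")))  # ['a.txt']
print(sorted(index.search("montana")))      # ['a.txt', 'b.txt']
```

A real full-text engine adds a lot on top of this (stop words, stemming,
relevance ranking, on-disk storage), but the division of labor is the
same: pay the scanning cost when a document is added, then answer each
search from the index.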

Judah


