[thelist] Scanning for many strings in many texts
Judah McAuley
judah at wiredotter.com
Thu Oct 13 12:33:56 CDT 2005
Hassan Schroeder wrote:
<snip>
> Exactly. Real search engines create *indexes* of the content in a
> document repository. That happens only when content is added to the
> repository, *not* each time someone wants an article referencing
> "Joe Montana".
>
> There are dedicated enterprise-level (and -price!) solutions from
> various vendors (Verity, Autonomy, etc.). MySQL offers a full-text
> search capability at a more popular pricepoint :-)
Full-text indexes (offered by MySQL and MSSQL, to name two I've used)
are quite good options, especially if the info is already stored in a
database. If the texts currently reside in separate files, you might
want to investigate a search engine like HtDig to index the content;
you can then query HtDig and retrieve the file(s) you need.
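
For anyone who wants to see the index-once, query-many idea in
miniature, here is a rough Python sketch of an inverted index. It is
only an illustration, not how MySQL, MSSQL, or HtDig work internally.
The names tokenize, build_index, and search, along with the sample
file names and texts, are made up for the example. Note that it matches
documents containing all the query words, not the exact phrase.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document names that contain it.

    This is the expensive step; it runs when content is added,
    not on every search.
    """
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample documents, keyed by made-up file names.
docs = {
    "niners.txt": "Joe Montana led the 49ers to four titles.",
    "map.txt": "Montana borders Idaho and Wyoming.",
    "other.txt": "Nothing relevant here.",
}

index = build_index(docs)            # built once
print(search(index, "Joe Montana"))  # each lookup is just set operations
```

Each search touches only the sets for the query words instead of
rescanning every text, which is why the cost of scanning is paid when
content is added rather than at query time.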
Judah