[thelist] Very Large MySQL Table

Matt Warden mwarden at gmail.com
Sat Mar 7 17:35:14 CST 2009

On Sat, Mar 7, 2009 at 6:29 PM, Matt Warden <mwarden at gmail.com> wrote:
> original table. *IF* your search words are always English words, then
> you could significantly reduce the count of your substrings by
> removing any that are not found in a set of English dictionary words.
> This would probably knock out 75% of your rows and leave you with
> essentially an index of all English words mapped to your domain names,
> which is your ideal situation.
> Quite a bit of work, and depending on your requirements, may not even
> be adequate. But just some ideas to mull over...

If you go with this last option, I should mention that you first have
to generate every contiguous substring of each domain name, not simply
each successive suffix. So rather than just: domain, omain, main, ain,
in; you would need to generate: domain, omain, main, ain, in, domai,
doma, dom, do, omai, oma, om, mai, ma, ai. Only then could you
cross-reference with an English dictionary and eliminate the rows not
found.
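
To make the idea above concrete, here is a rough Python sketch of the
substring generation and the dictionary filter. The function names, the
two-letter minimum length (taken from the examples above, which stop at
two letters) and the tiny word set are all assumptions for illustration;
a real run would load a full English word list and write the surviving
(word, domain) pairs to the lookup table.

```python
def substrings(name, min_len=2):
    """Return every contiguous substring of `name` that has at least
    `min_len` characters (min_len=2 is an assumed cutoff)."""
    n = len(name)
    return {name[i:j] for i in range(n) for j in range(i + min_len, n + 1)}


def dictionary_substrings(name, words, min_len=2):
    """Keep only the substrings that appear in the `words` set."""
    return sorted(substrings(name, min_len) & words)


if __name__ == "__main__":
    # Hypothetical stand-in for a real English dictionary.
    words = {"do", "main", "in", "domain", "zebra"}
    print(dictionary_substrings("domain", words))
```

A six-letter name yields 15 substrings of two or more letters, so the
pre-filter table grows roughly with the square of the name length,
which is why the dictionary filter matters.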

Matt Warden
Cincinnati, OH, USA

This email proudly and graciously contributes to entropy.
