[thelist] Very Large MySQL Table

Matt Warden mwarden at gmail.com
Sat Mar 7 17:35:14 CST 2009


On Sat, Mar 7, 2009 at 6:29 PM, Matt Warden <mwarden at gmail.com> wrote:
> original table. *IF* your search words are always English words, then
> you could significantly reduce the count of your substrings by
> removing any that are not found in a set of English dictionary words.
> This would probably knock out 75% of your rows and leave you with
> essentially an index of all English words mapped to your domain names,
> which is your ideal situation.
>
> Quite a bit of work, and depending on your requirements, may not even
> be adequate. But just some ideas to mull over...

If you go with this last option, I should mention that you first have
to generate every contiguous substring, not simply each successive
suffix. So rather than simply: domain, omain, main, ain, in; you would
need to generate: domain, omain, main, ain, in, domai, doma, dom, do,
omai, oma, om, mai, ma, ai. Only then could you cross-reference with
an English dictionary and eliminate the rows not found.
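
The substring generation and dictionary filter described above could
be sketched roughly like this in Python. This is only an illustration:
the function names, the two-character minimum length, and the tiny
SAMPLE_WORDS set are assumptions made for the example, and a real run
would load a full English word list instead.

```python
def substrings(name, min_len=2):
    """Return every contiguous substring of name with at least min_len chars."""
    found = set()
    for start in range(len(name)):
        # end is exclusive, so start + min_len is the shortest allowed slice
        for end in range(start + min_len, len(name) + 1):
            found.add(name[start:end])
    return found


def dictionary_substrings(name, words, min_len=2):
    """Keep only the substrings that appear in the given word set."""
    return sorted(s for s in substrings(name, min_len) if s in words)


# Hypothetical stand-in for a real English dictionary (assumption).
SAMPLE_WORDS = {"do", "dom", "main", "in", "ma", "om", "domain"}

if __name__ == "__main__":
    # Each surviving substring would become one row mapped to the domain.
    for word in dictionary_substrings("domain", SAMPLE_WORDS):
        print(word, "-> domain")
```

A name of n characters has on the order of n*n/2 substrings, so the
intermediate set grows quickly for long names; filtering against the
dictionary before inserting rows keeps the stored table much smaller.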


-- 
Matt Warden
Cincinnati, OH, USA
http://mattwarden.com


This email proudly and graciously contributes to entropy.


