[thelist] Very Large MySQL Table
Hans Zaunere
lists at zaunere.com
Tue Feb 10 08:44:12 CST 2009
> You may recall that I recently asked about running searches against
> domain names. I have now gotten a .com zone file from Verisign, which
> is basically a list of all registered .com domain names. I imported it
> into my local MySQL. There are ~ 185,000,000 domain names and it's too
> much for my workstation to run searches on. Just running count(*)
> takes 10 minutes. It's 6G of data.
If you're using a MyISAM as your storage engine, the count(*) should return
instantly. If you're using InnoDB, then you'll need more RAM and
horsepower. You probably should be using MyISAM in this case.
> We want to run searches basically to find domain names with certain
> words or phrases.
>
> We don't have a dedicated server for this but his web account does
> have disk space for 6Gs, but I am wondering if we will be able to run
> searches. His idea was to break it up into 8 pieces, and then to run
> each search 8 times, against just that one eighth of the full list.
If you're dealing with 6gb tables, a web hosting account isn't going to give
you very usable environment. I'd recommend getting some modern dedicated
server or even VPS with a nice allotment of RAM. Don't jump through hopes
with fancy partitioning schemes to overcome a lack of hardware - end of the
day, you will always have 185 million rows to search through.
>From the database side, make sure the columns that you're searching on are
indexed correctly. If there are multiple columns being searched, they must
be indexed in searched order left-to-right in a single index. MySQL can't
use more than one index per table and can't do random seeks within an index
(check the docs about this - it goes into quite a bit of detail).
---
Hans Zaunere / Managing Member / New York PHP
www.nyphp.org / www.nyphp.com
More information about the thelist
mailing list