[thelist] Merge Duplicate Records in MySQL/PHP

Fred Jones fredthejonester at gmail.com
Fri Jul 4 01:47:20 CDT 2008


In MySQL, I have a table of some 30K contacts (people) with 2500
duplicate records (using email matching as the sole criterion). In
addition to this table (with its 20 fields) I have a dozen other
tables which have a foreign key to this table and store additional
data about each person, such as favorite food and hair color and
education etc. My job now is to find the duplicates and attempt to
merge the records, leaving one record in the DB and deleting the
duplicates.

The merge I want to do based on simple logic that if one row has a
value for field X and the others don't, then use that value. If two
values exist for field X, however, then do nothing and alert the
operator that he must manually process this.

My question is, is there an automated way I can do this, aside from
writing out all the field names and then comparing them? The only real
idea I have come up with thus far is to use

describe my_table;

to get a list of the fields in a table. So with that, I could get a
list of all the fields of each of the dozen external tables, and then
be able to loop through them all (in PHP) to see if multiple values
exist, and if not, combine all rows into one. Then I can store those
values back into the DB for one of the records and then delete the
others.

Anyone have any better ideas? :)

Thanks!



More information about the thelist mailing list