On 5/9/05, Jonathan Dillon <jdillon at boehm-ritter.com> wrote: > >On 5/9/05, Jonathan Dillon <jdillon at boehm-ritter.com> wrote: > >> SELECT DISTINCT(email), firstname, lastname, phone FROM tablename; > >> > >> This is rather retarded, but why would this pull up duplicate emails? > >> I just don't understand. I thought DISTINCT would explicitly be > >> distinct to the column name. > > > >What is it to do with the following data? > > > >joe at blow.com Joe Blow 555-555-5555 > >joe at blow.com Joe Blow 555-555-5556 > > > >What row is it supposed to return? > > > >The reality is that every unique *combination* of fields listed after the > DISTINCT keyword will be returned. > > Matt: > > Doh! > > Yea, that's exactly what the data is doing. So, would changing the order > help? > > SELECT firstname, lastname, phone, DISTINCT(email) FROM tablename; Did you try it? I suspect this does not work (you may even get a syntax error). The database still will not know what to do with "duplicate" rows. > How do you get around this seemingly simple problem? I've got about 15% too > many records! I need records distinctly keyed off their emails... ;-) Then you should clean out the bad data and put a unique constraint on the email column so this can't happen in the future. Even if you were to find a solution that gives you unique emails, your data is going to be bad (see example I provided and assume that one of those phone numbers is not accurate). -- Matt Warden Miami University Oxford, OH, USA http://mattwarden.com This email proudly and graciously contributes to entropy.