[thelist] Simple table question...
Matt Warden
mwarden at gmail.com
Mon May 9 19:20:23 CDT 2005
On 5/9/05, Jonathan Dillon <jdillon at boehm-ritter.com> wrote:
> >On 5/9/05, Jonathan Dillon <jdillon at boehm-ritter.com> wrote:
> >> SELECT DISTINCT(email), firstname, lastname, phone FROM tablename;
> >>
> >> This is rather retarded, but why would this pull up duplicate emails?
> >> I just don't understand. I thought DISTINCT would explicitly be
> >> distinct to the column name.
> >
> >What is it to do with the following data?
> >
> >joe at blow.com Joe Blow 555-555-5555
> >joe at blow.com Joe Blow 555-555-5556
> >
> >What row is it supposed to return?
> >
> >The reality is that every unique *combination* of fields listed after the
> DISTINCT keyword will be returned.
>
> Matt:
>
> Doh!
>
> Yea, that's exactly what the data is doing. So, would changing the order
> help?
>
> SELECT firstname, lastname, phone, DISTINCT(email) FROM tablename;
Did you try it? I suspect this does not work (you may even get a
syntax error). The database still will not know what to do with
"duplicate" rows.
> How do you get around this seemingly simple problem? I've got about 15% too
> many records! I need records distinctly keyed off their emails... ;-)
Then you should clean out the bad data and put a unique constraint on
the email column so this can't happen in the future. Even if you were
to find a solution that gives you unique emails, your data is going to
be bad (see example I provided and assume that one of those phone
numbers is not accurate).
--
Matt Warden
Miami University
Oxford, OH, USA
http://mattwarden.com
This email proudly and graciously contributes to entropy.
More information about the thelist
mailing list