[thelist] MySQL Query problem: unique vals from two fields

Wed Nov 22 13:17:27 CST 2006

On 11/22/06, Pringle, Ron <RPringle at aurora-il.org> wrote:
> To close this out, turns out you don't need to use DISTINCT in the
> query. The default behavior of UNION in MySQL is to remove duplicate
> rows anyhow. So the final query now looks like:

To clarify, you are saying that for a unioned set A and B, where A
and/or B have duplicates within its own set, the union operation will
remove such duplicates? So, it must combine sets and then perform the
un-dupe over the entire set, rather than take set A and only append
from set B if the item in set B is not found in A.

Did you do any tests to make sure the DB does not use a different
algorithm when it knows the two sets have no dupes already?

-- 
Matt

Matt-

No, I didn't. My reading of it is that it combines all data from both
tables and then removes any duplicates. But I see what you're saying
regarding using distinct on the individual selects versus using distinct
on the union. My dataset in this case is so incredibly small that I
don't know that I could see any differences in the two methods, but it'd
be interesting to know for future reference. I don't know how you'd go
about testing it.

Here is the text from the MySQL site:

"The default behavior for UNION is that duplicate rows are removed from
the result. The optional DISTINCT keyword has no effect other than the
default because it also specifies duplicate-row removal. With the
optional ALL keyword, duplicate-row removal does not occur and the
result includes all matching rows from all the SELECT statements."

http://dev.mysql.com/doc/refman/5.0/en/union.html

Ron