[thelist] Truth about Outer Joins vs. Union All in Oracle?
Matt Warden
mwarden at gmail.com
Fri Nov 16 09:59:43 CST 2007
listers:
There are rumors that outer joins should be avoided in Oracle in favor
of a semantically equivalent union all. I have sort of taken this as
truth, but the more I think about it the more I am not convinced. I
have done some googling, but have not found what I'm looking for.
The following are two versions of the same query, one with a left
outer join and the other using a union all. After each query is the
execution plan chosen by the cost-based optimizer.
Could someone with better experience reading execution plans confirm
that the union all should perform better in general (and an
explanation why, if possible)?
Probably want to view this in fixed font...
select nbr_ssn
from data_exch.t_exch_data_header d
left join data_exch.t_exch_match_header m
on d.idn_header_data_exch = m.idn_header_data_exch;
Operation Name Rows Bytes Cost
CPU Cost IO Cost
SELECT STATEMENT 35M 640M 113506
16G 112293
HASH JOIN RIGHT OUTER 35M 640M 113506
16G 112293
INDEX FAST FULL SCAN PK_T_EXCH_MATCH_HEADER 1752K 10M 1722
251M 1703
PARTITION HASH ALL 35M 438M 61548
8G 60932
TABLE ACCESS FULL T_EXCH_DATA_HEADER 35M 438M 61548
8G 60932
select nbr_ssn
from data_exch.t_exch_data_header d
inner join data_exch.t_exch_match_header m
on d.idn_header_data_exch = m.idn_header_data_exch
union all
select nbr_ssn
from data_exch.t_exch_data_header d
where not exists (
select *
from data_exch.t_exch_match_header m
where d.idn_header_data_exch = m.idn_header_data_exch
);
Operation Name Rows Bytes
Cost CPU Cost IO Cost
SELECT STATEMENT 35M 641M 227011
16G 112293
UNION-ALL
HASH JOIN 1752K 31M 113506
16G 112293
INDEX FAST FULL SCAN PK_T_EXCH_MATCH_HEAD 1752K 10M 1722
251M 1703
PARTITION HASH ALL 35M 438M 61548
8G 60932
TABLE ACCESS FULL T_EXCH_DATA_HEADER 35M 438M 61548
8G 60932
HASH JOIN RIGHT ANTI 33M 609M 113506
16G 112293
INDEX FAST FULL SCAN PK_T_EXCH_MATCH_HEAD 1752K 10M 1722
251M 1703
PARTITION HASH ALL 35M 438M 61548
8G 60932
TABLE ACCESS FULL T_EXCH_DATA_HEADER 35M 438M 61548
8G 60932
All I can really take out of this is that the I/O cost of the second
is about twice the first (I assume because of the second table scan of
the table with > 30 million records in it!). But I'm sure I'm only
considering one small piece of the puzzle. Any thoughts?
Is this a special case where the outer join is better? What about the
general case?
Thanks!
--
Matt Warden
Cincinnati, OH, USA
http://mattwarden.com
This email proudly and graciously contributes to entropy.
More information about the thelist
mailing list