[thelist] Truth about Outer Joins vs. Union All in Oracle?
Phil Turmel
pturmel-webdev at turmel.org
Fri Nov 16 10:42:34 CST 2007
Matt Warden wrote:
> listers:
>
> There are rumors that outer joins should be avoided in Oracle in favor
> of a semantically equivalent union all. I have sort of taken this as
> truth, but the more I think about it the more I am not convinced. I
> have done some googling, but have not found what I'm looking for.
>
> The following are two versions of the same query, one with a left
> outer join and the other using a union all. After each query is the
> execution plan chosen by the cost-based optimizer.
>
> Could someone with better experience reading execution plans confirm
> that the union all should perform better in general (and an
> explanation why, if possible)?
>
> Probably want to view this in fixed font...
>
> select nbr_ssn
> from data_exch.t_exch_data_header d
> left join data_exch.t_exch_match_header m
> on d.idn_header_data_exch = m.idn_header_data_exch;
>
> Operation Name Rows Bytes Cost
> CPU Cost IO Cost
> SELECT STATEMENT 35M 640M 113506
> 16G 112293
> HASH JOIN RIGHT OUTER 35M 640M 113506
> 16G 112293
> INDEX FAST FULL SCAN PK_T_EXCH_MATCH_HEADER 1752K 10M 1722
> 251M 1703
> PARTITION HASH ALL 35M 438M 61548
> 8G 60932
> TABLE ACCESS FULL T_EXCH_DATA_HEADER 35M 438M 61548
> 8G 60932
>
>
> select nbr_ssn
> from data_exch.t_exch_data_header d
> inner join data_exch.t_exch_match_header m
> on d.idn_header_data_exch = m.idn_header_data_exch
> union all
> select nbr_ssn
> from data_exch.t_exch_data_header d
> where not exists (
> select *
> from data_exch.t_exch_match_header m
> where d.idn_header_data_exch = m.idn_header_data_exch
> );
>
> Operation Name Rows Bytes
> Cost CPU Cost IO Cost
> SELECT STATEMENT 35M 641M 227011
> 16G 112293
> UNION-ALL
> HASH JOIN 1752K 31M 113506
> 16G 112293
> INDEX FAST FULL SCAN PK_T_EXCH_MATCH_HEAD 1752K 10M 1722
> 251M 1703
> PARTITION HASH ALL 35M 438M 61548
> 8G 60932
> TABLE ACCESS FULL T_EXCH_DATA_HEADER 35M 438M 61548
> 8G 60932
> HASH JOIN RIGHT ANTI 33M 609M 113506
> 16G 112293
> INDEX FAST FULL SCAN PK_T_EXCH_MATCH_HEAD 1752K 10M 1722
> 251M 1703
> PARTITION HASH ALL 35M 438M 61548
> 8G 60932
> TABLE ACCESS FULL T_EXCH_DATA_HEADER 35M 438M 61548
> 8G 60932
>
>
> All I can really take out of this is that the I/O cost of the second
> is about twice the first (I assume because of the second table scan of
> the table with > 30 million records in it!). But I'm sure I'm only
> considering one small piece of the puzzle. Any thoughts?
>
> Is this a special case where the outer join is better? What about the
> general case?
>
> Thanks!
>
Hi Matt,
I don't see in the execution plan any indication that your join
column is indexed (idn_header_data_exch) in the data table
(idn_header_data_exch). (?!?!) Unless I've misread something,
that forces a full table scan in both halves of the union.
FWIW, I don't bother to convert outer joins to unions, ever
(yet). The loss of maintainability is just too expensive. If I
ever run into a tough case that absolutely needs this, I'd
arrange for the FK column to be NULL when appropriate, and scan
on that in the single-table half of the union.
Phil
--
Need to contact me offlist?
Drop -webdev or you probably won't get through.
More information about the thelist
mailing list