[thelist] SQL SELECT speed-up question

Phil Turmel pturmel-webdev at turmel.org
Fri Dec 7 21:53:48 CST 2007

Jay Turley wrote:
> Hi!
> I am trying to speed up some sql we use to select contacts from our internal
> contact manager.
> And I've been beating my head against the wall for days now.
> I have tried using temporary tables, derived tables, UNIONs, and a bunch of
> other things that didn't work; nothing will speed it up more than a few
> seconds. I am not an expert in SQL, so I feel there must be something I am
> missing about how to approach this. Here is sample code using derived tables
> and IN:
> select
>     c.contact_id
>     ,c.fname
>     ,c.lname
>     ,c.do_not_contact
>     ,ct.contact_type
>     ,c.team_sharing
> from
>     econtacts c
> inner join
>     (
>     select
>         i.contact_id
>     from
>         einv_interests i
>     where
>         i.inv_type_id in
> (10,61,60,55,56,57,58,59,50,51,52,41,42,44,43,45,46,47,48,49)
>     and
>         i.inv_range_id in (3,4)
>     ) ci
> on
>     c.contact_id = ci.contact_id
> inner join
>     econtact_type ct
> on
>     c.contact_type_id = ct.contact_type_id
> The derived table "ci" in the query above may have as many as 40 integers in
> the sets.
> The table structure for econtacts is pretty much as shown above in the
> select portion. The einv_interests table structure is as follows:
> CREATE TABLE [einv_interests] (
>     [contact_id] [int] NOT NULL ,
>     [inv_type_id] [int] NOT NULL ,
>     [inv_range_id] [int] NULL ,
>     [inv_cat_id] [int] NULL ,
>     (
>         [contact_id]
>     ) REFERENCES [econtacts] (
>         [contact_id]
>     ),
>     (
>         [inv_cat_id]
>     ) REFERENCES [einv_cat] (
>         [inv_cat_id]
>     ),
>     (
>         [inv_range_id]
>     ) REFERENCES [einv_range] (
>         [inv_range_id]
>     ),
>     (
>         [inv_type_id]
>     ) REFERENCES [einv_type] (
>         [inv_type_id]
>     )
> GO
> where einv_cat, einv_range, and einv_type are simple lookup tables. There is
> no primary key defined on the table, which may very well be the problem.
> This is because two of the columns are nullable, and so I can't define a
> primary key over all 4 columns. Business logic guarantees each row to be
> unique, but I think that without the PK, I am ending up running table scans.
> :(
> Any help figuring out how to speed this us would be greatly appreciated.
> Thanks a ton!

Hi Jay,

I haven't seen a resolution for this, so I thought I'd kick in a few ideas.

First, I noticed that there is not output column from the 'ci' derived
table.  Do you really want duplicate rows if there are multiple matches
in the interests inventory?  If you only want one output row per
contact, I'd convert that derived table into a where clause like so:

SELECT c.contact_id, c.fname, c.lname, c.do_not_contact,
    ct.contact_type, c.team_sharing
FROM econtacts c INNER JOIN econtact_type ct
ON c.contact_type_id = ct.contact_type_id
WHERE c.contact_id IN (
    SELECT contact_id
    FROM einv_interests i
    WHERE i.inv_type_id IN
    AND i.inv_range_id IN (3,4)

Second, I don't think even a primary key on einv_interests would help,
as you are not picking out specific rows by value.  What should help,
though, are non-unique indices on each column in einv_interests.  You
don't say, but I'm assuming you do have indices on econtacts.contact_id,
econtacts.contact_type_id, and econtact_type.contact_type_id.

I'm a little weak on SQL Server, but you might also benefit from
two-column non-unique indices on (inv_type_id, contact_id) and
(inv_range_id, contact_id).

Finally, none of the possible optimizations will occur if the query
optimizer has no statistics to work with.  You'll need to use 'UPDATE
STATISTICS' on these tables to do that.  If you are still having
trouble, it might not hurt to post the execution plan and/or execution
profile that SQL Server generates for your query.



[1] http://technet.microsoft.com/en-us/library/aa964133.aspx

More information about the thelist mailing list