[thelist] sql/table design question
rudy
r937 at interlog.com
Wed Feb 19 06:59:01 CST 2003
hi joshua, thanks for the examples
yeah, i've seen them before ;o)
> SELECT Book2 AS RelatedBook
> FROM table
> WHERE Book1 = 1234
>
> This is obviously going to run a lot faster assuming the same number
> of rows. Of course, we'll have double the rows, but I'd bet it's STILL
> faster.
gut feel isn't good enough
the situation calls for a benchmark against a real table
> I understand your concerns about consistency. Perhaps some combination
> of FK's would help ensure that the inconsistencies don't occur?
not really
i should hope you'd have FKs defined regardless, eh
no, the inconsistency occurs when one of the two rows fails to get inserted
initially, or is later deleted
like i said, you'd better have rock solid app logic
"how could i prevent a situation where one gets deleted but not the other?"
i can hear you thinking, and in a second you'll be suggesting a transaction
block (eewwww)
ah, i see jeff has an answer, similar to the one i would have suggested, had
i not been catching zeds at 4 in the morning
> SELECT DISTINCT relatedBook
> FROM (SELECT book2 AS relatedBook
> FROM table
> WHERE book1 = 1234
> UNION ALL
> SELECT book1 AS relatedBook
> FROM table
> WHERE book2 = 1234)
> ORDER BY relatedBook
>
> this has the advantage of not resulting in duplicate relatedBook records,
> should the data make that possible, which yours does not address
> (without adding DISTINCT in front of RelatedBook).
"should the data make that possible?"
well, yeah -- the double row scenario has duplicates by design!
1234 6789
6789 1234
a more straightforward way is to use UNION instead of UNION ALL and then the
outer select is not necessary
however, both UNION and DISTINCT require a sort of the entire result set, so
i imagine they would both perform the same way -- slower
if, on the other hand, there's only one row for each relationship, you can
use UNION ALL and avoid the sort --
select book2 as relatedBook
from relatedbooks
where book1 = 1234
union all
select book1
from relatedbooks
where book2 = 1234
this should run quite well, and note that while there are two passes of the
data where the OR syntax requires only one, each subquery in the above
UNION ALL can utilize a separate index, whereas the OR syntax would
likely utilize index matching -- if you're lucky
i would still like to see benchmark results, though
and why can you avoid the sort with UNION ALL? because in the single row
scenario, each pair is unique
the PK for the intersection table would be ( book1, book2 ), right?
so let's say that your app is in place, everything's running along smoothly,
and a user makes a relationship between book 1234 and book 6789 --
1234 6789
in the single-row scenario, what's to stop some other user, later, from
trying to relate book 6789 to book 1234? you might think that this would
result in the insertion of
6789 1234
how would you prevent this? the PK wouldn't catch it, right? would you
have to declare a separate UNIQUE constraint on the pair of keys in the
other order?
no, the answer is actually quite simple
the app always inserts the row with book1 < book2
;o)
More information about the thelist
mailing list