[thelist] SQL select help

Fri Jul 27 10:21:46 CDT 2001

> The problem with this method is that it relies on the identity field
> being sequential - something you can't always be certain of.

hi ed

good point

you're right, you can't be certain of sequentialness (sequentiality?
sequentialhood?) -- until you look it up and see how it was defined, how
the application has been populating it, the pattern of deletions, etc.

but the query does not rely on sequential numbers, heck the column being
randomized doesn't even have to be an identity field

(aside: i'm trying to remember where i thought i read that sql/server
allows identity fields to be created randomly rather than incrementally --
anybody know?  perhaps it was some other database...)

anyhow there are all kinds of twists to the randomness problem, for
example, if you have 100 records with a key between 1 and 100 and then 100
more records with a key between 5000 and 5100, and then try to select a
random row somewhere between 1 and 5100, you are going to land on record
number 5000 (or 100, depending on how you code the query) an awful lot --
so it's not truly random

it is an even distribution of keys that the rand() query relies on, not
sequential

that said, i really like your solution of the stored procedure

very elegant, probably quite good performing, but i'd like to offer one
small improvement --

> An extension to this idea would be to create a stored procedure that
> creates a temporary table of the result set, that appends its own
> sequential ID and uses that to do the random sort.

make the temporary table contain just the keys of the result set (which are
typically available in an index) rather than the entire result set -- then
get the full row by joining to the original table after selecting the key
randomly

rudy