[thelist] Free Tip about SQL

Fri Jul 18 17:37:21 CDT 2003

> > SELECT id
> > FROM users
> > WHERE username = 'chris'
> > AND coolfactor = 'infinite'
> > AND babemagnet = 'yes';
>
> Actually, this isn't guaranteed to work at all, simply because you're
> assuming that your record is unique. In a talbe with a large number of
> fields, you really don't want to be doing a where clause on every field.

seb, old pal, allow me to spout some theory -- as if you need it, eh

but i need to say this in case there are any other people out there unsure
of the purpose of the id field and perhaps tempted to use max(id) to get the
last one inserted

a numeric id like an auto_increment or identity field or number drawn from a
sequence -- nice post, by the way, SBeam -- is usually called a surrogate
key, for the reason that it stands in place of the "real" primary key

if the table does not have a "real" primary key (sometimes called a
candidate key) then it isn't a relational table, it's a flat file, and youse
mought as well use an odbc interface to an actual flat file if that's how
you're gonna utilize the power of relational databases

i hang out on a lot of database forums, have posted several thousand answers
to sql questions, and have read countless more thousands of posts

"how do i delete duplicate rows" has gotta be in the top three most
frequently asked questions of all time

would never happen if people weren't coddled by these infernal surrogate
keys

just because you have an autonumber in a table does not solve all your
problems

let's say you have a table of products for sale

   id   integer  primary key
   name   varchar(20)
   vendorid   integer  foreign key references vendor (id)
   price   numeric(11,2)

even though you have declared a primary key, if you do nothing else, then
sooner or later you will be posting here or on some database forum "i have
two identical products in my table, how do i delete one of them?"

so, what to do?

go ahead and use a surrogate key -- i'm not against them, in fact i use them
myself most of the time -- but please be conscious that there lurks a "real"
key somewhere in the table

(in the above example, i'd venture it is name plus vendorid)

okay, so one of the functions of a primary key is to uniquely identify a row

i think you can see where this is going...

IF -- notice the emphasis -- your database does not support something like
@@identity or last_insert_id, then use the following method

when you insert a row into a table, hang on to the value(s) of the "real"
primary key, then use those to query back the id

voila

oh, and by the way, if you declare a unique constraint on the "real" primary
key, you'll never be posting that dreaded "how do i delete duplicates"
question

anyway, coming back to chris' suggestion --

> > SELECT id
> > FROM users
> > WHERE username = 'chris'
> > AND coolfactor = 'infinite'
> > AND babemagnet = 'yes';

this is exactly the method i was talking about, but a small refinement is
necessary

seb's objection -- "you really don't want to be doing a where clause on
every field" -- is absolutely spot on

you only have to do it on the "real" primary key, which in chris' example is
probably just the username -- unless, chris, you were thinking of supporting
the idea that two entries in the users table can have the same username
with different coolfactors or babemagnet attributes...

now i know that it is possible to have a table where the rows are not
unique, but as i said, that's not a real relational table, and should be
attempted only after you have mastered primary keys

do a google for fabian pascal or dbdebunk if this sort of thing interests
you

rudy