[thelist] Survey database design for storing responses to multiple-choice questions

rudy r937 at interlog.com
Mon May 20 16:19:01 CDT 2002


>  tblResponses
>    question_id (pk, fk)
>    respondent_id (pk, fk)
>    response_text (pk)

hi ben

nice design

michael already answered your problem -- don't pk that text, my friend

i thought perhaps i'd expound a bit on the role of the pk

<tip type="database design">
the only reason for having a pk is so that the database can apply
relational integrity rules on any foreign key that references that pk
</tip>

since you don't have another table referencing tblResponses, you don't
actually need any pk!

i could digress at this point and say that if you're using mysql, you
needn't bother with pk's and fk's at all, as it doesn't support them, but i
won't, because you may want a uniqueness constraint

you probably want to declare a uniqueness constraint on question_id and
respondent_id taken together -- this is so as to prevent the same person
from responding to the same question more than once

when you declare a pk in most databases (including mysql), the database
implements this with a unique index, and it is this happenstance that leads
most people to believe that every table needs a pk -- in reality, it does
no such thing, and a table can quite happily live without a pk, and in many
cases, without a uniqueness constraint as well

i mean, stop and think about it -- if you declare a field as an autonumber,
why would you want to go to the trouble of declaring a uniqueness
constraint on it?  it's *gotta* be unique, or else your database is b0rken!

now, whether you need an index for performance reasons is an entirely
separate question

(leave aside for a moment the fact that most databases employ an index as
the most convenient way to implement a uniqueness constraint)

if you reference a particular field in a WHERE clause, chances are the
query will perform poorly without an index on that field

without an index, the database has no choice but to scan through every row
in the table

with an index, the WHERE clause can be resolved without accessing the
table, and then only those rows that qualify are retrieved from disk

indexing is not a big deal with small tables (a couple hundred rows),
because after the first read, all the rows of the entire table will
probably just sit in the database's buffer area

but if you're dealing with thousands or millions of rows, indexes are
important

so,

<tip type="database design">
declare indexes later, as part of physical implementation strategy, after
the logical design of the database has been completed (and tested)
</tip>

okay, to come back to primary keys for a second

most databases implement primary keys as unique indexes

take advantage of this

declare a pk for every table

forget about declaring foreign keys, unless your database implements
relational integrity


rudy





More information about the thelist mailing list