[thelist] pretty databases vs. fast databases

rudy r937 at interlog.com
Sun Mar 9 23:08:01 CST 2003


hi emma

> left join cats_article on cats_article.article = meta_articles.id <--
> allows multilingual content or content sorted by language
> (one "article" can go into multiple categories/forums)

be careful here

when you join a "one" table, like the article table, with a "many" table,
like article categories, the resulting rows will have the "one"
columns repeated

this is obvious, right?  it is the way a one-to-many query works

however, it is disastrous when the "one" columns are large, as
they would be with an article body, typically a large varchar
or nvarchar or text or blob or what have you

you really want to get the body only once per article

this is a classic example of when it's better to split a query

on the other hand, splitting imposes an overhead that may not
be warranted by the actual data -- what if only one article in
a hundred actually belongs to more than one category, while
the rest all belong to only one?  then a single query is better,
and you live with the idea of transmitting the body from the
database to the frontend program multiple times, because it
happens to a rare minority of articles

> Here's my question: how do people feel about "wrecking"
> database normalization and sticking (e.g.) author names
> into the meta_comments table in addition to the author ID.

well, i looked over your query, and it appears that this is already
where the author name is coming from

and no, i don't like "wrecking" normalization, as a general rule

> It would reduce the query by one join.

in general, this is not something you worry about in the design phase
of a project, but rather in the load testing phase

i'm being totally serious

databases are designed to fulfill an information requirement

design the relationships properly, and don't "wreck" any until you are
forced to

this will only be revealed in volume testing


now, the advice i gave, about splitting a query into two, that has nothing
to do with normalization, and, in general, those types of things *are*
important to anticipate in the design phase of a project


helps?

rudy



More information about the thelist mailing list