[thelist] pretty databases vs. fast databases

Emma Jane Hogbin emmajane at xtrinsic.com
Mon Mar 10 00:52:00 CST 2003


rudy:

Thanks for your feedback! I omitted some very important details,
which I thought were obvious. There are no "text" columns pulled in that
query. In fact, there are no "text" columns in the entire set of tables
EXCEPT the content table! It's a crazy way of doing it, but it's working
really nicely...so far
Here are descriptions of the tables in question:

Anything that is an INT or TINYINT is a foreign key! (Unless of course
it's prefaced by num_)

mysql> describe meta_articles;
+------------------+------------+------+-----+---------------------+----------------+
| Field            | Type       | Null | Key | Default             | Extra          |
+------------------+------------+------+-----+---------------------+----------------+
| id               | int(11)    |      | PRI | NULL                | auto_increment |
| author           | int(11)    |      |     | 0                   |                |
| content          | int(11)    | YES  |     | NULL                |                |
| status           | tinyint(4) | YES  |     | NULL                |                |
| created          | datetime   |      |     | 0000-00-00 00:00:00 |                |
| modified         | datetime   | YES  |     | NULL                |                |
| comments_allowed | tinyint(4) | YES  |     | NULL                |                |
| num_replies      | int(11)    | YES  |     | NULL                |                |
| last_comment_id  | int(11)    | YES  |     | NULL                |                |
| num_displayed    | tinyint(4) | YES  |     | NULL                |                |
| views            | int(11)    | YES  |     | NULL                |                |
| edits            | tinyint(4) | YES  |     | NULL                |                |
+------------------+------------+------+-----+---------------------+----------------+
12 rows in set (0.00 sec)

NOTE: author is NOT the name of an author. It's a FK to login.

mysql> describe login;
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| logid      | int(11)      |      | PRI | NULL    | auto_increment |
| username   | varchar(20)  | YES  |     | NULL    |                |
| password   | varchar(20)  | YES  |     | NULL    |                |
| email      | varchar(50)  | YES  |     | NULL    |                |
| phone      | varchar(50)  | YES  |     | NULL    |                |
| campus     | varchar(100) | YES  |     | NULL    |                |
| last       | varchar(50)  | YES  |     | NULL    |                |
| first      | varchar(50)  | YES  |     | NULL    |                |
| middle     | varchar(50)  | YES  |     | NULL    |                |
| pref_name  | varchar(50)  | YES  |     | NULL    |                |
| categoryid | int(11)      | YES  |     | NULL    |                |
| studentid  | int(11)      | YES  |     | NULL    |                |
| firstlang  | varchar(50)  | YES  |     | NULL    |                |
| created    | datetime     | YES  |     | NULL    |                |
| last_visit | datetime     | YES  |     | NULL    |                |
| last_ip    | varchar(15)  | YES  |     | NULL    |                |
| parent     | int(11)      | YES  |     | NULL    |                |
*** this is so that people can have multiple "ego"s for different groups
or classes or whatever unit of division you want. I thought about putting
"ego"s into a different table, but then decided it would be harder to
program because I'd have to figure out if it was a "person" or one of
their "ego"s that was posting something. This way everyone just gets a
logid entry (author) and then if the logid has a parent I know it's an
ego. Right now "guests" have their own table wchich just has id, name,
email, URL and IP connected from.

| timezone   | varchar(10)  | YES  |     | NULL    |                |
| locale     | int(11)      | YES  |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+
19 rows in set (0.00 sec)



mysql> describe content;
+------------------+--------------+------+-----+---------+----------------+
| Field            | Type         | Null | Key | Default | Extra          |
+------------------+--------------+------+-----+---------+----------------+
| id               | int(11)      |      | PRI | NULL    | auto_increment |
| en_title         | varchar(255) | YES  |     | NULL    |                |
| en_text          | text         | YES  |     | NULL    |                |
| fr_title         | varchar(255) | YES  |     | NULL    |                |
| fr_text          | text         | YES  |     | NULL    |                |
| personal_title   | varchar(255) | YES  |     | NULL    |                |
| personal_content | text         | YES  |     | NULL    |                |
| created          | datetime     | YES  |     | NULL    |                |
| updated          | datetime     | YES  |     | NULL    |                |
+------------------+--------------+------+-----+---------+----------------+
9 rows in set (0.00 sec)

I'm only pulling the _title from here, no "text" columns.


mysql> describe meta_comments;
+-------------+------------+------+-----+---------+----------------+
| Field       | Type       | Null | Key | Default | Extra          |
+-------------+------------+------+-----+---------+----------------+
| id          | int(11)    |      | PRI | NULL    | auto_increment |
| author      | int(11)    | YES  |     | NULL    |                |
| guest       | int(11)    | YES  |     | NULL    |                |
| subject     | int(11)    | YES  |     | NULL    |                |
| content     | int(11)    | YES  |     | NULL    |                |
| comment     | text       |      |     |         |***will be phased out
							in about 5 hours
							and replaced by above column
							which is an FK to
							"content"
| created     | datetime   | YES  |     | NULL    |                |
| updated     | datetime   | YES  |     | NULL    |                |
| parent      | int(11)    | YES  |     | NULL    |                |
| status      | tinyint(4) | YES  |     | NULL    |                |
| num_replies | int(11)    | YES  |     | NULL    |                |
| views       | int(11)    | YES  |     | NULL    |                |
| edits       | tinyint(4) | YES  |     | NULL    |                |
+-------------+------------+------+-----+---------+----------------+
13 rows in set (0.00 sec)

"content" is a FOREIGN KEY to the CONTENT table, not the actual body of text.

mysql> describe cats_article;
+---------+---------+------+-----+---------+-------+
| Field   | Type    | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+-------+
| cat     | int(11) |      |     | 0       |       |
| article | int(11) |      |     | 0       |       |
+---------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)


On Mon, Mar 10, 2003 at 12:09:00AM -0500, rudy wrote:
> > left join cats_article on cats_article.article = meta_articles.id <--
> > allows multilingual content or content sorted by language
> > (one "article" can go into multiple categories/forums)
>
> be careful here
>
> when you join a "one" table, like the article table, with a "many" table,
> like article categories, the resulting rows will have the "one"
> columns repeated
> this is obvious, right?  it is the way a one-to-many query works

Yes it is obvious -- and all cats_article columns will be ignored because
of the following:
	where cats_article.cat = 2
It's also why I don't have it as part of my select, just part of the
conditions.


> however, it is disastrous when the "one" columns are large, as
> they would be with an article body, typically a large varchar
> or nvarchar or text or blob or what have you

There are no "text" columns selected in that query.

> > Here's my question: how do people feel about "wrecking"
> > database normalization and sticking (e.g.) author names
> > into the meta_comments table in addition to the author ID.
>
> well, i looked over your query, and it appears that this is already
> where the author name is coming from

No, it is not. See the actual tables. My question was whether or
not I /should/ put the author names in instead of just the FK to the
login table. I've had two recommendations (yours and one other) to leave
the author as only an FK. I thought it might improve the speed to
have the author name there as well. Apparently not. ;)

> > It would reduce the query by one join.
>
> in general, this is not something you worry about in the design phase
> of a project, but rather in the load testing phase
> design the relationships properly, and don't "wreck" any until you are
> forced to

Cool. That's what I needed to know.

> helps?

The last bit does. the rest doesn't really apply. Below is the original
query I sent to the list.

select
        meta_articles.id as id,
        content.en_title as subject, <--- language changes here by locale
                e.g. fr_title, it_title, ru_title (etc).
                The "content" table holds titles and content for virtually
                all text in the site (comments, articles, categories). It
                means that when a new locale is added only one table is
                updated and queries should be very easy to update. Simply
                change the locale for **_title and **_text. There's also
                an option for "personal annotations" in the locale/content
                table. I also have a setting for personal/public at the
                article/comment level and am not entirely sure which I'll
                use.
        meta_articles.created as created,
        meta_comments.updated as updated,
        meta_comments.author as who
from meta_articles
left join meta_comments on meta_comments.id = meta_articles.last_comment_id
left join cats_article on cats_article.article = meta_articles.id <--
                allows multilingual content or content sorted by language
                (one "article" can go into multiple categories/forums)
left join content on meta_articles.content = content.id
left join login on meta_comments.author = login.logid
where cats_article.cat = 2
order by meta_articles.created DESC


I've added one additional line to that query which checks to make sure an
en_title is set (i.e. != '' and is not NULL).

I'm really starting to dig the way the interface is shaping up. Right now
it looks something like this:

Is Canada drawing nearer to the United States?
Latest reply: 2003-03-09 16:22:53 By: Emma
Translation: en | fr |

And then you click on either the "en" or the "fr" to get the translated
text.

Entries that have no translation looks like this:
Should it be slowed down?
Latest reply: 2003-03-09 16:22:53 By: Emma
Translation: en |


Right now the link actually gives you all articles in the "translated"
category. I have to check with my client to see how they actually want the
interface to work. I've got a bunch of different functions that spit out
translated URLs for different units of content (e.g. categories,
articles).

Anything that resembles a good idea came from www.drupal.org. Anything
that resembles a bad idea is purely my own. ;)

emma

--
Emma Jane Hogbin
[[ 416 417 2868 ][ www.xtrinsic.com ]]



More information about the thelist mailing list