[thelist] Need help with difficult MySQL query
rudy
r937 at interlog.com
Fri Jul 12 20:32:01 CDT 2002
hi simon
congrats for rolling your own cms
i understand your predicament, and have a couple of ideas
in your query,
select entries.*, categories.*
from entries, categories, ent_cat_lookup
where entries.id = ent_cat_lookup.entryId
and categories.shortName = ent_cat_lookup.categoryShortName
order by entries.dateAdded desc
presumable you have limit 0,19 on this, yes? note that
this will be 20 result rows, including "duplicates" for
entries in more than one category, so actually there could
be fewer than 20 distinct entries
here are the alternatives as i see them
1. accept that occasional entries will be in more than one
category, and continue to return results using the above query
process the result set in a loop which is written to detect
the duplicate entries, collecting the multiple categories
for a given entry until it's gone past the last row for that
entry (this is prev/next logic which should be straightforward)
the downside is that large entries take up a lot of space in
the result set being passed back from the database, so this might
slow down the result set retrieval
but if you're talking about 20 rows, you would probably never
notice the performance hit
also, if your intention is to print 20 at a time, you should
probably go ahead and bring back 30 or so, and stop printing
after you've done 20 distinct entries in that loop
2. break the query up into two steps
select *
from entries
order by dateAdded desc
limit 0,19
pull out the entry ids into a comma-delimited list, $idlist
(i don't do php -- is that how you name a string variable?)
and do another query (note to john: you almost never
should do queries in a loop!)
select entries.id, categories.*
from entries
inner join ent_cat_lookup
on entries.id = ent_cat_lookup.entryId
inner join categories
on ent_cat_lookup.categoryShortName = categories.shortName
where entries.id in ( $idlist )
order by entries.id
note that mysql supports the standard sql join syntax, which
is preferable because it's easier to write and to comprehend
the 2-step strategy eliminates duplication in the entries, and
ensures you get exactly 20 distinct entries
it's probably marginally slower because it requires two calls to
the database (not always a bad thing, and often unavoidable in mysql
when a subquery would have been involved) but that might be offset by
faster processing of the entries query
however, it places an extra burden on your programming, because now
you have to match entries to their (multiple) categories yourself
this might not seem insurmountable at first, but will have greater
and greater impact if the queries change
for instance, suppose you wanted to bring back (multiple) comments
with each entry?
luckily, you can still do this in two steps -- one to get entries plus
comments, and the second to get categories per entry, but the matching
logic would get more complex
the one-step process would at this point be undesirable -- the duplication
involved in entries plus multiple comments plus multiple categories is not
something i'd want to ship over from the database in one result set
there are other solutions, involving bastardizing the lovely (and
quite proper) many-to-many table design you have there, so let's
not look at those
hoping this helps
continued success with your blog
rudy
More information about the thelist
mailing list