[thesite] friday is a good day..
rudy
r937 at interlog.com
Fri May 18 11:31:07 CDT 2001
> We need to rewrite the query so that it doesn't require the nested one.
i think you will find the alternative equally slow, because the alternative
is a LEFT JOIN with a GROUP BY on all the columns other than the Count(*)
i think the problem is the sheer volume of calls to the nested (correlated)
query, due to the number of articles returned
SELECT
content.contentid,
content.keyphrase,
content.contentname,
content.synopsis,
content.datemod,
content.replies,
WHERE
content.signoff = 1
AND content.categoryid <> 23
AND content.categoryid <> 26
AND content.categoryid <> 9741
i.e. all the active articles not in those three categories
so far so good, except for the fact that it will return about a thousand
articles
disclaimer: i don't know what the page is doing, whether it is caching the
results and/or how it is stepping through the "next 10 pages" so it might
be legitimate to go get all the articles -- but if the "next" type links do
*not* use cache and instead go and query the database again, then the above
is truly wasteful, and we should consider putting a date check on it to get
only the most recent articles or something
okay, now join to the user and category records
these joins go "from the many to the one" in the 1-to-many user-article and
category-article relationships, therefore these additional joins do not
increase the number of records in the result set --
SELECT
content.[columns],
users.[columns],
categorys.[columns]
FROM
content,
users,
categorys
WHERE
[content filters]
AND content.userid = users.userid
AND content.categoryid = categorys.categoryid
in anticipation of needing outer joins on this and other queries throughout
the site, we should try rewriting them using the sql92 syntax (or whatever
it's called) --
SELECT
content.[columns],
users.[columns],
categorys.[columns]
FROM
content
INNER JOIN users
ON content.userid = users.userid
INNER JOIN categorys
ON content.categoryid = categorys.categoryid
WHERE
[content filters]
now the way to add the nested query to the comments table would be as
follows --
SELECT
content.[columns],
users.[columns],
categorys.[columns],
Count(*) AS newcomments
FROM
content
INNER JOIN users
ON content.userid = users.userid
INNER JOIN categorys
ON content.categoryid = categorys.categoryid
LEFT JOIN replies
ON content.contentid = replies.contentid
WHERE
[content filters]
AND replies.datemod > ''
GROUP BY
content.[columns],
users.[columns],
categorys.[columns]
this is now going "from the one to the many" (if you know what i mean),
consequently you need the GROUP BY in order to keep the result set to one
row per article
i would be interested to see if oracle actually does better on this query
than the one using the nested correlated subquery, but my guess is it will
be just as slow
another alternative, and one that i personally think has a chance of maybe
performing really well, is to get the number of article comments in a
UNION -- but this requires that the page that is handling the result set be
changed, in order to handle the "extra rows" in the result set --
SELECT
content.contentid,
content.keyphrase,
content.contentname,
content.synopsis,
content.datemod,
content.replies,
users.[columns],
categorys.[columns]
FROM
content
INNER JOIN users
ON content.userid = users.userid
INNER JOIN categorys
ON content.categoryid = categorys.categoryid
WHERE
[content filters]
UNION ALL
SELECT
content.contentid,
' ',
'ZZZZZZZZZZZ',
' ',
content.datemod,
Count(replies.contentid)
, [more blanks or zeroes]
FROM
content
LEFT JOIN replies
ON content.contentid = replies.contentid
WHERE
[content filters]
GROUP BY
content.contentid,
content.datemod
ORDER BY
content.datemod DESC,
content.contentid DESC,
content.contentname DESC
with this approach, the page would use CFOUTPUT GROUP= and trap the
ZZZZZZZ row, which would come ahead of the "real" content row
the only advantage to doing this is that the GROUP BY is not performed in
the midst of a 4-way table join, rather only a 2-way join, which the
database engine might find a bit easier to do
however, my first attempt at query optimization would be to restrict the
query to returning no more rows than the page will display, if indeed the
"next 10 articles" feature does go back to the database...
does oracle support anything like microsoft's SELECT TOP or mysql's LIMIT?
rudy
More information about the thesite
mailing list