[thelist] MAX and GROUP BY question

Matt Warden mwarden at gmail.com
Wed Oct 24 20:01:34 CDT 2007

On 10/24/07, Noah St. Amand <noah at tookish.net> wrote:
> This query returns the most recent event_date for each client for program 1:
> "SELECT MAX(event_date), client_id FROM event WHERE program_id = 1 GROUP
> BY client_id"
> When I ask for status, though, in order to omit the 'COMPLETED' records
> programmatically, it returns the status of the oldest record for that
> client in that program, not the latest:
> "SELECT MAX(event_date), client_id, status FROM event WHERE program_id =
> 1 GROUP BY client_id"
> I still get the latest date back, but the status I get is not the one
> from that record, but rather the one from the earliest record for that
> client in that program. How can I get the status associated with the
> latest record?

That's because MySQL, in its infinite wisdom, is one of the few
databases that listens to the SQL standard on this point, which is
surprisingly silent. If you have:

event_date    client_id     status
10/19/2007        1       NOT STARTED
10/20/2007        1       IN PROGRESS
10/21/2007        2       NOT STARTED
10/25/2007        1       COMPLETED
10/26/2007        2       IN PROGRESS

And you run your query, which requests the maximum event_date and
*the* status for each client, then the question would be: what is the
status that should be displayed for client id 1 and client id 2? e.g.:

MAX(event_date)    client_id     status
10/25/2007              1       ??????
10/26/2007              2       ??????

There are three to pick from for client 1 and two to pick from for
client 2. You aren't telling the DB which you want. Every other
database I know will yell at you (but there is no mention of doing so
in the SQL standard) saying you must include 'status' in the group by
OR include it in an aggregate expression. I guess MySQL instead just
gives you the first occurs.

Back to your question, though. If you have the max event date for each
client, then you already have the key part. I believe this will get
the answer you're looking for:

select event_date, client_id, program_id
from event e1
where event_date = (
    select MAX(e2.event_date)
    from event e2
    where e2.program_id=e1.program_id
        and e2.client_id=e1.client_id
    and status != 'COMPLETED'
    and program_id = 1

You can even do a full report for all programs by taking out the last line.

Matt Warden
Cincinnati, OH, USA

This email proudly and graciously contributes to entropy.

More information about the thelist mailing list