[thelist] Problem with distinct

Tab Alleman talleman at Lumpsum.com
Fri Sep 7 16:47:30 CDT 2007


Use GROUP BY instead of SELECT DISTINCT?  Use aggregates on the non-grouped columns?

> -----Original Message-----
> From: thelist-bounces at lists.evolt.org
> [mailto:thelist-bounces at lists.evolt.org]On Behalf Of Bill Moseley
> Sent: Friday, September 07, 2007 5:05 PM
> To: thelist at lists.evolt.org
> Subject: [thelist] Problem with distinct
> 
> 
> This has come back to haunt me again.
> 
> I'm selling widgets, and each widget can have more than one size and
> sometimes more than one color.
> 
> So, as you would expect, a list of widgets will include the
> same widget more than once -- for each color and size.
> 
> We have a view that joins the tables, which is helpful:
> 
>     test=> select * from product_view;
>      id | widget_name | color |  size  
>     ----+-------------+-------+--------
>       1 | widget A    | red   | large
>       1 | widget A    | red   | medium
>       1 | widget A    | red   | small
>       1 | widget A    | blue  | large
>       1 | widget A    | blue  | medium
>       1 | widget A    | blue  | small
>       2 | widget B    | red   | large
>       2 | widget B    | red   | medium
>       2 | widget B    | red   | small
>       2 | widget B    | blue  | large
>       2 | widget B    | blue  | medium
>       2 | widget B    | blue  | small
>     (12 rows)
> 
> Now, the discussion goes like this:
> 
> Client:  "I want a list of widgets, and I want people to be able to
> select just "red" widgets, but I don't want to list a widget more
> than once".
> 
> 
> Well, you can't do this, of course, because you get the same widget
> more than once due to the multiple sizes:
> 
>     test=> select * from product_view where color = 'red';
>      id | widget_name | color |  size  
>     ----+-------------+-------+--------
>       1 | widget A    | red   | small
>       1 | widget A    | red   | medium
>       1 | widget A    | red   | large
>       2 | widget B    | red   | small
>       2 | widget B    | red   | medium
>       2 | widget B    | red   | large
>     (6 rows)
> 
> 
> Ok, so this is probably the way to go:
> 
>     SELECT * FROM product WHERE product.id IN
>         (
>             SELECT id FROM product_view WHERE color = 'red'
>         )
>     ORDER BY widget_name;
> 
>      id | widget_name 
>     ----+-------------
>       1 | widget A
>       2 | widget B
>     (2 rows)
> 
> 
> But here's the rub:
> 
> Client says:  "I want to be able to sort the widget list by 
> its color".
> 
> Developer: "But you can't do that because a widget might have more
> than one color.  Which color would would you sort by?"
> 
> Client: "90% of our widgets only have one color, so sort by 
> that color.  For
> the very few that have more than one color I don't care which 
> one is used
> for sorting."
> 
> 
> Hum.  Don't like using DISTINCT, but give it a try.  It would be handy
> to use that VIEW "product_view", after all, since there's a color
> column to sort by:
> 
>     test=> select distinct on (id) * from product_view;
>      id | widget_name | color | size  
>     ----+-------------+-------+-------
>       1 | widget A    | red   | large
>       2 | widget B    | red   | large
>     (2 rows)
> 
> Cool!  But wait:
> 
>     test=> select distinct on (id) * from product_view order by color;
>     ERROR:  SELECT DISTINCT ON expressions must match initial 
> ORDER BY expressions
> 
> Darn, that's not very helpful.  What else to do?
> 
> 
> 
> 
> 
> 
> -- 
> Bill Moseley
> moseley at hank.org
> 
> -- 
> 
> * * Please support the community that supports you.  * *
> http://evolt.org/help_support_evolt/
> 
> For unsubscribe and other options, including the Tip Harvester 
> and archives of thelist go to: http://lists.evolt.org 
> Workers of the Web, evolt ! 
> 



More information about the thelist mailing list