On Fri, 12 Sep 2003, Johnson, Shaunn wrote:
> Howdy:
>
> Can someone tell what the difference (and why
> you would use it) is between the following:
>
> [snip]
> select distinct on (col_1, col_2),
> col_1,
> col_2,
> col_3
> from t_table
>
> --
>
> select distinct
> col_1,
> col_2,
> col_3
> from t_table
> [/snip]
>
> In the first example, is it just getting
> the unique rows for the first two columns?
In the first, for each set of rows that have a distinct col1,col2
value it's taking one of those rows and using its col3 value.
It's like group by, but less restrictive since you don't need
to use a set function on col_3.
In general distinct on in that fashion is most usable when
combined with an order by so that you can get a particular row
from each set. For example, you might say do something like:
select distinct on (col1, col2) col1, col2, col3 from t_table
order by col1, col2, col4;
In this case you should get the col3 value for each col1,col2
distinct group that corresponds to the row having the lowest col4
value.