Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace - Mailing list pgsql-sql

From Tom Lane
Subject Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace
Date
Msg-id 18389.948912054@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace  (Peter Eisentraut <peter_e@gmx.net>)
Responses Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-sql
Peter Eisentraut <peter_e@gmx.net> writes:
> Our documents say that DISTINCT ON is equivalent to GROUP BY. I still
> don't see why that wouldn't be true. You can always rewrite
> select distinct on a a,b from test
> as
> select a, xxx(b) from test group by a
> where xxx is some aggregate function (presumably min or max).

Not really.  Look at Julian's example.  He can't rewrite as
select a, min(b), min(c) from test group by a

because the idea is to get the c that corresponds to the min b.
If you do it with two independent aggregates then the b and c
you get back may be from different tuples.

I could imagine fixing this with a two-input aggregate, say
select a, min(b), keyofmin(b, c) from test group by a

where keyofmin is defined to return the c associated with the min b.
But that'd be a pain to implement, first because we have no support
for multi-argument aggregates, and second because you'd need a ton
of separate keyofmin implementations for the cross-product of the
data types you might want to deal with.  So this is nearly as
klugy as the SELECT DISTINCT ON approach --- and not any more
standard, either.
        regards, tom lane


pgsql-sql by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace
Next
From: "Peter Bojanic"
Date:
Subject: Help understanding how indexes are used by the query optimizer