Re: ORDER BY and DISTINCT ON - Mailing list pgsql-hackers

From Greg Stark
Subject Re: ORDER BY and DISTINCT ON
Date
Msg-id 87d6asw0rj.fsf@stark.dyndns.tv
Whole thread Raw
In response to ORDER BY and DISTINCT ON  (Neil Conway <neilc@samurai.com>)
Responses Re: ORDER BY and DISTINCT ON  (Neil Conway <neilc@samurai.com>)
Re: ORDER BY and DISTINCT ON  (Bruno Wolff III <bruno@wolff.to>)
Re: ORDER BY and DISTINCT ON  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Neil Conway <neilc@samurai.com> writes:

> We reject the following query:
> 
> nconway=# create table abc (a int, b int, c int);
> CREATE TABLE
> nconway=# select distinct on (a) a, b, c from abc order by b, c, a;
> ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY
> expressions

What would you expect to happen here?

Do you really want:

select distinct on (b,c,a) a,b,c from abc order by b,c,a;

or is that you want

select * from (select distinct on (a) a,b,c order by a) order by b,c,a;

Ie, pick a random record for each a and then sort by b,c?

Think of DISTINCT ON as a special form of GROUP BY that instead of doing
aggregate just returns the first record.

So, like DISTINCT ON, GROUP BY also insists on the user providing the ORDER BY
clause. I suppose you could argue postgres could implicitly introduce an extra
sort step when the user-provided ORDER BY doesn't match the GROUP BY or
DISTINCT ON clause but it seems like the user is probably confused if he
really wants a random record and then sort on columns that weren't sorted
previous to the DISTINCT ON.

-- 
greg



pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: WITH clause
Next
From: Greg Stark
Date:
Subject: Re: Walker/mutator prototype.