Re: [SQL] SELECT DISTINCT and ORDER BY - Mailing list pgsql-sql

From Tom Lane
Subject Re: [SQL] SELECT DISTINCT and ORDER BY
Date
Msg-id 19160.937405597@sss.pgh.pa.us
Whole thread Raw
In response to SELECT DISTINCT and ORDER BY  (Joerg Eriskat <Eriskat@icf.med.uni-muenchen.de>)
List pgsql-sql
Joerg Eriskat <Eriskat@icf.med.uni-muenchen.de> writes:
> I'm not sure it is a bug, so I'm asking the SQL gurus.  I think
> there is a inconsistency with SELECT DISTINCT queries when used in
> combination with ORDER BY and column alias names (sorry for the rather
> long example):      

Looks like a bug to me.  This seems to be fixed in the current sources,
probably as a result of the changes I made to the representation of sort
ordering.  What is happening in 6.5.* is that the system is failing to
notice that the order-by values are the same as the ones being called
out as DISTINCT, so it generates additional columns and then sorts on
those instead of the DISTINCT ones.  6.6-to-be isn't fooled by the
aliases.

The underlying problem is still there, though: ordering by a column
not mentioned in the DISTINCT list doesn't do the right thing.
Arguably there is no right thing, and the code should raise an error,
but as of now it definitely isn't doing anything reasonable.  It would
take two sort steps to handle this: sort by the DISTINCT columns,
filter out adjacent duplicates, then sort by the ORDER BY columns.
But we only do one sort, and we use the ORDER BY columns (plus any
DISTINCT columns not already in ORDER BY).

However, there's more to this than meets the eye.  If you examine any
non-DISTINCT column after the filter step, then you are going to get a
randomly chosen one of the values associated with that set of DISTINCT
values.  So a second sort step on such a column after the DISTINCT
sort/filter would not give well-defined results.  That's why I think
maybe we should raise an error.  (I don't like "DISTINCT ON column"
either, since it is *necessarily* ill-defined.)
        regards, tom lane


pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] Permission problem with COPY FROM
Next
From: Tom Lane
Date:
Subject: ...