On Sun, Dec 14, 2003 at 22:17:35 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Bruno Wolff III <bruno@wolff.to> writes:
> > If the sort used to select the records sorts on both the distinct
> > expressions and the order by expressions you will get a sensible
> > deterministic result.
>
> Sensible in what sense? ;-)
Doing things as above is pretty much the same as normal distinct on
for purposes of which rows get selected. Of the possible rows that
might get returned for a specific set of values from the distinct on
expressions you will get the row that is first as ordered by the
expressions in the order by clause. If the order by clause isn't selective
enough there may be several rows that could be selected, but that is true
for how distinct on works now.
> It seems to me that the existing documentation defines the behavior of
> DISTINCT ON as selecting the row within each DISTINCT ON group that is
> first according to the ORDER BY columns that are less significant than
> the DISTINCT ON keys. Perhaps this is not clear enough and should be
> clarified. But it doesn't seem very useful to me to extend the behavior
> to allow other cases ... what are you really buying if you do so, and
> what will it cost in execution time?
What it buys is saving a bit of typing. The main slow down would be in trying
to determine whether you needed to use two sorts or just one in a particular
case. This shouldn't cost too much time, but might be an issue if there
are a lot of columns in both the distinct on and order by.
I do aggree that this isn't an area I would like to see developers working on.
I would much rather see developers work on things that add real functionallity
rather that that save some typing in unusual cases.