SELECT DISTINCT ON ... ORDER BY ... - Mailing list pgsql-sql

From Thomas Metz
Subject SELECT DISTINCT ON ... ORDER BY ...
Date
Msg-id 36B05F8F.DFB94ED8@gsf.de
Whole thread Raw
Responses Re: [SQL] SELECT DISTINCT ON ... ORDER BY ...  ("Sergei M. Suntsov" <serge@uic.nsu.ru>)
Re: [SQL] SELECT DISTINCT ON ... ORDER BY ...  ("Oliver Elphick" <olly@lfix.co.uk>)
List pgsql-sql
I have the following problem:

Assuming the table TEST as follows:

ID     NAME
-----------------
1      Alex
2      Oliver
1      Thomas
2      Fenella


SELECT DISTINCT ON id id, name FROM test;
produces:

ID     NAME
-----------------
1      Alex
2      Oliver


SELECT DISTINCT ON id, name FROM test ORDER BY name;
produces:

ID     NAME
-----------------
1      Alex
2      Fenella
1      Thomas

I would have expected only two rows in both queries. I don't care which
names actually appear in the output as long as they are sorted, but
there should no longer be duplicate id's.

If the table is as follows:

ID     NAME
-----------------
2      Oliver
2      Alex
1      Thomas
1      Fenella


SELECT DISTINCT ON id id, name FROM test;
produces:

ID     NAME
-----------------
1      Thomas
2      Oliver


SELECT DISTINCT ON id, name FROM test ORDER BY name;
produces:

ID     NAME
-----------------
2      Alex
1      Fenella
2      Oliver
1      Thomas

What seems to happen is that from the sorted table, duplicate id's are
eliminated only if they are grouped. If there is no explicit sorting I
assume the DISTINCT performs an implicit sorting on id and then
eliminates records correctly. Is that the correct behaviour? Is there
another (simple) way of getting the results I want?

I am still using version 6.3

tm
--
Thomas Metz
GSF - National Research Center for Environment and Health
Institute of Mammalian Genetics

pgsql-sql by date:

Previous
From: Marcus Better
Date:
Subject: Re: [SQL] Serial numbers
Next
From: "Sergei M. Suntsov"
Date:
Subject: Re: [SQL] SELECT DISTINCT ON ... ORDER BY ...