Re: [SQL] Slashdot Query - Mailing list pgsql-sql

From Tom Lane
Subject Re: [SQL] Slashdot Query
Date
Msg-id 16847.928506113@sss.pgh.pa.us
Whole thread Raw
In response to Slashdot Query  (Chris Bitmead <chris.bitmead@bigfoot.com>)
List pgsql-sql
Chris Bitmead <chris.bitmead@bigfoot.com> writes:
> SELECT DISTINCT category.oid, category.title, category.image FROM story,
> category* WHERE story.category = category.oid AND story.approved ORDER
> BY datetime DESC LIMIT 5;

> The trouble is it doesn't return distinct results, but rather it returns
> duplicates.

This is a known problem --- SELECT DISTINCT implies sorting by the
fields that are being "distincted" on, since the actual duplicate-
elimination relies on a uniq(1)-like adjacent-duplicates filter.
Forcing a sort by a different field breaks the duplicate eliminator.

It is not real clear what the correct behavior is, which is why
nothing's been done about it; we've gone round on the issue a couple
of times (see the pghackers archives).  My own thought is that the
query as given above should be illegal, since there is no unique
value of datetime to go with a "distinct" set of oid, title, image.

You might consider using GROUP BY rather than DISTINCT if you want
to order the results in a particular way, saySELECT oid, title, image ...    GROUP BY oid, title, image ORDER BY
min(datetime)
where you use an aggregate like min or max to resolve the ambiguity
about which datetime to associate with a particular group...
        regards, tom lane


pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: [SQL] Mail about select
Next
From: Chris Bitmead
Date:
Subject: Re: [SQL] Slashdot Query