Re: select distinct w/order by - Mailing list pgsql-general

From Richard Huxton
Subject Re: select distinct w/order by
Date
Msg-id 200404011052.27905.dev@archonet.com
Whole thread Raw
In response to select distinct w/order by  ("John Liu" <johnl@emrx.com>)
List pgsql-general
On Wednesday 31 March 2004 22:13, John Liu wrote:
> Tom provided the same logic arguments. I also like the way 'simple is
> better' as long as it sticks with SQL requirement. But in practice, you
> have to face such issue even it's 'catch 22' which depends on the
> application design -

Tom's a smart fella, of course he agreed with me ;-)
I'd argue PG does deal with the issue, by refusing to do handle an unsafe
situation. The "catch 22" only happens if a database supports vague queries.

> For the your case -
> code                  codeid
>
> AAA                       20
> BBB                        5
> CCC                       10
> BBB                       45
> CCC                       27
>
> When issue "select distinct code from test1 order by codeid;"
> One of the database returns using their internal rule (at least it's
> constant itself) -
> code
>
> BBB
> CCC
> AAA

Are you sure it's consistent? If you didn't know which results you were going
to get before testing it, how do you know it's right. Maybe when you tested
it, perhaps you got lucky. And, if the "theoretical" arguments don't convince
you, here's something fairly practical. If the behaviour isn't defined, and
it just happens to work this way, what guarantee will you have that another
database, or event the next version of your current one will give you the
same order?

> It provides one of the arguable result sets.

PG could provide one, but which one, and why should that be the right choice?

> But think about another situation, the result is for sure -
> code                  code2                 codeid
>
> a1                    a                          1
> a2                    a                          2
> b1                    d                          3
> b2                    d                          4
> c1                    c                          5
> c2                    c                          6
>
> select distinct code2 from test2 order by codeid;
> code2
>
> a
> d
> c
>
> It's handy.

You can't have a feature that only works for some cases. In your example,
there are no overlapping codeid ranges on any given code2. This means you can
use either of the min/max sorts I mentioned. PG needs to know what "order by
codeid" means.

> I hope everything is black or white, but it's not. The user has the choice
> at least. But when I use PostgreSQL, I need find an alternative solution to
> handle such issue.

Everything is black and white. You should be able to run the same queries on
the same data in any two databases and get the same results. If you define
your query correctly, that should be the case. (Note it's not your fault the
query is poorly defined, at first glance it looks like a sensible thing to
do. The fact is though, that it's not and the other database shouldn't let
you do it).

Below is a real example from one of my projects - both orders are valid, but
they give different results. You will need to choose one.

promise=> SELECT gal_cnttype FROM lkp_gallery GROUP BY gal_cnttype ORDER BY
min(gal_code);
 gal_cnttype
-------------
 CLILOGO
 MMS
 OPPLOGO
 PICMSG
 MONO
 POLY
 JAVA
 BUNDLE
 AISCRIPT
(9 rows)

promise=> SELECT gal_cnttype FROM lkp_gallery GROUP BY gal_cnttype ORDER BY
max(gal_code);
 gal_cnttype
-------------
 BUNDLE
 JAVA
 MONO
 POLY
 AISCRIPT
 CLILOGO
 MMS
 OPPLOGO
 PICMSG
(9 rows)


--
  Richard Huxton
  Archonet Ltd

pgsql-general by date:

Previous
From: Janning Vygen
Date:
Subject: Checking FKs after COPY and disabled Triggers
Next
From: Oleg Bartunov
Date:
Subject: Re: Wich hardware suits best for large full-text indexed