Re: SELECT DISTINCT ... ORDER BY problem - Mailing list pgsql-general

From David Rowley
Subject Re: SELECT DISTINCT ... ORDER BY problem
Date
Msg-id 5D54869CF0584B7A97A78E886D24F107@amd64
Whole thread Raw
In response to Re: SELECT DISTINCT ... ORDER BY problem  (Madison Kelly <linux@alteeve.com>)
List pgsql-general
Madison Kelly Wrote:
> David Fetter wrote:
> > On Mon, Dec 08, 2008 at 11:16:29PM -0000, David Rowley wrote:
> >>> -----Original Message-----
> >>> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> >>> owner@postgresql.org] On Behalf Of Madison Kelly
> >>> Sent: 08 December 2008 22:19
> >>> To: pgsql-general@postgresql.org
> >>> Subject: [GENERAL] SELECT DISTINCT ... ORDER BY problem
> >>>
> >>> Hi all,
> >>>
> >>>    I've got a table that I am trying to SELECT DISTINCT on one column
> >>> and ORDER BY on a second column, but am getting the error:
> >>>
> >>> SELECT DISTINCT ON expressions must match initial ORDER BY expressions
> >>>
> >>>    I can't add the second column to the DISTINCT clause because every
> >>> row is unique. Likewise, I can't add the first column to my ORDER BY
> as
> >>> it'd not sort the way I need it to.
> >>>
> >>>    Here is a simplified version of my query:
> >>>
> >>> \d table
> >>>                      Table "table"
> >>>       Column      |  Type   |                   Modifiers
> >>>
> >>> -----------------+---------+------------------------------------------
> ----
> >>> --
> >>>   tbl_id          | integer | not null default
> >>> nextval('tbl_seq'::regclass)
> >>>   foo             | text    |
> >>>   bar             | text    |
> >>>
> >>> SELECT DISTINCT ON (foo) foo, bar FROM table WHERE bar < '2008-12-07
> >>> 16:32:46' AND tbl_id=153 ORDER BY bar LIMIT 1;
> >>>
> >> To make the query valid you would have to ORDER BY foo,bar
> >> DISTINCT ON in this case is only going to show the first bar value for
> each
> >> foo.
> >>
> >> Is tbl_id not your PK and only giving 1 row anyway?
> >>
> >>>    I understand from:
> >>>
> >>> http://archives.postgresql.org/pgsql-sql/2007-02/msg00169.php
> >>>
> >>>    That this is not really possible because the any given 'foo' column
> >>> could match multiple 'bar' columns, so what do you search by? However,
> >>> it's made some sort of decision as a value is shown in 'bar' for each
> >>> 'foo'.
> >>>
> >>>    So my question is two-fold:
> >>>
> >>> 1. Can I not say, somehow, "sort all results by 'bar', and return the
> >>> first/last 'bar' for each distinct 'foo'?
> >>>
> >>> 2. Can I somehow say "Order the results using the value of 'bar' you
> >>> return, regardless of where it came from"?
> >> You can nest queries:
> >>
> >> SELECT foo,bar
> >> FROM (SELECT DISTINCT ON (foo) foo,
> >>                                Bar
> >>       FROM table
> >>       WHERE bar < '2008-12-07 16:32:46'
> >>         AND tbl_id=153 ORDER BY foo,bar
> >> ) AS t ORDER BY bar;
> >>
> >> Notice that I'm only applying the final order by in the outer query.
> >
> > When we get windowing functions, a lot of this pain will go away :)
> >
> > Cheers,
> > David.
>
> Oh?
>
>    I can't say I've been keeping up with what is in the pipes. What is
> windowing?
>

These are also known as analytical functions in some other database systems,
though by the standard they are known as window functions.

http://en.wikipedia.org/wiki/Select_(SQL)

It's worth a read. Hopefully we'll see this in 8.4.

David.


pgsql-general by date:

Previous
From: "David Rowley"
Date:
Subject: Re: SELECT DISTINCT ... ORDER BY problem
Next
From: "Adam Rich"
Date:
Subject: Re: SELECT DISTINCT ... ORDER BY problem