Thread: Possible parser bug? .... Re: Picking out the most recent row using a time stamp column

Possible parser bug? .... Re: Picking out the most recent row using a time stamp column

From

Dave Crooke

Date:

24 February 2011, 19:53:33

P.S. I noticed inadvertently (by making a typo ;-) that not all of the columns in the DISTINCT ON are required to be part of the output, in which case it appears to reduce the DISTINCT ON to the columns that are represented .... in my real world situation, "id_key" is actually composed of 3 columns, and I made a typo like the following (in which I've tweaked the spacing to highlight the missing comma:

select distinct on (a,    b,    c)
a,   b c,   time_stamp,    value
from data
order by a, b, c, time_stamp desc;

The output produced is the same as this query:

select distinct on (a,    b)
a,   b,   time_stamp,    value
from data
order by a, b, time_stamp desc;

Not sure if this is considered a parser bug or not, but it feels slightly odd not to get an error.

PG 8.4.7 installed from Ubuntu 10.04's 64-bit build.

Cheers
Dave

On Thu, Feb 24, 2011 at 5:38 PM, Dave Crooke <dcrooke@gmail.com> wrote:

Thanks to all .... I had a tickling feeling at the back of my mind that there was a neater answer here. For the record, times (all from in-memory cached data, averaged over a bunch of runs):

Dependent subquery = 117.9 seconds
Join to temp table = 2.7 sec
DISTINCT ON = 2.7 sec

So the DISTINCT ON may not be quicker, but it sure is tidier.

Cheers
Dave

On Thu, Feb 24, 2011 at 2:24 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
Michael Glaesemann <grzm@seespotcode.net> wrote:

> SELECT DISTINCT ON (data.id_key)
> data.id_key, data.time_stamp, data.value
> FROM data
> ORDER BY data.id_key, data.time_stamp DESC;

Dang! I forgot the DESC in my post! Thanks for showing the
*correct* version.

-Kevin

Re: Possible parser bug? .... Re: Picking out the most recent row using a time stamp column

From

Jochen Erwied

Date:

24 February 2011, 20:23:02

Friday, February 25, 2011, 12:53:08 AM you wrote:

> select distinct on (a,    b,    c)
> a,   b c,   time_stamp,    value

Without the comma, you declare 'b AS c'

> from data
> order by a, b, c, time_stamp desc;

> The output produced is the same as this query:

> select distinct on (a,    b)
> a,   b,   time_stamp,    value
> from data
> order by a, b, time_stamp desc;

the 'c' is optimized away, since it is an alias for b, and thus redundant
for the distinct.

> Not sure if this is considered a parser bug or not, but it feels slightly
> odd not to get an error.

No error, just plain SQL :-)



--
Jochen Erwied     |   home: jochen@erwied.eu     +49-208-38800-18, FAX: -19
Sauerbruchstr. 17 |   work: joe@mbs-software.de  +49-2151-7294-24, FAX: -50
D-45470 Muelheim  | mobile: jochen.erwied@vodafone.de       +49-173-5404164