Re: [SQL] Finding the "most recent" rows - Mailing list pgsql-sql

From Tom Lane
Subject Re: [SQL] Finding the "most recent" rows
Date
Msg-id 17966.924794971@sss.pgh.pa.us
Whole thread Raw
In response to Re: [SQL] Finding the "most recent" rows  (Chris Bitmead <chris.bitmead@bigfoot.com>)
List pgsql-sql
Chris Bitmead <chris.bitmead@bigfoot.com> writes:
> Julian Scarfe wrote:
>> I'd like an efficient way to pull out the most recent row (i.e. highest
>> datatime) belonging to *each* of a number of places selected by a simple
>> query.
>
> Try
> SELECT the_place, max(the_time) FROM the_place GROUP BY the_place;

But I'll bet he wants the whole row containing the max time, not just
the place and time columns.  I've run into similar problems and never
felt like I had a clean solution, either.

You could do something like

SELECT * FROM table AS t1 WHERE NOT
EXISTS(SELECT * FROM table AS t2 WHERE t2.place = t1.place AND t2.time > t1.time);

but this is ugly, and probably horribly inefficient as well.  (It might
not be unacceptably slow if the table has indexes on place and time,
but it sure looks like a brute-force approach.)

What you'd really like is something like a SELECT DISTINCT with a user-
specifiable row comparison operator; then you'd just "ORDER BY place, time"
and make a comparator that discards all but the last row for each place
value.  Hmm ... a little experimentation suggests that

SELECT DISTINCT ON place * FROM table ORDER BY place, time DESC;

might do the right thing.  It *seems* to select the first row for each
value of place.  I've never seen a spec for this feature, however, so
I'm not sure if it's reliable or not...
        regards, tom lane


pgsql-sql by date:

Previous
From: Michael J Davis
Date:
Subject: RE: [SQL] Finding the "most recent" rows
Next
From: Kyle Bateman
Date:
Subject: sum of two queries