Re: Extracting superlatives - SQL design philosophy - Mailing list pgsql-performance
From | George Sexton |
---|---|
Subject | Re: Extracting superlatives - SQL design philosophy |
Date | |
Msg-id | 005a01cab59c$676fdb70$364f9250$@com Whole thread Raw |
In response to | Extracting superlatives - SQL design philosophy (Dave Crooke <dcrooke@gmail.com>) |
Responses |
Re: Extracting superlatives - SQL design philosophy
|
List | pgsql-performance |
You could do: select B.City, MaxCityTemp.Temp, min(B.Date) as FirstMaxDate from bar b INNER JOIN (select city,max(temp) as Temp from Bar group by City) as MaxCityTemp ON B.City=MaxCityTemp.City Group by B.City, MaxCityTemp.Temp George Sexton MH Software, Inc. http://www.mhsoftware.com/ Voice: 303 438 9585 > -----Original Message----- > From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance- > owner@postgresql.org] On Behalf Of Dave Crooke > Sent: Wednesday, February 24, 2010 2:31 PM > To: pgsql-performance > Subject: [PERFORM] Extracting superlatives - SQL design philosophy > > This is a generic SQL issue and not PG specific, but I'd like to get > an opinion from this list. > > Consider the following data: > > # \d bar > Table "public.bar" > Column | Type | Modifiers > --------+-----------------------------+----------- > city | character varying(255) | > temp | integer | > date | timestamp without time zone | > > # select * from bar order by city, date; > city | temp | date > -----------+------+--------------------- > Austin | 75 | 2010-02-21 15:00:00 > Austin | 35 | 2010-02-23 15:00:00 > Edinburgh | 42 | 2010-02-23 15:00:00 > New York | 56 | 2010-02-23 15:00:00 > New York | 78 | 2010-06-23 15:00:00 > (5 rows) > > If you want the highest recorded temperature for a city, that's easy > to do, since the selection criteria works on the same column that we > are extracing: > > # select city, max(temp) from bar group by city order by 1; > city | max > -----------+----- > Austin | 75 > Edinburgh | 42 > New York | 78 > (3 rows) > > > However there is (AFAIK) no simple way in plain SQL to write a query > that performs such an aggregation where the aggregation criteria is on > one column and you want to return another, e.g. adding the the *date > of* that highest temperature to the output above, or doing a query to > get the most recent temperature reading for each city. > > What I'd like to do is something like the below (and I'm inventing > mock syntax here, the following is not valid SQL): > > -- Ugly implicit syntax but no worse than an Oracle outer join ;-) > select city, temp, date from bar where date=max(date) group by city, > temp order by city; > > or perhaps > > -- More explicit > select aggregate_using(max(date), city, temp, date) from bar group by > city, temp order by city; > > Both of the above, if they existed, would be a single data access > followed by and sort-merge. > > The only way I know how to do it involves doing two accesses to the > data, e.g. > > # select city, temp, date from bar a where date=(select max(b.date) > from bar b where a.city=b.city) order by 1; > city | temp | date > -----------+------+--------------------- > Austin | 35 | 2010-02-23 15:00:00 > Edinburgh | 42 | 2010-02-23 15:00:00 > New York | 78 | 2010-06-23 15:00:00 > (3 rows) > > > # explain select * from bar a where date=(select max(b.date) from bar > b where a.city=b.city) order by 1; > QUERY PLAN > ----------------------------------------------------------------------- > --- > Sort (cost=1658.86..1658.87 rows=1 width=528) > Sort Key: a.city > -> Seq Scan on bar a (cost=0.00..1658.85 rows=1 width=528) > Filter: (date = (subplan)) > SubPlan > -> Aggregate (cost=11.76..11.77 rows=1 width=8) > -> Seq Scan on bar b (cost=0.00..11.75 rows=1 > width=8) -- would be an index lookup in a real scenario > Filter: (($0)::text = (city)::text) > (8 rows) > > -- > Sent via pgsql-performance mailing list (pgsql- > performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
pgsql-performance by date: