Thread: Stuck in "group by" aggregate hell
I've got a table with the following schema: Phone_num latlon location_when Each row holds a user's phone number, their location, and the time they were at that location. There can be up to 120 rows per phone_num, each with a (potentially) different latlon, and each with a different location_when (the primary key). My problem - how do you get a list of each phone_num's most recent position and time? I want to weed out everything but a user's most recent location, returning only one line per user. Here's an example: Phone_num latlon location_when 1111111111 22.12345,-90.12345 0901 1111111111 22.11111,-89.45678 0911 1111111111 21.99999,-89.55555 0921 2222222222 18.12334,-120.12345 1156 2222222222 18.10101,-120.11111 1206 2222222222 18.00001,-120.34889 1216 Given this, I want a list like this: 1111111111 21.99999,-89.55555 0921 2222222222 18.00001,-120.34889 1216 Obviously, it's something along these lines: Select *,min(age(now(),location_when)) From table Group by phone_num; Unfortunately, Postgres wants me to group by latlon and location_when, either of which makes each row a unique entity and causes me problems. I'd prefer to not use temp tables, but at this point I'll take any pointers I can get. Intersect? Some form of outer join with the same table? Thanks in advance! Bret
On Mon, 7 Feb 2005, Schuhmacher, Bret wrote: > I've got a table with the following schema: > Phone_num latlon location_when > > Each row holds a user's phone number, their location, and the time they > were at that location. There can be up to 120 rows per phone_num, each > with a (potentially) different latlon, and each with a different > location_when (the primary key). > > My problem - how do you get a list of each phone_num's most recent > position and time? I want to weed out everything but a user's most > recent location, returning only one line per user. > > Here's an example: > > Phone_num latlon location_when > 1111111111 22.12345,-90.12345 0901 > 1111111111 22.11111,-89.45678 0911 > 1111111111 21.99999,-89.55555 0921 > 2222222222 18.12334,-120.12345 1156 > 2222222222 18.10101,-120.11111 1206 > 2222222222 18.00001,-120.34889 1216 > > > Given this, I want a list like this: > 1111111111 21.99999,-89.55555 0921 > 2222222222 18.00001,-120.34889 1216 > > > > Obviously, it's something along these lines: > Select *,min(age(now(),location_when)) > From table > Group by phone_num; > > Unfortunately, Postgres wants me to group by latlon and location_when, > either of which makes each row a unique entity and causes me problems. > > I'd prefer to not use temp tables, but at this point I'll take any > pointers I can get. Intersect? Some form of outer join with the same > table? I believe the SQL way is to correlate the outside with a subquery so if just using the maximum location_when were sufficient (and there aren't nulls) I think you could do something like: select * from table where (phone_num, location_when) in (select phone_num, max(location_when) from table group by phone_num); In PostgreSQL, there's an extension which lets you do this slightly better in which case maybe something like this: select distinct on (phone_num) * from table order by phone_num, location_when desc.
Thank you, Stephan! Both work great! :-) I was not familiar with the Postgres extension method you showed me. I was also not aware you could use a "where...in" clause with multiple data elements! Thanks - I'll go (re) read up on those things... Thanks again! Bret > -----Original Message----- > From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com] > Sent: Monday, February 07, 2005 9:15 AM > To: Schuhmacher, Bret > Cc: pgsql-novice@postgresql.org > Subject: Re: [NOVICE] Stuck in "group by" aggregate hell > > > On Mon, 7 Feb 2005, Schuhmacher, Bret wrote: > > > I've got a table with the following schema: > > Phone_num latlon location_when > > > > Each row holds a user's phone number, their location, and the time > > they were at that location. There can be up to 120 rows per > > phone_num, each with a (potentially) different latlon, and > each with > > a different location_when (the primary key). > > > > My problem - how do you get a list of each phone_num's most recent > > position and time? I want to weed out everything but a user's most > > recent location, returning only one line per user. > > > > Here's an example: > > > > Phone_num latlon location_when > > 1111111111 22.12345,-90.12345 0901 > > 1111111111 22.11111,-89.45678 0911 > > 1111111111 21.99999,-89.55555 0921 > > 2222222222 18.12334,-120.12345 1156 > > 2222222222 18.10101,-120.11111 1206 > > 2222222222 18.00001,-120.34889 1216 > > > > > > Given this, I want a list like this: > > 1111111111 21.99999,-89.55555 0921 > > 2222222222 18.00001,-120.34889 1216 > > > > > > > > Obviously, it's something along these lines: > > Select *,min(age(now(),location_when)) From table Group by > phone_num; > > > > Unfortunately, Postgres wants me to group by latlon and > location_when, > > either of which makes each row a unique entity and causes > me problems. > > > > I'd prefer to not use temp tables, but at this point I'll take any > > pointers I can get. Intersect? Some form of outer join > with the same > > table? > > I believe the SQL way is to correlate the outside with a > subquery so if just using the maximum location_when were > sufficient (and there aren't > nulls) I think you could do something like: > > select * from table where (phone_num, location_when) in > (select phone_num, max(location_when) from table group by phone_num); > > In PostgreSQL, there's an extension which lets you do this > slightly better in which case maybe something like this: > > select distinct on (phone_num) * from table order by > phone_num, location_when desc. >