Thread: Stuck in "group by" aggregate hell

Stuck in "group by" aggregate hell

From

"Schuhmacher, Bret"

Date:

07 February 2005, 11:34:16

I've got a table with the following schema:
Phone_num        latlon        location_when

Each row holds a user's phone number, their location, and the time they
were at that location.  There can be up to 120 rows per phone_num, each
with a  (potentially) different latlon, and each with a different
location_when (the primary key).

My problem - how do you get a list of each phone_num's most recent
position and time?  I want to weed out everything but a user's most
recent location, returning only one line per user.

Here's an example:

Phone_num            latlon            location_when
1111111111            22.12345,-90.12345    0901
1111111111            22.11111,-89.45678    0911
1111111111            21.99999,-89.55555    0921
2222222222            18.12334,-120.12345    1156
2222222222            18.10101,-120.11111    1206
2222222222            18.00001,-120.34889    1216


Given this, I want a list like this:
1111111111            21.99999,-89.55555    0921
2222222222            18.00001,-120.34889    1216



Obviously, it's something along these lines:
Select *,min(age(now(),location_when))
From table
Group by phone_num;

Unfortunately, Postgres wants me to group by latlon and location_when,
either of which makes each row a unique entity and causes me problems.

I'd prefer to not use temp tables, but at this point I'll take any
pointers I can get.  Intersect?  Some form of outer join with the same
table?

Thanks in advance!

Bret

Re: Stuck in "group by" aggregate hell

From

Stephan Szabo

Date:

07 February 2005, 14:14:46

On Mon, 7 Feb 2005, Schuhmacher, Bret wrote:

> I've got a table with the following schema:
> Phone_num        latlon        location_when
>
> Each row holds a user's phone number, their location, and the time they
> were at that location.  There can be up to 120 rows per phone_num, each
> with a  (potentially) different latlon, and each with a different
> location_when (the primary key).
>
> My problem - how do you get a list of each phone_num's most recent
> position and time?  I want to weed out everything but a user's most
> recent location, returning only one line per user.
>
> Here's an example:
>
> Phone_num            latlon            location_when
> 1111111111            22.12345,-90.12345    0901
> 1111111111            22.11111,-89.45678    0911
> 1111111111            21.99999,-89.55555    0921
> 2222222222            18.12334,-120.12345    1156
> 2222222222            18.10101,-120.11111    1206
> 2222222222            18.00001,-120.34889    1216
>
>
> Given this, I want a list like this:
> 1111111111            21.99999,-89.55555    0921
> 2222222222            18.00001,-120.34889    1216
>
>
>
> Obviously, it's something along these lines:
> Select *,min(age(now(),location_when))
> From table
> Group by phone_num;
>
> Unfortunately, Postgres wants me to group by latlon and location_when,
> either of which makes each row a unique entity and causes me problems.
>
> I'd prefer to not use temp tables, but at this point I'll take any
> pointers I can get.  Intersect?  Some form of outer join with the same
> table?

I believe the SQL way is to correlate the outside with a subquery so
if just using the maximum location_when were sufficient (and there aren't
nulls) I think you could do something like:

select * from table where (phone_num, location_when) in
 (select phone_num, max(location_when) from table group by phone_num);

In PostgreSQL, there's an extension which lets you do this slightly better
in which case maybe something like this:

select distinct on (phone_num) * from table order by phone_num,
location_when desc.

Re: Stuck in "group by" aggregate hell

From

"Schuhmacher, Bret"

Date:

07 February 2005, 14:37:19

Thank you, Stephan!  Both work great! :-)  I was not familiar with the
Postgres extension method you showed me.  I was also not aware you could
use a "where...in" clause with multiple data elements!  Thanks - I'll go
(re) read up on those things...

Thanks again!

Bret

> -----Original Message-----
> From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com]
> Sent: Monday, February 07, 2005 9:15 AM
> To: Schuhmacher, Bret
> Cc: pgsql-novice@postgresql.org
> Subject: Re: [NOVICE] Stuck in "group by" aggregate hell
>
>
> On Mon, 7 Feb 2005, Schuhmacher, Bret wrote:
>
> > I've got a table with the following schema:
> > Phone_num        latlon        location_when
> >
> > Each row holds a user's phone number, their location, and the time
> > they were at that location.  There can be up to 120 rows per
> > phone_num, each with a  (potentially) different latlon, and
> each with
> > a different location_when (the primary key).
> >
> > My problem - how do you get a list of each phone_num's most recent
> > position and time?  I want to weed out everything but a user's most
> > recent location, returning only one line per user.
> >
> > Here's an example:
> >
> > Phone_num            latlon            location_when
> > 1111111111            22.12345,-90.12345    0901
> > 1111111111            22.11111,-89.45678    0911
> > 1111111111            21.99999,-89.55555    0921
> > 2222222222            18.12334,-120.12345    1156
> > 2222222222            18.10101,-120.11111    1206
> > 2222222222            18.00001,-120.34889    1216
> >
> >
> > Given this, I want a list like this:
> > 1111111111            21.99999,-89.55555    0921
> > 2222222222            18.00001,-120.34889    1216
> >
> >
> >
> > Obviously, it's something along these lines:
> > Select *,min(age(now(),location_when)) From table Group by
> phone_num;
> >
> > Unfortunately, Postgres wants me to group by latlon and
> location_when,
> > either of which makes each row a unique entity and causes
> me problems.
> >
> > I'd prefer to not use temp tables, but at this point I'll take any
> > pointers I can get.  Intersect?  Some form of outer join
> with the same
> > table?
>
> I believe the SQL way is to correlate the outside with a
> subquery so if just using the maximum location_when were
> sufficient (and there aren't
> nulls) I think you could do something like:
>
> select * from table where (phone_num, location_when) in
> (select phone_num, max(location_when) from table group by phone_num);
>
> In PostgreSQL, there's an extension which lets you do this
> slightly better in which case maybe something like this:
>
> select distinct on (phone_num) * from table order by
> phone_num, location_when desc.
>