Thread: select progressive total

select progressive total

From
brian
Date:
 From the "I thought this would be trivially easy" dept:

I have a table holding member data for an organisation

CREAT table member (
    id    SERIAL PRIMARY KEY,
    applied    date,
...

and i'd like to plot the growth in memberships. I'd prefer to hand off
the data already totaled, if possible.

'2006-02-01', 452
'2006-02-17', 453
...

It semed to me that a self join was the way to go, but that route
produced some flaky results. Here's my latest attempt:

SELECT m1.applied AS date_applied, count(m2.id) AS num_applications
    FROM member m1, member m2 WHERE m1.applied <= m2.applied
    GROUP BY m1.applied ORDER BY date_applied ASC;

The thing is, some applications fell on the same day, so i'm not even
sure that this approach would work. Can anyone see where i'm going wrong
here? And how can i handle applications that occurred on the same day?
Can this even be done in a SELECT, or should i create a function?

b

Re: select progressive total

From
"Merlin Moncure"
Date:
On 3/21/07, brian <brian@zijn-digital.com> wrote:
>  From the "I thought this would be trivially easy" dept:
>
> I have a table holding member data for an organisation
>
> CREAT table member (
>         id      SERIAL PRIMARY KEY,
>         applied date,
> ...
>
> and i'd like to plot the growth in memberships. I'd prefer to hand off
> the data already totaled, if possible.
>
> '2006-02-01', 452
> '2006-02-17', 453
> ...
>
> It semed to me that a self join was the way to go, but that route
> produced some flaky results. Here's my latest attempt:
>
> SELECT m1.applied AS date_applied, count(m2.id) AS num_applications
>         FROM member m1, member m2 WHERE m1.applied <= m2.applied
>         GROUP BY m1.applied ORDER BY date_applied ASC;
>
> The thing is, some applications fell on the same day, so i'm not even
> sure that this approach would work. Can anyone see where i'm going wrong
> here? And how can i handle applications that occurred on the same day?
> Can this even be done in a SELECT, or should i create a function?

you can do this with a subquery:

select
  applied_date,
  count(*),
  (select count(*) from member m2 where m2.applied_date <
m1.applied_date) as total
  from member m1
  group by 1;

performance is going to suck for large tables because the subquery
will run once for every row returned by the group.

merlin

Re: select progressive total

From
brian
Date:
Merlin Moncure wrote:
> On 3/21/07, brian <brian@zijn-digital.com> wrote:
>
>>  From the "I thought this would be trivially easy" dept:
>>
>> I have a table holding member data for an organisation
>>
>> CREAT table member (
>>         id      SERIAL PRIMARY KEY,
>>         applied date,
>> ...
>>
>> and i'd like to plot the growth in memberships. I'd prefer to hand off
>> the data already totaled, if possible.
>>
>> '2006-02-01', 452
>> '2006-02-17', 453
>> ...
>>
>> It semed to me that a self join was the way to go, but that route
>> produced some flaky results. Here's my latest attempt:
>>
>> SELECT m1.applied AS date_applied, count(m2.id) AS num_applications
>>         FROM member m1, member m2 WHERE m1.applied <= m2.applied
>>         GROUP BY m1.applied ORDER BY date_applied ASC;
>>
>> The thing is, some applications fell on the same day, so i'm not even
>> sure that this approach would work. Can anyone see where i'm going wrong
>> here? And how can i handle applications that occurred on the same day?
>> Can this even be done in a SELECT, or should i create a function?
>
>
> you can do this with a subquery:
>
> select
>  applied_date,
>  count(*),
>  (select count(*) from member m2 where m2.applied_date <
> m1.applied_date) as total
>  from member m1
>  group by 1;
>

Almost. The total column is less than the correct amount by whatever
count returns.

> performance is going to suck for large tables because the subquery
> will run once for every row returned by the group.
>

Yeah, i realise. In the end, i think i'll do the totalling outside of
Postgres. When plotting the data, i'll be sending it as an assoc. array,
like: '2006-05-27'=>3049 so i need to do some post-processing in any case.

b