Re: cumulative count - Mailing list pgsql-general

From Harald Fuchs
Subject Re: cumulative count
Date
Msg-id pur64n4wpw.fsf@srv.protecting.net
Whole thread Raw
In response to cumulative count  (Carson Farmer <carson.farmer@gmail.com>)
Responses Re: cumulative count
List pgsql-general
In article <49381902.7080209@gmail.com>,
Carson Farmer <carson.farmer@gmail.com> writes:

>      date     |       user
> ------------------+---------------------
> 20050201   |       Bill
> 20050210   |       Steve
> 20050224   |       Sally
> 20050311   |       Martha
> 20050316   |       Ryan
> 20050322   |       Phil
> 20050330   |       William
> 20050415   |       Mary
> 20050428   |       Susan
> 20050503   |       Jim

> and I want to run a query that returns a *count* of the number of
> users *each month*, ordered by year and *month*, with an additional
> column that is a *running total of the count*, as in:

>     year        |    month  |    count    |   run_count
> -------------------+----------------+----------------+-----------------
>     2005       |      02       |      3          |         3
> 2005       |      03       |      4          |         7
>     2005       |      04       |      2          |         9
>     2005       |      05       |      1          |        10

> I can get almost everything I want with:

> SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total
> FROM (SELECT EXTRACT(year from added_date) AS year,
> EXTRACT(month FROM added_date) AS month,
> TO_CHAR(added_date, 'Month') AS month_name,
> COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a,
> (SELECT EXTRACT(year FROM added_date) AS year,
> EXTRACT(month FROM added_date) AS month,
> TO_CHAR(added_date, 'Month') AS month_name,
> COUNT(*) AS count
> FROM users_table GROUP BY 1, 2, 3) AS b
> WHERE a.year >= b.year AND a.month >= b.month
> GROUP BY 1, 2, 3, 4
> ORDER BY a.year, a.month asc;

> but I can't quite figure out the running total of the count. The above
> example works right up to the end of the first year, then the values
> no longer make sense. My guess is it's something to do with my WHERE
> clause, but I can't think of a better way to do things.

Yes, your WHERE condition is the problem.  It should be
WHERE a.year > b.year OR a.year = b.year AND a.month > b.month.

You could simplify the date logic by doing the year/month split later, e.g.

  CREATE TEMP TABLE tmp AS
  SELECT date_trunc('month', date) AS dt, count(*) AS count
  FROM users_table
  GROUP BY dt;

  SELECT extract(YEAR FROM t1.dt) AS year,
         extract(MONTH FROM t1.dt) AS month,
         t1.count,
         sum(t2.count) AS run_count
  FROM tmp t1
  LEFT JOIN tmp t2 ON t2.dt <= t1.dt
  GROUP BY year, month, t1.count
  ORDER BY year, month;

(AFAIK the WITH clause in PostgreSQL 8.4 would let you get rid of
the temp table.)

pgsql-general by date:

Previous
From: Carson Farmer
Date:
Subject: cumulative count
Next
From: Gerhard Heift
Date:
Subject: Re: cumulative count