Thread: cumulative count
Hi list, This is my first post to pgsql, so hopefully I'm not asking something that has been answered a thousand time before. I've looked online, and through the archives, but I haven't found anything that answers my question specifically: Say I have a table like this: date | user ------------------+--------------------- 20050201 | Bill 20050210 | Steve 20050224 | Sally 20050311 | Martha 20050316 | Ryan 20050322 | Phil 20050330 | William 20050415 | Mary 20050428 | Susan 20050503 | Jim and I want to run a query that returns a *count* of the number of users *each month*, ordered by year and *month*, with an additional column that is a *running total of the count*, as in: year | month | count | run_count -------------------+----------------+----------------+----------------- 2005 | 02 | 3 | 3 2005 | 03 | 4 | 7 2005 | 04 | 2 | 9 2005 | 05 | 1 | 10 I can get almost everything I want with: SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total FROM (SELECT EXTRACT(year from added_date) AS year, EXTRACT(month FROM added_date) AS month, TO_CHAR(added_date, 'Month') AS month_name, COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a, (SELECT EXTRACT(year FROM added_date) AS year, EXTRACT(month FROM added_date) AS month, TO_CHAR(added_date, 'Month') AS month_name, COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS b WHERE a.year >= b.year AND a.month >= b.month GROUP BY 1, 2, 3, 4 ORDER BY a.year, a.month asc; but I can't quite figure out the running total of the count. The above example works right up to the end of the first year, then the values no longer make sense. My guess is it's something to do with my WHERE clause, but I can't think of a better way to do things. Any ideas? Cheers, Carson
In article <49381902.7080209@gmail.com>, Carson Farmer <carson.farmer@gmail.com> writes: > date | user > ------------------+--------------------- > 20050201 | Bill > 20050210 | Steve > 20050224 | Sally > 20050311 | Martha > 20050316 | Ryan > 20050322 | Phil > 20050330 | William > 20050415 | Mary > 20050428 | Susan > 20050503 | Jim > and I want to run a query that returns a *count* of the number of > users *each month*, ordered by year and *month*, with an additional > column that is a *running total of the count*, as in: > year | month | count | run_count > -------------------+----------------+----------------+----------------- > 2005 | 02 | 3 | 3 > 2005 | 03 | 4 | 7 > 2005 | 04 | 2 | 9 > 2005 | 05 | 1 | 10 > I can get almost everything I want with: > SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total > FROM (SELECT EXTRACT(year from added_date) AS year, > EXTRACT(month FROM added_date) AS month, > TO_CHAR(added_date, 'Month') AS month_name, > COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a, > (SELECT EXTRACT(year FROM added_date) AS year, > EXTRACT(month FROM added_date) AS month, > TO_CHAR(added_date, 'Month') AS month_name, > COUNT(*) AS count > FROM users_table GROUP BY 1, 2, 3) AS b > WHERE a.year >= b.year AND a.month >= b.month > GROUP BY 1, 2, 3, 4 > ORDER BY a.year, a.month asc; > but I can't quite figure out the running total of the count. The above > example works right up to the end of the first year, then the values > no longer make sense. My guess is it's something to do with my WHERE > clause, but I can't think of a better way to do things. Yes, your WHERE condition is the problem. It should be WHERE a.year > b.year OR a.year = b.year AND a.month > b.month. You could simplify the date logic by doing the year/month split later, e.g. CREATE TEMP TABLE tmp AS SELECT date_trunc('month', date) AS dt, count(*) AS count FROM users_table GROUP BY dt; SELECT extract(YEAR FROM t1.dt) AS year, extract(MONTH FROM t1.dt) AS month, t1.count, sum(t2.count) AS run_count FROM tmp t1 LEFT JOIN tmp t2 ON t2.dt <= t1.dt GROUP BY year, month, t1.count ORDER BY year, month; (AFAIK the WITH clause in PostgreSQL 8.4 would let you get rid of the temp table.)
On Thu, Dec 04, 2008 at 07:32:59PM +0100, Harald Fuchs wrote: > In article <49381902.7080209@gmail.com>, > Carson Farmer <carson.farmer@gmail.com> writes: > > > date | user > > ------------------+--------------------- > > 20050201 | Bill > > 20050210 | Steve > > 20050224 | Sally > > 20050311 | Martha > > 20050316 | Ryan > > 20050322 | Phil > > 20050330 | William > > 20050415 | Mary > > 20050428 | Susan > > 20050503 | Jim > > > and I want to run a query that returns a *count* of the number of > > users *each month*, ordered by year and *month*, with an additional > > column that is a *running total of the count*, as in: > > > year | month | count | run_count > > -------------------+----------------+----------------+----------------- > > 2005 | 02 | 3 | 3 > > 2005 | 03 | 4 | 7 > > 2005 | 04 | 2 | 9 > > 2005 | 05 | 1 | 10 > > > I can get almost everything I want with: > > > SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total > > FROM (SELECT EXTRACT(year from added_date) AS year, > > EXTRACT(month FROM added_date) AS month, > > TO_CHAR(added_date, 'Month') AS month_name, > > COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a, > > (SELECT EXTRACT(year FROM added_date) AS year, > > EXTRACT(month FROM added_date) AS month, > > TO_CHAR(added_date, 'Month') AS month_name, > > COUNT(*) AS count > > FROM users_table GROUP BY 1, 2, 3) AS b > > WHERE a.year >= b.year AND a.month >= b.month > > GROUP BY 1, 2, 3, 4 > > ORDER BY a.year, a.month asc; > > > but I can't quite figure out the running total of the count. The above > > example works right up to the end of the first year, then the values > > no longer make sense. My guess is it's something to do with my WHERE > > clause, but I can't think of a better way to do things. > > Yes, your WHERE condition is the problem. It should be > WHERE a.year > b.year OR a.year = b.year AND a.month > b.month. > > You could simplify the date logic by doing the year/month split later, e.g. > > CREATE TEMP TABLE tmp AS > > SELECT extract(YEAR FROM t1.dt) AS year, > extract(MONTH FROM t1.dt) AS month, > t1.count, > sum(t2.count) AS run_count > FROM tmp t1 > LEFT JOIN tmp t2 ON t2.dt <= t1.dt > GROUP BY year, month, t1.count > ORDER BY year, month; What about: SELECT extract(YEAR FROM t1.dt) AS year, extract(MONTH FROM t1.dt) AS month, t1.count, sum(t2.count) AS run_count FROM ( SELECT date_trunc('month', date) AS dt, count(*) AS count FROM users_table GROUP BY dt ) AS t1 LEFT JOIN tmp t2 ON t2.dt <= t1.dt GROUP BY year, month, t1.count ORDER BY year, month; Regards, Gerhard
Attachment
2008/12/4 Carson Farmer <carson.farmer@gmail.com>: > Hi list, > > This is my first post to pgsql, so hopefully I'm not asking something that > has been answered a thousand time before. I've looked online, and through > the archives, but I haven't found anything that answers my question > specifically: > > Say I have a table like this: > > date | user > ------------------+--------------------- > 20050201 | Bill > 20050210 | Steve > 20050224 | Sally > 20050311 | Martha > 20050316 | Ryan > 20050322 | Phil > 20050330 | William > 20050415 | Mary > 20050428 | Susan > 20050503 | Jim > > and I want to run a query that returns a *count* of the number of users > *each month*, ordered by year and *month*, with an additional column that is > a *running total of the count*, as in: > > year | month | count | run_count > -------------------+----------------+----------------+----------------- > 2005 | 02 | 3 | 3 2005 > | 03 | 4 | 7 > 2005 | 04 | 2 | 9 > 2005 | 05 | 1 | 10 > > I can get almost everything I want with: > > SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total > FROM (SELECT EXTRACT(year from added_date) AS year, > EXTRACT(month FROM added_date) AS month, > TO_CHAR(added_date, 'Month') AS month_name, > COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a, > (SELECT EXTRACT(year FROM added_date) AS year, > EXTRACT(month FROM added_date) AS month, > TO_CHAR(added_date, 'Month') AS month_name, > COUNT(*) AS count > FROM users_table GROUP BY 1, 2, 3) AS b > WHERE a.year >= b.year AND a.month >= b.month > GROUP BY 1, 2, 3, 4 > ORDER BY a.year, a.month asc; > > but I can't quite figure out the running total of the count. The above > example works right up to the end of the first year, then the values no > longer make sense. My guess is it's something to do with my WHERE clause, > but I can't think of a better way to do things. > > Any ideas? > hate selfjoins. It is really slow for any bigger datasets. Write SRF function (stored function that returns table). Regards Pavel Stehule > Cheers, > > Carson > > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
On Thu, Dec 04, 2008 at 05:53:06PM +0000, Carson Farmer wrote: > Hi list, > > This is my first post to pgsql, so hopefully I'm not asking something > that has been answered a thousand time before. I've looked online, and > through the archives, but I haven't found anything that answers my > question specifically: > > Say I have a table like this: > > date | user > ------------------+--------------------- > 20050201 | Bill > 20050210 | Steve > 20050224 | Sally > 20050311 | Martha > 20050316 | Ryan > 20050322 | Phil > 20050330 | William > 20050415 | Mary > 20050428 | Susan > 20050503 | Jim > > and I want to run a query that returns a *count* of the number of users > *each month*, ordered by year and *month*, with an additional column > that is a *running total of the count*, as in: In 8.4, you'll have direct SQL support for this using OLAP a.k.a. windowing functions, so don't build too many of these dodgy hacks into your application. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Hi, I have a question concerning psql. I found that psql has a defined command '-t' and that it turns off printing of column names and result row count footers, etc. what I look for, is a command, which would turn off result row count footer, but would print column names. is there an easy way to do this? regards, Matt ---------------------------------------------------- Wirus Filipiński znowu atakuje http://klik.wp.pl/?adr=http%3A%2F%2Fprorocznia.pl%2Ff.html%3Fi%3D37160O-937882398O0&sid=575
> I have a question concerning psql. I found that psql has a defined > command '-t' and that it turns off printing of column names > and result > row count footers, etc. > > what I look for, is a command, which would turn off result row count > footer, but would print column names. > > is there an easy way to do this? Start psql with the -P footer switch. 8.3.3 works this way. Allan The material contained in this email may be confidential, privileged or copyrighted. If you are not the intended recipient,use, disclosure or copying of this information is prohibited. If you have received this document in error, pleaseadvise the sender and delete the document. Neither OneSteel nor the sender accept responsibility for any viruses containedin this email or any attachments.
>> I have a question concerning psql. I found that psql has a defined >> command '-t' and that it turns off printing of column names >> and result >> row count footers, etc. >> >> what I look for, is a command, which would turn off result row count >> footer, but would print column names. >> >> is there an easy way to do this? >Start psql with the -P footer switch. 8.3.3 works this way. > >Allan Works:) Thx a lot! Regards Matt