Hi all.
I'm trying to count() all log-entries per activity per month in a separate column using count() over(partition by ...) but get an error I don't understand.
Here's the schema:
CREATE TABLE log_entry( entity_id SERIAL PRIMARY KEY, start_date DATE NOT NULL, activity_id BIGINT NOT NULL
);
INSERT INTO log_entry(start_date, activity_id)
VALUES ('2017-01-01', 1)
, ('2017-01-02', 1)
, ('2017-01-03', 2)
, ('2017-01-04', 2)
, ('2017-02-01', 1)
, ('2017-02-01', 2)
, ('2017-02-01', 1)
, ('2017-02-02', 1)
, ('2017-02-02', 1)
, ('2017-02-03', 1);
When issuing this query:
select
date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) AS month
, log.activity_id , count(log.entity_id) AS num_logs , count(log.entity_id) OVER total_for_month_window AS total_for_month
FROM
log_entry log
WHERE 1 = 1
AND date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) BETWEEN '2017-01-01'::DATE AND '2017-05-01'::DATE
GROUP BY month, log.activity_id, log.entity_id
WINDOW total_for_month_window AS (PARTITION BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE))
ORDER BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) ASC
;
I get this:
month | activity_id | num_logs | total_for_month |
---|
2017-01-01 00:00:00.000000 | 1 | 1 | 4 |
2017-01-01 00:00:00.000000 | 1 | 1 | 4 |
2017-01-01 00:00:00.000000 | 2 | 1 | 4 |
2017-01-01 00:00:00.000000 | 2 | 1 | 4 |
2017-02-01 00:00:00.000000 | 1 | 1 | 6 |
2017-02-01 00:00:00.000000 | 2 | 1 | 6 |
2017-02-01 00:00:00.000000 | 1 | 1 | 6 |
2017-02-01 00:00:00.000000 | 1 | 1 | 6 |
2017-02-01 00:00:00.000000 | 1 | 1 | 6 |
2017-02-01 00:00:00.000000 | 1 | 1 | 6 |
But what I really want is this result:
month | activity_id | num_logs | total_for_month |
2017-01-01 00:00:00.000000 | 1 | 2 | 4 |
2017-01-01 00:00:00.000000 | 2 | 2 | 4 |
2017-02-01 00:00:00.000000 | 1 | 5 | 6 |
2017-02-01 00:00:00.000000 | 2 | 1 | 6 |
that is, not having to group by log.entity_id, trying like this:
select
date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) AS month
, log.activity_id , count(log.entity_id) AS num_logs , count(log.entity_id) OVER total_for_month_window AS total_for_month
FROM
log_entry log
WHERE 1 = 1
AND date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) BETWEEN '2017-01-01'::DATE AND '2017-05-01'::DATE
GROUP BY month, log.activity_id
WINDOW total_for_month_window AS (PARTITION BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE))
ORDER BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) ASC
;
Note the missing log.entity_id in GROUP BY.
but this gives the error:
[42803] column "log.entity_id" must appear in the GROUP BY clause or be used in an aggregate function
Details
Any idea how to solve this?
Thanks.
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963