[SQL] Problems with PARTITION BY with count() in window-func - Mailing list pgsql-sql

From Andreas Joseph Krogh
Subject [SQL] Problems with PARTITION BY with count() in window-func
Date
Msg-id VisenaEmail.2f.7083bea2d9429cac.15f9b162e38@tc7-visena
Whole thread Raw
Responses Re: [SQL] Problems with PARTITION BY with count() in window-func
List pgsql-sql
Hi all.
 
I'm trying to count() all log-entries per activity per month in a separate column using count() over(partition by ...) but get an error I don't understand.
 
Here's the schema:
 
CREATE TABLE log_entry(   entity_id SERIAL PRIMARY KEY,   start_date DATE NOT NULL,   activity_id BIGINT NOT NULL
);

INSERT INTO log_entry(start_date, activity_id)
VALUES ('2017-01-01', 1)
, ('2017-01-02', 1)
, ('2017-01-03', 2)
, ('2017-01-04', 2)
, ('2017-02-01', 1)
, ('2017-02-01', 2)
, ('2017-02-01', 1)
, ('2017-02-02', 1)
, ('2017-02-02', 1)
, ('2017-02-03', 1);



When issuing this query:


select
      date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) AS month
    , log.activity_id   , count(log.entity_id) AS num_logs   , count(log.entity_id) OVER total_for_month_window AS total_for_month
FROM
    log_entry log
WHERE 1 = 1
      AND date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) BETWEEN '2017-01-01'::DATE AND '2017-05-01'::DATE
GROUP BY month, log.activity_id, log.entity_id
WINDOW total_for_month_window AS (PARTITION BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE))

ORDER BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) ASC
;
 
I get this:
 
 
monthactivity_idnum_logstotal_for_month
2017-01-01 00:00:00.000000114
2017-01-01 00:00:00.000000114
2017-01-01 00:00:00.000000214
2017-01-01 00:00:00.000000214
2017-02-01 00:00:00.000000116
2017-02-01 00:00:00.000000216
2017-02-01 00:00:00.000000116
2017-02-01 00:00:00.000000116
2017-02-01 00:00:00.000000116
2017-02-01 00:00:00.000000116
 
 
 
But what I really want is this result:
 
monthactivity_idnum_logstotal_for_month
2017-01-01 00:00:00.000000124
2017-01-01 00:00:00.000000224
2017-02-01 00:00:00.000000156
2017-02-01 00:00:00.000000216
 
 
that is, not having to group by log.entity_id, trying like this:
 

select
      date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) AS month
    , log.activity_id   , count(log.entity_id) AS num_logs   , count(log.entity_id) OVER total_for_month_window AS total_for_month
FROM
    log_entry log
WHERE 1 = 1
      AND date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) BETWEEN '2017-01-01'::DATE AND '2017-05-01'::DATE
GROUP BY month, log.activity_id
WINDOW total_for_month_window AS (PARTITION BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE))

ORDER BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) ASC
;
 
Note the missing log.entity_id in GROUP BY. 
 
but this gives the error:
 
[42803] column "log.entity_id" must appear in the GROUP BY clause or be used in an aggregate function
Details
 
 
Any idea how to solve this?
 
Thanks.
 
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963

pgsql-sql by date:

Previous
From: Achilleas Mantzios
Date:
Subject: Re: [SQL] join tables by nearest timestamp
Next
From: Andreas Joseph Krogh
Date:
Subject: Re: [SQL] Problems with PARTITION BY with count() in window-func