Re: [SQL] Problems with PARTITION BY with count() in window-func - Mailing list pgsql-sql

From Andreas Joseph Krogh
Subject Re: [SQL] Problems with PARTITION BY with count() in window-func
Date
Msg-id VisenaEmail.47.ec56caa56a402a62.15f9bdc4735@tc7-visena
Whole thread Raw
In response to [SQL] Problems with PARTITION BY with count() in window-func  (Andreas Joseph Krogh <andreas@visena.com>)
Responses Re: [SQL] Problems with PARTITION BY with count() in window-func
List pgsql-sql
På onsdag 08. november 2017 kl. 11:17:39, skrev Andreas Joseph Krogh <andreas@visena.com>:
Hi all.
 
I'm trying to count() all log-entries per activity per month in a separate column using count() over(partition by ...) but get an error I don't understand.
[snip]
select
      date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) AS month
    , log.activity_id   , count(log.entity_id) AS num_logs   , count(log.entity_id) OVER total_for_month_window AS total_for_month
FROM
    log_entry log
WHERE 1 = 1
      AND date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) BETWEEN '2017-01-01'::DATE AND '2017-05-01'::DATE
GROUP BY month, log.activity_id
WINDOW total_for_month_window AS (PARTITION BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE))

ORDER BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) ASC
;
 
Note the missing log.entity_id in GROUP BY. 
 
but this gives the error:
 
[42803] column "log.entity_id" must appear in the GROUP BY clause or be used in an aggregate function
Details
 
This query gives the desired results:
 
SELECT DISTINCT
      date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) AS month
    , log.activity_id   , count(log.entity_id) OVER(partition by date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE), log.activity_id) AS num_logs_per_activity   , count(log.entity_id) OVER (partition by date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE)) AS total_for_month
FROM
    log_entry log
ORDER BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) ASC, log.activity_id ASC
;
monthactivity_idnum_logs_per_activitytotal_for_month
2017-01-01 00:00:00.000000124
2017-01-01 00:00:00.000000224
2017-02-01 00:00:00.000000156
2017-02-01 00:00:00.000000216
 
 
But I'd like a solution without the DISTINCT, if one exists?
 
If I introduce a new column, logged_for, and want to list number of logged entries per person per activity I can use this:
 
SELECT DISTINCT
    date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) AS month
    , log.logged_for   , log.activity_id   , count(log.entity_id) OVER(partition by date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE), log.logged_for, log.activity_id) AS num_logs_per_person_for_activity   , count(log.entity_id) OVER (partition by date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE), log.logged_for) AS total_for_person_for_month
FROM
    log_entry log
ORDER BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) ASC
;
Which gives:
monthlogged_foractivity_idnum_logs_per_person_for_activitytotal_for_person_for_month
2017-01-01 00:00:00.0000005124
2017-01-01 00:00:00.0000005224
2017-01-01 00:00:00.0000006124
2017-01-01 00:00:00.0000006224
2017-02-01 00:00:00.0000005156
2017-02-01 00:00:00.0000005216
2017-02-01 00:00:00.0000006156
2017-02-01 00:00:00.0000006216
 
 
Is this the recommended way to do this, I mean - having "group by" in the "partition by" belonging to the OVER()-clause of the count-aggregates?
 
Thanks.
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 

pgsql-sql by date:

Previous
From: Andreas Joseph Krogh
Date:
Subject: [SQL] Problems with PARTITION BY with count() in window-func
Next
From: "MS (direkt)"
Date:
Subject: Re: [SQL] Problems with PARTITION BY with count() in window-func