Thread: [QUESTION] Window function with partition by and order by

[QUESTION] Window function with partition by and order by

From
Ankit Kumar Pandey
Date:
Hello,

While looking at aggregates in window function, I found something 
unusual and would be glad I could get some clarification.

Consider following table (mytable):

id, name

1, A

1, A

2, B

3, A

1, A


select *, avg(id) over (partition by name, order by id) from mytable;

Output:

id, name, avg

1, A, 1

1, A, 1

1, A, 1

3, A, 1.5

2, B, 2


Question is: Average of id for partition name (A) should be 6/4 = 1.5 
for all rows in that partition but this result is seen only at the last 
one row in partition (A). Am I missing here something?


Thanks


-- 
Regards,
Ankit Kumar Pandey




Re: [QUESTION] Window function with partition by and order by

From
William Alves Da Silva
Date:
Hello Ankit.

This behavior is correct. This is because you are using ORDER BY in your aggregation function.

Looking at the documentation you will find the following quote:
"You can also control the order in which rows are processed by window functions by using ORDER BY within OVER. (The window ORDER BY doesn't even have to match the order in which the rows are produced)."

So, if you use ORDER BY you are controlling how the row are processed.

If you don't use, the result is like this:

postgres=# select *, avg(id) over (partition by name) from my_teste ;
 id | name | avg
----+------+--------------------
 1 | A | 1.5000000000000000
 1 | A | 1.5000000000000000
 3 | A | 1.5000000000000000
 1 | A | 1.5000000000000000
 2 | B | 2.0000000000000000
(5 rows) 


--

Regards,

William Alves

On 27 Nov 2022 13:08 -0300, Ankit Kumar Pandey <itsankitkp@gmail.com>, wrote:
Hello,

While looking at aggregates in window function, I found something
unusual and would be glad I could get some clarification.

Consider following table (mytable):

id, name

1, A

1, A

2, B

3, A

1, A


select *, avg(id) over (partition by name, order by id) from mytable;

Output:

id, name, avg

1, A, 1

1, A, 1

1, A, 1

3, A, 1.5

2, B, 2


Question is: Average of id for partition name (A) should be 6/4 = 1.5
for all rows in that partition but this result is seen only at the last
one row in partition (A). Am I missing here something?


Thanks


--
Regards,
Ankit Kumar Pandey



Re: [QUESTION] Window function with partition by and order by

From
Samed YILDIRIM
Date:
Hello Ankit,

It is absolutely expected behaviour of a window function with ORDER BY clause. The default frame clause of window definition is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. If you add an ORDER BY clause in a window definition, PostgreSQL takes the current row and all rows before it within the partition into calculation. If you don't add, it means all rows within the partition are peers, and PostgreSQL uses all rows for calculation. I'm putting the related part from the documentation and its link below.

The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW; it sets the frame to be all rows from the partition start up through the current row's last peer (a row that the window's ORDER BY clause considers equivalent to the current row; all rows are peers if there is no ORDER BY).
 

Best regards.
Samed YILDIRIM


On Sun, 27 Nov 2022 at 18:08, Ankit Kumar Pandey <itsankitkp@gmail.com> wrote:
Hello,

While looking at aggregates in window function, I found something
unusual and would be glad I could get some clarification.

Consider following table (mytable):

id, name

1, A

1, A

2, B

3, A

1, A


select *, avg(id) over (partition by name, order by id) from mytable;

Output:

id, name, avg

1, A, 1

1, A, 1

1, A, 1

3, A, 1.5

2, B, 2


Question is: Average of id for partition name (A) should be 6/4 = 1.5
for all rows in that partition but this result is seen only at the last
one row in partition (A). Am I missing here something?


Thanks


--
Regards,
Ankit Kumar Pandey



Re: [QUESTION] Window function with partition by and order by

From
Tom Lane
Date:
Ankit Kumar Pandey <itsankitkp@gmail.com> writes:
> Question is: Average of id for partition name (A) should be 6/4 = 1.5 
> for all rows in that partition but this result is seen only at the last 
> one row in partition (A). Am I missing here something?

https://www.postgresql.org/docs/current/tutorial-window.html

particularly:

    By default, if ORDER BY is supplied then the frame consists of all
    rows from the start of the partition up through the current row, plus
    any following rows that are equal to the current row according to the
    ORDER BY clause.

            regards, tom lane



Re: [QUESTION] Window function with partition by and order by

From
Ankit Kumar Pandey
Date:
On 27/11/22 21:58, Tom Lane wrote:
> Ankit Kumar Pandey <itsankitkp@gmail.com> writes:
>> Question is: Average of id for partition name (A) should be 6/4 = 1.5
>> for all rows in that partition but this result is seen only at the last
>> one row in partition (A). Am I missing here something?
> https://www.postgresql.org/docs/current/tutorial-window.html
>
> particularly:
>
>      By default, if ORDER BY is supplied then the frame consists of all
>      rows from the start of the partition up through the current row, plus
>      any following rows that are equal to the current row according to the
>      ORDER BY clause.
>
>             regards, tom lane
Thanks, I understand the behavior now.

-- 
Regards,
Ankit Kumar Pandey




Re: [QUESTION] Window function with partition by and order by

From
Ankit Kumar Pandey
Date:
On 27/11/22 21:53, Samed YILDIRIM wrote:
Hello Ankit,

It is absolutely expected behaviour of a window function with ORDER BY clause. The default frame clause of window definition is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. If you add an ORDER BY clause in a window definition, PostgreSQL takes the current row and all rows before it within the partition into calculation. If you don't add, it means all rows within the partition are peers, and PostgreSQL uses all rows for calculation. I'm putting the related part from the documentation and its link below.

The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW; it sets the frame to be all rows from the partition start up through the current row's last peer (a row that the window's ORDER BY clause considers equivalent to the current row; all rows are peers if there is no ORDER BY).
 

Best regards.
Samed YILDIRIM


On Sun, 27 Nov 2022 at 18:08, Ankit Kumar Pandey <itsankitkp@gmail.com> wrote:
Hello,

While looking at aggregates in window function, I found something
unusual and would be glad I could get some clarification.

Consider following table (mytable):

id, name

1, A

1, A

2, B

3, A

1, A


select *, avg(id) over (partition by name, order by id) from mytable;

Output:

id, name, avg

1, A, 1

1, A, 1

1, A, 1

3, A, 1.5

2, B, 2


Question is: Average of id for partition name (A) should be 6/4 = 1.5
for all rows in that partition but this result is seen only at the last
one row in partition (A). Am I missing here something?


Thanks


--
Regards,
Ankit Kumar Pandey



Thanks, this makes sense.
-- 
Regards,
Ankit Kumar Pandey

Re: [QUESTION] Window function with partition by and order by

From
Ankit Kumar Pandey
Date:


On 27/11/22 21:53, Samed YILDIRIM wrote:
Hello Ankit,

It is absolutely expected behaviour of a window function with ORDER BY clause. The default frame clause of window definition is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. If you add an ORDER BY clause in a window definition, PostgreSQL takes the current row and all rows before it within the partition into calculation. If you don't add, it means all rows within the partition are peers, and PostgreSQL uses all rows for calculation. I'm putting the related part from the documentation and its link below.

The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW; it sets the frame to be all rows from the partition start up through the current row's last peer (a row that the window's ORDER BY clause considers equivalent to the current row; all rows are peers if there is no ORDER BY).
 

Best regards.
Samed YILDIRIM


On Sun, 27 Nov 2022 at 18:08, Ankit Kumar Pandey <itsankitkp@gmail.com> wrote:
Hello,

While looking at aggregates in window function, I found something
unusual and would be glad I could get some clarification.

Consider following table (mytable):

id, name

1, A

1, A

2, B

3, A

1, A


select *, avg(id) over (partition by name, order by id) from mytable;

Output:

id, name, avg

1, A, 1

1, A, 1

1, A, 1

3, A, 1.5

2, B, 2


Question is: Average of id for partition name (A) should be 6/4 = 1.5
for all rows in that partition but this result is seen only at the last
one row in partition (A). Am I missing here something?


Thanks


--
Regards,
Ankit Kumar Pandey


Okay, I understand this now.

-- 
Regards,
Ankit Kumar Pandey