Thread: Window functions: frame-adhering aggregate without ORDER BY clause

From: Romain Carl

Hi Listers,

among the window tests (src/test/regress/expected/window.out), I noticed 
the presence of tests that rely upon the order of rows not determined by 
any ORDER BY clause, such as:

SELECT sum(unique1) over (rows between 2 preceding and 2 following exclude no others),
     unique1, four
FROM tenk1 WHERE unique1 < 10;

Expected result:

 sum | unique1 | four
-----+---------+------
   7 |       4 |    0
  13 |       2 |    2
  22 |       1 |    1
  26 |       6 |    2
  29 |       9 |    1
  31 |       8 |    0
  32 |       5 |    1
  23 |       3 |    3
  15 |       7 |    3
  10 |       0 |    0
(10 rows)

The current row's frame and, consequently, the result of the sum 
aggregate depend on the order produced by the sequential scan of table 
tenk1. Since such order is, in general, not part of PG's defined 
behavior, what purpose do the tests that rely upon it serve?
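
To make the dependence on row order concrete, here is a minimal sketch in Python (a model of the frame arithmetic, not PostgreSQL code). The helper name rows_frame_sums is hypothetical, and it covers only ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING with no exclusion. Given unique1 in the order shown in the expected output, it reproduces the sum column; given the same values in sorted order, the sums change.

```python
def rows_frame_sums(values, preceding=2, following=2):
    """Model sum(...) OVER (ROWS BETWEEN preceding PRECEDING AND following FOLLOWING).

    `values` is the column in the order the rows reach the window node.
    """
    sums = []
    for i in range(len(values)):
        lo = max(0, i - preceding)                # frame start, clamped to the partition
        hi = min(len(values), i + following + 1)  # frame end (exclusive), clamped
        sums.append(sum(values[lo:hi]))
    return sums


# unique1 in the order the rows appear in the expected output above
scan_order = [4, 2, 1, 6, 9, 8, 5, 3, 7, 0]

print(rows_frame_sums(scan_order))
# [7, 13, 22, 26, 29, 31, 32, 23, 15, 10]

# Same rows, different arrival order: every frame, and so every sum, changes.
print(rows_frame_sums(sorted(scan_order)))
# [3, 6, 10, 15, 20, 25, 30, 35, 30, 24]
```

The first list matches the sum column of the expected result, which shows that the test output encodes the scan order of tenk1.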

Following up on that, how is EXCLUDE GROUP defined to behave in the
absence of any ORDER BY clause? It seems to exclude the entire window
frame according to this test:

SELECT sum(unique1) over (rows between 2 preceding and 2 following exclude group),
     unique1, four
FROM tenk1 WHERE unique1 < 10;

Expected result:

 sum | unique1 | four
-----+---------+------
     |       4 |    0
     |       2 |    2
     |       1 |    1
     |       6 |    2
     |       9 |    1
     |       8 |    0
     |       5 |    1
     |       3 |    3
     |       7 |    3
     |       0 |    0
(10 rows)

Thanks in advance and best regards,
Romain

Re: Window functions: frame-adhering aggregate without ORDER BY clause

From: Tom Lane

Romain Carl <romaincarl@aol.com> writes:
> among the window tests (src/test/regress/expected/window.out), I noticed 
> the presence of tests that rely upon the order of rows not determined by 
> any ORDER BY clause, such as:

Yeah ...

> The current row's frame and, consequently, the result of the sum 
> aggregate depend on the order produced by the sequential scan of table 
> tenk1. Since such order is, in general, not part of PG's defined 
> behavior, what purpose do the tests that rely upon it serve?

The tests are perfectly entitled to test PG's actual behavior.
I don't see much difference between this particular case and the
fact that we have any tests at all that lack ORDER BY, because
formally speaking the engine could choose to emit the rows in
some other order.  In practice, if we ever did make the engine
behave differently, it'd be on us to fix affected test cases.

> Following up on that, how is EXCLUDE GROUP defined to behave in the
> absence of any ORDER BY clause?

I see in the docs

    <literal>EXCLUDE GROUP</literal> excludes the current row and its
    ordering peers from the frame.

and a bit later

    Without <literal>ORDER BY</literal>,
    ... all rows become peers of the current row.

so excluding the whole frame seems like the right behavior.
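
The two documentation sentences combine as follows, sketched here in Python as a model only (not PostgreSQL's implementation). The name frame_sums and its parameters are hypothetical. Peers are rows whose ORDER BY key equals the current row's key, and the absence of ORDER BY is modelled by giving every row the same key, so all rows are peers. An empty frame yields None, standing in for SQL NULL.

```python
def frame_sums(values, sort_keys, exclude="no others", preceding=2, following=2):
    """Model sum(...) OVER (ROWS BETWEEN ... EXCLUDE <exclude>).

    sort_keys[j] is row j's ORDER BY key. Pass identical keys for every
    row to model a window with no ORDER BY (all rows are peers).
    """
    results = []
    for i in range(len(values)):
        lo = max(0, i - preceding)
        hi = min(len(values), i + following + 1)
        kept = []
        for j in range(lo, hi):
            if exclude == "current row" and j == i:
                continue  # drop only the current row
            if exclude == "group" and sort_keys[j] == sort_keys[i]:
                continue  # drop the current row and all of its peers
            kept.append(values[j])
        # sum() over zero rows is NULL in SQL; None stands in for it here
        results.append(sum(kept) if kept else None)
    return results


scan_order = [4, 2, 1, 6, 9, 8, 5, 3, 7, 0]
no_order_by = [None] * len(scan_order)  # identical keys: every row is a peer

print(frame_sums(scan_order, no_order_by, exclude="group"))
# [None, None, None, None, None, None, None, None, None, None]

# With distinct ordering keys, each row's peer group is just itself,
# so EXCLUDE GROUP removes only the current row from the frame.
ordered = sorted(scan_order)
print(frame_sums(ordered, ordered, exclude="group"))
# [3, 5, 8, 12, 16, 20, 24, 28, 22, 15]
```

The all-None result corresponds to the empty sum column in the expected output quoted earlier in the thread.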

            regards, tom lane



Re: Window functions: frame-adhering aggregate without ORDER BY clause

From: Romain Carl

Alright, this makes sense. Thank you for the quick response!

Best regards,
Romain Carl

On 26.06.23 15:54, Tom Lane wrote:
> so excluding the whole frame seems like the right behavior.



Re: Window functions: frame-adhering aggregate without ORDER BY clause

From: Erik Brandsberg

I would add that tests which pin down undocumented behavior are a good
thing. I have seen many cases where changing such behavior broke
applications that (unfortunately) depended on it. With the tests in
place, whoever breaks them has to make a conscious decision about
whether the change is worth the risk.

On Mon, Jun 26, 2023 at 11:14 AM Romain Carl <romaincarl@aol.com> wrote:
Alright, this makes sense. Thank you for the quick response!

Best regards,
Romain Carl
