Thread: Window functions: frame-adhering aggregate without ORDER BY clause
Hi Listers,

among the window tests (src/test/regress/expected/window.out), I noticed
the presence of tests that rely upon the order of rows not determined by
any ORDER BY clause, such as:

    SELECT sum(unique1) over (rows between 2 preceding and 2 following exclude no others),
        unique1, four
    FROM tenk1 WHERE unique1 < 10;

Expected result:

     sum | unique1 | four
    -----+---------+------
       7 |       4 |    0
      13 |       2 |    2
      22 |       1 |    1
      26 |       6 |    2
      29 |       9 |    1
      31 |       8 |    0
      32 |       5 |    1
      23 |       3 |    3
      15 |       7 |    3
      10 |       0 |    0
    (10 rows)

The current row's frame and, consequently, the result of the sum
aggregate depend on the order produced by the sequential scan of table
tenk1. Since such order is, in general, not part of PG's defined
behavior, what purpose do the tests that rely upon it serve?

Following up to that, how is an EXCLUDE GROUP defined to behave in
absence of any ORDER BY clause? It seems to exclude the entire window
frame according to this test:

    SELECT sum(unique1) over (rows between 2 preceding and 2 following exclude group),
        unique1, four
    FROM tenk1 WHERE unique1 < 10;

Expected result:

     sum | unique1 | four
    -----+---------+------
         |       4 |    0
         |       2 |    2
         |       1 |    1
         |       6 |    2
         |       9 |    1
         |       8 |    0
         |       5 |    1
         |       3 |    3
         |       7 |    3
         |       0 |    0
    (10 rows)

Thanks in advance and best regards,
Romain
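To make the dependence on row order concrete, here is a minimal sketch in Python that models a "rows between 2 preceding and 2 following" frame over a list of values. It is only an illustration, not PostgreSQL's implementation; the function name rows_frame_sums and the sorted alternative order are assumptions made for the example. Fed with unique1 in the order shown in the expected output, it reproduces the sum column; fed with the same values in a different order, it gives different sums.

```python
def rows_frame_sums(values, preceding=2, following=2):
    """Sum each row's ROWS frame: up to `preceding` rows before the
    current row through up to `following` rows after it."""
    sums = []
    for i in range(len(values)):
        lo = max(0, i - preceding)                 # frame start, clamped to the first row
        hi = min(len(values), i + following + 1)   # frame end (exclusive), clamped to the last row
        sums.append(sum(values[lo:hi]))
    return sums


# unique1 in the order shown in the expected output of the first test
scan_order = [4, 2, 1, 6, 9, 8, 5, 3, 7, 0]
print(rows_frame_sums(scan_order))
# → [7, 13, 22, 26, 29, 31, 32, 23, 15, 10]

# Hypothetical alternative order (the same rows, sorted): the sums differ
print(rows_frame_sums(sorted(scan_order)))
# → [3, 6, 10, 15, 20, 25, 30, 35, 30, 24]
```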
Romain Carl <romaincarl@aol.com> writes:
> among the window tests (src/test/regress/expected/window.out), I noticed
> the presence of tests that rely upon the order of rows not determined by
> any ORDER BY clause, such as:

Yeah ...

> The current row's frame and, consequently, the result of the sum
> aggregate depend on the order produced by the sequential scan of table
> tenk1. Since such order is, in general, not part of PG's defined
> behavior, what purpose do the tests that rely upon it serve?

The tests are perfectly entitled to test PG's actual behavior.
I don't see much difference between this particular case and the
fact that we have any tests at all that lack ORDER BY, because
formally speaking the engine could choose to emit the rows in
some other order. In practice, if we ever did make the engine
behave differently, it'd be on us to fix affected test cases.

> Following up to that, how is an EXCLUDE GROUP defined to behave in
> absence of any ORDER BY clause?

I see in the docs

    <literal>EXCLUDE GROUP</literal> excludes the current row and its
    ordering peers from the frame.

and a bit later

    Without <literal>ORDER BY</literal>,
    ... all rows become peers of the current row.

so excluding the whole frame seems like the right behavior.

regards, tom lane
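The peer rule quoted from the docs can be sketched in a few lines of Python. This is a simplified model written for illustration, not PostgreSQL's code: the function name sum_exclude_group and the peer_key parameter are assumptions, with peer_key=None standing for a window that has no ORDER BY (every row is a peer of the current row) and SQL NULL modelled as None.

```python
def sum_exclude_group(values, peer_key=None, preceding=2, following=2):
    """Model sum() over a ROWS frame with EXCLUDE GROUP.

    `values` must already be in window order. peer_key=None models a window
    without ORDER BY, where all rows are peers of the current row.
    """
    results = []
    for i, current in enumerate(values):
        frame = values[max(0, i - preceding): i + following + 1]
        if peer_key is None:
            kept = []  # no ORDER BY: the whole frame is the current row's peer group
        else:
            # drop the current row and every row sharing its ordering key
            kept = [v for v in frame if peer_key(v) != peer_key(current)]
        # sum() over an empty frame is NULL in SQL, represented here as None
        results.append(sum(kept) if kept else None)
    return results


scan_order = [4, 2, 1, 6, 9, 8, 5, 3, 7, 0]

# No ORDER BY: every frame is emptied, matching the all-NULL sum column
print(sum_exclude_group(scan_order))
# → [None, None, None, None, None, None, None, None, None, None]

# Hypothetical contrast: ordering by a unique key leaves each row as its
# own peer group, so only the current row is removed from its frame
print(sum_exclude_group(sorted(scan_order), peer_key=lambda v: v))
# → [3, 5, 8, 12, 16, 20, 24, 28, 22, 15]
```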
Alright, this makes sense. Thank you for the quick response!

Best regards,
Romain Carl

On 26.06.23 15:54, Tom Lane wrote:
> Romain Carl <romaincarl@aol.com> writes:
>> among the window tests (src/test/regress/expected/window.out), I noticed
>> the presence of tests that rely upon the order of rows not determined by
>> any ORDER BY clause, such as:
> Yeah ...
>
>> The current row's frame and, consequently, the result of the sum
>> aggregate depend on the order produced by the sequential scan of table
>> tenk1. Since such order is, in general, not part of PG's defined
>> behavior, what purpose do the tests that rely upon it serve?
> The tests are perfectly entitled to test PG's actual behavior.
> I don't see much difference between this particular case and the
> fact that we have any tests at all that lack ORDER BY, because
> formally speaking the engine could choose to emit the rows in
> some other order. In practice, if we ever did make the engine
> behave differently, it'd be on us to fix affected test cases.
>
>> Following up to that, how is an EXCLUDE GROUP defined to behave in
>> absence of any ORDER BY clause?
> I see in the docs
>
> <literal>EXCLUDE GROUP</literal> excludes the current row and its
> ordering peers from the frame.
>
> and a bit later
>
> Without <literal>ORDER BY</literal>,
> ... all rows become peers of the current row.
>
> so excluding the whole frame seems like the right behavior.
>
> regards, tom lane
I would add to this that tests which pin down undocumented behaviors are a good thing. I have seen many cases where changing such a behavior broke applications that (unfortunately) depended on it. With the tests in place, a change that breaks them forces someone to make an explicit decision about whether the change is worth the risk.
On Mon, Jun 26, 2023 at 11:14 AM Romain Carl <romaincarl@aol.com> wrote:
Alright, this makes sense. Thank you for the quick response!
Best regards,
Romain Carl