Re: Add proper planner support for ORDER BY / DISTINCT aggregates - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: Add proper planner support for ORDER BY / DISTINCT aggregates
Date
Msg-id CAEZATCVG8SZhxr6JDnPH2ktgDznktcNitZBn9dB7mb20+f4GRw@mail.gmail.com
Whole thread Raw
In response to Re: Add proper planner support for ORDER BY / DISTINCT aggregates  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Add proper planner support for ORDER BY / DISTINCT aggregates
List pgsql-hackers
On Wed, 18 Jan 2023 at 09:49, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Wed, 18 Jan 2023 at 22:37, Richard Guo <guofenglinux@gmail.com> wrote:
> > I'm still confused that when the same scenario happens with ORDER BY in
> > an aggregate function, like in query 1, the result is different in that
> > we would get an unsorted output.
> >
> > I wonder if we should avoid this inconsistent behavior.
>
> It certainly seems pretty strange that aggregates with an ORDER BY
> behave differently from the query's ORDER BY. I'd have expected that
> to be the same.  I've not looked to see why there's a difference, but
> suspect that we thought about how we want it to work for the query's
> ORDER BY and when ORDER BY aggregates were added, that behaviour was
> not considered.
>

I think the behaviour of an ORDER BY in the query can also be pretty
surprising. For example, consider:

SELECT ARRAY[random(), random(), random()]
FROM generate_series(1, 3);

                            array
-------------------------------------------------------------
 {0.2335800863701647,0.14688842754711273,0.2975659224823368}
 {0.10616525384762876,0.8371175798972244,0.2936178886154661}
 {0.21679841321788262,0.5254761982948826,0.7789412240118161}
(3 rows)

which produces 9 different random values, as expected, and compare that to:

SELECT ARRAY[random(), random(), random()]
FROM generate_series(1, 3)
ORDER BY random();

                             array
---------------------------------------------------------------
 {0.01952216253949679,0.01952216253949679,0.01952216253949679}
 {0.6735145595500629,0.6735145595500629,0.6735145595500629}
 {0.9406665780147616,0.9406665780147616,0.9406665780147616}
(3 rows)

which now only has 3 distinct random values. It's pretty
counterintuitive that adding an ORDER BY clause changes the contents
of the rows returned, not just their order.

The trouble is, if we tried to fix that, we'd risk changing some other
behaviour that users may have come to rely on.

Regards,
Dean



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: [DOCS] Stats views and functions not in order?
Next
From: Peter Eisentraut
Date:
Subject: Re: heapgettup refactoring