On 12/7/22 04:22, David G. Johnston wrote: > On Mon, Dec 5, 2022 at 10:40 PM Vik Fearing <vik@postgresfriends.org> wrote: > >> On 12/6/22 05:57, David G. Johnston wrote: >>> On Mon, Dec 5, 2022 at 9:48 PM Vik Fearing <vik@postgresfriends.org> >> wrote: >>> >>>> I can imagine an optimization that would remove an ORDER BY clause >>>> because it isn't needed for any other aggregate. >>> >>> >>> I'm referring to the query: >>> >>> select any_value(v order by v) from (values (2),(1),(3)) as vals (v); >>> // produces 1, per the documented implementation-defined behavior. >> >> Implementation-dependent. It is NOT implementation-defined, per spec. > > I really don't care all that much about the spec here given that ORDER BY > in an aggregate call is non-spec.
Well, this is demonstrably wrong.
<array aggregate function> ::= ARRAY_AGG <left paren> <value expression> [ ORDER BY <sort specification list> ] <right paren>
Demoable only by you and a few others...
We should update our documentation - the source of SQL Standard knowledge for mere mortals.
"Note: The ability to specify both DISTINCT and ORDER BY in an aggregate function is a PostgreSQL extension."
Apparently only DISTINCT remains as our extension.
> You are de-facto creating a first_value aggregate (which is by definition > non-standard) whether you like it or not.
I am de jure creating an any_value aggregate (which is by definition standard) whether you like it or not.
Yes, both statements seem true. At least until we decide to start ignoring a user's explicit order by clause.
>> If you care about which value you get back, use something else. > > There isn't a "something else" to use so that isn't presently an option.
The query
SELECT proposed_first_value(x ORDER BY y) FROM ...
is equivalent to
SELECT (ARRAY_AGG(x ORDER BY y))[1] FROM ...
so I am not very sympathetic to your claim of "no other option".
Semantically, yes, in terms of performance, not so much, for any non-trivial sized group.
I'm done, and apologize for getting too emotionally invested in this. I hope to get others to voice enough +1s to get a first_value function into core along-side this one (which makes the above discussion either moot or deferred until there is a concrete use case for ignoring an explicit ORDER BY). If that doesn't happen, well, it isn't going to make or break us either way.