On 12/6/22 05:22, David G. Johnston wrote: > On Mon, Dec 5, 2022 at 8:46 PM Vik Fearing <vik@postgresfriends.org> wrote: > >> On 12/5/22 18:56, David G. Johnston wrote: >>> Also, maybe we should have any_value do something like compute a 50/50 >>> chance that any new value seen replaces the existing chosen value, >> instead >>> of simply returning the first value all the time. Maybe even prohibit >> the >>> first value from being chosen so long as a second value appears. >> >> The spec says the result is implementation-dependent meaning we don't >> even need to document how it is obtained, but surely behavior like this >> would preclude future optimizations like the ones I mentioned? >> > > So, given the fact that we don't actually want to name a function > first_value (because some users are readily confused as to when the concept > of first is actually valid or not) but some users do actually wish for this > functionality - and you are proposing to implement it here anyway - how > about we actually do document that we promise to return the first non-null > value encountered by the aggregate. We can then direct people to this > function and just let them know to pretend the function is really named > first_value in the case where they specify an order by. (last_value comes > for basically free with descending sorting).
I can imagine an optimization that would remove an ORDER BY clause because it isn't needed for any other aggregate.
I'm referring to the query:
select any_value(v order by v) from (values (2),(1),(3)) as vals (v);
// produces 1, per the documented implementation-defined behavior.
Someone writing:
select any_value(v) from (values (2),(1),(3)) as vals (v) order by v;
Is not presently, nor am I saying, promised the value 1.
I'm assuming you are thinking of the second query form, while the guarantee only needs to apply to the first.