Re: ANY_VALUE aggregate - Mailing list pgsql-hackers

From Vik Fearing
Subject Re: ANY_VALUE aggregate
Date
Msg-id 15c31fa6-b84a-8705-f5a2-987d165a89db@postgresfriends.org
Whole thread Raw
In response to Re: ANY_VALUE aggregate  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: ANY_VALUE aggregate
Re: ANY_VALUE aggregate
List pgsql-hackers
On 1/23/23 08:50, David Rowley wrote:
> On Thu, 19 Jan 2023 at 06:01, Vik Fearing <vik@postgresfriends.org> wrote:
>> Thank you for the review.  Attached is a new version rebased to d540a02a72.
> 
> I've only a bunch of nit-picks, personal preferences and random
> thoughts to offer as a review:
> 
> 1. I'd be inclined *not* to mention the possible future optimisation in:
> 
> + * Currently this just returns the first value, but in the future it might be
> + * able to signal to the aggregate that it does not need to be called anymore.
> 
> I think it's unlikely that the transfn would "signal" such a thing. It
> seems more likely if we did anything about it that nodeAgg.c would
> maybe have some additional knowledge not to call that function if the
> agg state already has a value. Just so we're not preempting how we
> might do such a thing in the future, it seems best just to remove the
> mention of it. I don't really think it serves as a good reminder that
> we might want to do this one day anyway.

Modified.  My logic in having the transition function signal that it is 
finished is to one day allow something like:

   array_agg(x order by y limit z)

> 2. +any_value_trans(PG_FUNCTION_ARGS)
> 
> Many of transition function names end in "transfn", not "trans". I
> think it's better to follow the existing (loosely followed) naming
> pattern that a few aggregates seem to follow rather than invent a new
> one.

Renamed.

> 3. I tend to try to copy the capitalisation of keywords from the
> surrounding regression tests. I see the following breaks that.
> 
> +SELECT any_value(v) FILTER (WHERE v > 2) FROM (VALUES (1), (2), (3)) AS v (v);
> 
> (obviously, ideally, we'd always just follow the same capitalisation
> of keywords everywhere in each .sql file, but we've long broken that
> and the best way can do is be consistent with surrounding tests)

Downcased.

> 4. I think I'd use the word "Returns" instead of "Chooses" in:
> 
> +        Chooses a non-deterministic value from the non-null input values.

Done.

> 5. I've not managed to find a copy of the 2023 draft, so I'm assuming
> you've got the ignoring of NULLs correct.

Yes, I do.  This is part of <computational operation>, so SQL:2016 10.9 
GR 7.a applies.

> 6. Is it worth adding a WindowFunc test somewhere in window.sql with
> an any_value(...) over (...)?  Is what any_value() returns as a
> WindowFunc equally as non-deterministic as when it's used as an
> Aggref? Can we assume there's no guarantee that it'll return the same
> value for each partition in each row? Does the spec mention anything
> about that?

This is governed by SQL:2016 10.9 GR 1.d and 1.e which defines the 
source rows for the aggregate: either a group or a window frame.  There 
is no difference in behavior.  I don't think a windowed test is useful 
here unless I were to implement moving transitions.  I think that might 
be overkill for this function.

> 7. I wondered if it's worth adding a
> SupportRequestOptimizeWindowClause support function for this
> aggregate. I'm thinking that it might not be as likely people would
> use something more specific like first_value/nth_value/last_value
> instead of using any_value as a WindowFunc. Also, I'm currently
> thinking that a SupportRequestWFuncMonotonic for any_value() is not
> worth the dozen or so lines of code it would take to write it. I'm
> assuming it would always be a MONOTONICFUNC_BOTH function. It seems
> unlikely that someone would have a subquery with a WHERE clause in the
> upper-level query referencing the any_value() aggregate.  Thought I'd
> mention both of these things anyway as someone else might think of
> some good reason we should add them that I didn't think of.

I thought about this for a while and decided that it was not worthwhile.

v4 attached.  I am putting this back to Needs Review in the commitfest 
app, but these changes were editorial so it is probably RfC like Peter 
had set it.  I will let you be the judge of that.
-- 
Vik Fearing

Attachment

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Improve logging when using Huge Pages
Next
From: Ajin Cherian
Date:
Subject: Re: Support logical replication of DDLs