Re: Ordering behavior for aggregates - Mailing list pgsql-hackers

From Vik Fearing
Subject Re: Ordering behavior for aggregates
Date
Msg-id abb988eb-6a88-4b63-4703-36f9a7b97133@postgresfriends.org
Whole thread Raw
In response to Re: Ordering behavior for aggregates  (Magnus Hagander <magnus@hagander.net>)
Responses Re: Ordering behavior for aggregates  (Ronan Dunklau <ronan.dunklau@aiven.io>)
List pgsql-hackers
On 12/13/22 13:55, Magnus Hagander wrote:
> On Tue, Dec 13, 2022 at 1:51 PM Vik Fearing <vik@postgresfriends.org> wrote:
> 
>> The standard only defines an ORDER BY clause inside of an aggregate for
>> ARRAY_AGG().  As an extension to the standard, we allow it for all
>> aggregates, which is very convenient for non-standard things like
>> string_agg().
>>
>> However, it is completely useless for things like AVG() or SUM().  If
>> you include it, the aggregate will do the sort even though it is neither
>> required nor desired.
>>
>> I am proposing something like pg_aggregate.aggordering which would be an
>> enum of behaviors such as f=Forbidden, a=Allowed, r=Required.  Currently
>> all aggregates would have 'a' but I am thinking that a lot of them could
>> be switched to 'f'.  In that case, if a user supplies an ordering, an
>> error is raised.
>>
> 
> Should there perhaps also be an option for "ignored" where we'd allow the
> user to specify it, but not actually do the sort because we know it's
> pointless? Or maybe that should be the behaviour of "forbidden", which
> should then perhaps have a different name?


I did think about that but I can't think of any reason we would want to 
silently ignore something the user has written.  If the ordering doesn't 
make sense, we should forbid it.


> My main motivation behind this is to be able to optimize aggregates that
>> could stop early such as ANY_VALUE(), but also to self-optimize queries
>> written in error (or ignorance).
>>
>> There is recurring demand for a first_agg() of some sort, and that one
>> (whether implemented in core or left to extensions) would use 'r' so
>> that an error is raised if the user does not supply an ordering.
>>
>> I have not started working on this because I envision quite a lot of
>> bikeshedding, but this is the approach I am aiming for.
>>
>> Thoughts?
>>
> 
>   For consistency, should we have a similar flag for DISITNCT? That could be
> interesting to forbid for something like first_agg() wouldn't it? I'm not
> sure what the usecase would be to require it, but maybe there is one?


I thought about that too, but decided it could be a separate patch 
because far fewer aggregates would need it.
-- 
Vik Fearing




pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Ordering behavior for aggregates
Next
From: "houzj.fnst@fujitsu.com"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply