Re: Ordering behavior for aggregates - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Ordering behavior for aggregates
Date
Msg-id CABUevExHK_s6Ow5B6S2Dr93JiQ11oTLQ5YsnrwcRS7MOZ+k+Tw@mail.gmail.com
Whole thread Raw
In response to Ordering behavior for aggregates  (Vik Fearing <vik@postgresfriends.org>)
Responses Re: Ordering behavior for aggregates  (Vik Fearing <vik@postgresfriends.org>)
List pgsql-hackers
On Tue, Dec 13, 2022 at 1:51 PM Vik Fearing <vik@postgresfriends.org> wrote:
The standard only defines an ORDER BY clause inside of an aggregate for
ARRAY_AGG().  As an extension to the standard, we allow it for all
aggregates, which is very convenient for non-standard things like
string_agg().

However, it is completely useless for things like AVG() or SUM().  If
you include it, the aggregate will do the sort even though it is neither
required nor desired.

I am proposing something like pg_aggregate.aggordering which would be an
enum of behaviors such as f=Forbidden, a=Allowed, r=Required.  Currently
all aggregates would have 'a' but I am thinking that a lot of them could
be switched to 'f'.  In that case, if a user supplies an ordering, an
error is raised.

Should there perhaps also be an option for "ignored" where we'd allow the user to specify it, but not actually do the sort because we know it's pointless? Or maybe that should be the behaviour of "forbidden", which should then perhaps have a different name?


My main motivation behind this is to be able to optimize aggregates that
could stop early such as ANY_VALUE(), but also to self-optimize queries
written in error (or ignorance).

There is recurring demand for a first_agg() of some sort, and that one
(whether implemented in core or left to extensions) would use 'r' so
that an error is raised if the user does not supply an ordering.

I have not started working on this because I envision quite a lot of
bikeshedding, but this is the approach I am aiming for.

Thoughts?

 For consistency, should we have a similar flag for DISITNCT? That could be interesting to forbid for something like first_agg() wouldn't it? I'm not sure what the usecase would be to require it, but maybe there is one?

--

pgsql-hackers by date:

Previous
From: Vik Fearing
Date:
Subject: Ordering behavior for aggregates
Next
From: Vik Fearing
Date:
Subject: Re: Ordering behavior for aggregates