[HACKERS] Aggregate transition state merging vs. hypothetical set functions - Mailing list pgsql-hackers

From Tom Lane
Subject [HACKERS] Aggregate transition state merging vs. hypothetical set functions
Date
Msg-id 4834.1507849699@sss.pgh.pa.us
Whole thread Raw
Responses Re: [HACKERS] Aggregate transition state merging vs. hypothetical set functions
Re: [HACKERS] Aggregate transition state merging vs. hypothetical set functions
List pgsql-hackers
I started to look into fixing orderedsetaggs.c so that we could revert
52328727b, and soon found a rather nasty problem.  Although the plain
OSAs seem amenable to supporting multiple finalfn calls on the same
transition state, the "hypothetical set" functions are not at all.
What they do is to accumulate all the regular aggregate input into
a tuplesort, then add the direct arguments as an additional "hypothetical"
row, and finally sort the result.  There's no realistic way of undoing
the addition of the hypothetical row, so these finalfns simply can't
share tuplesort state.

You could imagine accumulating the regular input into a tuplestore
and then copying it into a tuplesort in each finalfn call.  But that
seems pretty icky, and I'm not sure it'd really be any more performant
than just keeping separate tuplesort objects for each aggregate.

So my conclusion is that we *must* teach nodeAgg.c not to merge
transition states for these functions.  But I do not want some
hard-wired rule that all and only hypothetical-set aggregates
are nonmergeable.  Our notion of an AGGKIND_HYPOTHETICAL aggregate
is mostly a syntactical one about requiring there to be direct
argument(s) matching the aggregated argument(s) in type.  I do not
think that that should translate to an assumption about how the
aggregate implementation works (indeed, the CREATE AGGREGATE man
page says in so many words that HYPOTHETICAL affects only type
matching, not runtime behavior).  Moreover, having seen this example,
I think it's likely that there are other cases in which it's more
trouble than it's worth for an aggregate to support merging of
transition states.

Therefore, I think we need to bite the bullet and provide an aggregate
property (CREATE AGGREGATE argument / pg_aggregate column) that tells
whether the aggregate supports transition state merging.  Likely this
should have been in the state-merging patch to begin with, but better
late than never.

The main thing that probably has to be hashed out before we can write
that patch is what the default should be for user-created aggregates.
I am inclined to think that we should err on the side of safety and
default it to false (no merge support).  You could argue that the
lack of complaints since 9.6 came out is sufficient evidence that
defaulting to true would be all right, but I'm not sure.

Comments?
        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType
Next
From: David Rowley
Date:
Subject: Re: [HACKERS] Aggregate transition state merging vs. hypothetical set functions