Re: Rethinking representation of partial-aggregate steps - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Rethinking representation of partial-aggregate steps
Date
Msg-id 31354.1466702701@sss.pgh.pa.us
Whole thread Raw
In response to Re: Rethinking representation of partial-aggregate steps  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Rethinking representation of partial-aggregate steps  (Robert Haas <robertmhaas@gmail.com>)
Re: Rethinking representation of partial-aggregate steps  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: Rethinking representation of partial-aggregate steps  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jun 23, 2016 at 12:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm inclined to think that we should aggressively simplify here, perhaps
>> replacing all three of these flags with a three-way enum, say
>> 
>> PARTIALAGG_NORMAL       /* simple aggregation */
>> PARTIALAGG_PARTIAL      /* lower phase of partial aggregation */
>> PARTIALAGG_FINAL        /* upper phase of partial aggregation */
>> 
>> This embodies a couple of judgments that maybe someone would quibble with:
>> * we don't have any use for intermediate partial aggregation steps;
>> * we don't have any use for partial aggregation in which data transferred
>> up needn't be serialized.

> Hmm, well I guess I would have to disagree with the idea that those
> other modes aren't useful. I seem to recall that David had some
> performance results showing that pushing partial aggregate steps below
> an Append node resulted in a noticeable speed-up even in the absence
> of any parallelism, presumably because it avoids whatever projection
> the Append might do, and also improves icache and dcache locality.

I don't believe that for one second, because introducing another layer of
intermediate aggregation implies another projection step, plus all the
other overhead of a Plan node.

Also, if we are going to support intermediate partial aggregation, the
current representation is broken anyway, since it lacks any ability
to distinguish serialization for the input states from serialization
for the output states, which is something you'd certainly need to
distinguish in this type of plan structure.

> I do agree, however, that the three Boolean flags don't make the code
> entirely easy to read.  What I might suggest is that we replace the
> three Boolean flags with integer flags, something like this:

Yeah, that's another way we could go.  I had been considering a variant
of that, which was to assign specific code values to the enum constants
and then invent macros that did bit-anding tests on them.  That ends up
being just about what you propose except that the compiler understands
the enum-ness of the behavioral alternatives, which seems like a good
thing.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Bug in to_timestamp().
Next
From: Robert Haas
Date:
Subject: Re: Hash Indexes