Re: [HACKERS] Partition-wise aggregation/grouping - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Partition-wise aggregation/grouping
Date
Msg-id CA+TgmoauSDNKt50eekshqAbAD6b9=P_YSSQ9g4Lr1yaVq38OyQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Partition-wise aggregation/grouping  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
On Fri, Nov 17, 2017 at 7:24 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> + * this value should be multiplied with cpu_tuple_cost wherever applicable.
> + */
> +#define DEFAULT_APPEND_COST_FACTOR 0.5
>
> I am wondering whether we should just define
> #define APPEND_TUPLE_COST (cpu_tuple_cost * 0.5)
> and use this macro everywhere. What else use DEFAULT_APPEND_COST_FACTOR would
> have other than multiplying with cpu_tuple_cost?

-1.  If you wrap it in a macro like that, future readers of the code
will have to go look up what the macro does.  If you just multiply by
DEFAULT_APPEND_COST_FACTOR it will be clear that's what being used is
a multiple of cpu_tuple_cost.

> I am not sure whether we should be discussing why this technique performs
> better or when it performs better. We don't have similar discussion for
> partition-wise join. That paragraph just describes the technique and may be we
> want to do the same here.

+1.

> + * might be optimal because of presence of suitable paths with pathkeys or
> + * because the hash tables for most of the partitions fit into the memory.
> + * However, if partition keys are not part of the group by clauses, then we
> + * still able to compute the partial aggregation for each partition and then
> + * finalize them over append. This can either win or lose. It may win if the
> + * PartialAggregate stage greatly reduces the number of groups and lose if we
> + * have lots of small groups.
>
> I have not seen prologue of a function implementing a query optimization
> technique explain why that technique improves performance. So I am not sure
> whether the comment should include this explanation. One of the reasons being
> that the reasons why a technique works might change over the period of time
> with the introduction of other techniques, thus obsoleting the comment.  But
> may be it's good to have it here.

+1 for keeping it.

> +        extra.inputRows = 0;    /* Not needed at child paths creation */
>
> Why? Comment should be on its own line.

Comments on same line are fine if they are short enough.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: David CARLIER
Date:
Subject: [PATCH] using arc4random for strong randomness matters.
Next
From: Robert Haas
Date:
Subject: Re: feature request: consume asynchronous notification via a function