Re: [HACKERS] Partition-wise aggregation/grouping - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Partition-wise aggregation/grouping
Date
Msg-id CA+TgmoYawC9Lg2kj8579FMcBONRbrqPb8JndGO309o6ArjrgJw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Partition-wise aggregation/grouping  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: [HACKERS] Partition-wise aggregation/grouping  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
On Thu, Mar 8, 2018 at 2:45 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> +        grouped_rel->part_scheme = input_rel->part_scheme;
> +        grouped_rel->nparts = nparts;
> +        grouped_rel->boundinfo = input_rel->boundinfo;
> +        grouped_rel->part_rels = part_rels;
>
> You need to set the part_exprs which will provide partition keys for this
> partitioned relation. I think, we should include all the part_exprs of
> input_rel which are part of GROUP BY clause. Since any other expressions in
> part_exprs are not part of GROUP BY clause, they can not appear in the
> targetlist without an aggregate on top. So they can't be part of the partition
> keys of the grouped relation.
>
> In create_grouping_paths() we fetch both partial as well as fully grouped rel
> for given input relation. But in case of partial aggregation, we don't need
> fully grouped rel since we are not computing full aggregates for the children.
> Since fetch_upper_rel() creates a relation when one doesn't exist, we are
> unnecessarily creating fully grouped rels in this case. For thousands of
> partitions that's a lot of memory wasted.
>
> I see a similar issue with create_grouping_paths() when we are computing only
> full aggregates (either because partial aggregation is not possible or because
> parallelism is not possible). In that case, we unconditionally create partially
> grouped rels. That too would waste a lot of memory.

This kind of goes along with the suggestion I made yesterday to
introduce a new function, which at the time I proposed calling
initialize_grouping_rel(), to set up new grouped or partially grouped
relations.  By doing that it would be easier to ensure the
initialization is always done in a consistent way but only for the
relations we actually need.  But maybe we should call it
fetch_grouping_rel() instead.  The idea would be that instead of
calling fetch_upper_rel() we would call fetch_grouping_rel() when it
is a question of the grouped or partially grouped relation.  It would
either return the existing relation or initialize a new one for us.  I
think that would make it fairly easy to initialize only the ones we're
going to need.

Also, I don't think we should be paranoid about memory usage here.
It's good to avoid creating new rels that are obviously not needed,
not only because of memory consumption but also because of the CPU
consumption involved, but I don't want to contort the code to squeeze
every last byte of memory out of this.

On a related note, I'm not sure that this code is correct:

+       if (!isPartialAgg)
+       {
+               grouped_rel->part_scheme = input_rel->part_scheme;
+               grouped_rel->nparts = nparts;
+               grouped_rel->boundinfo = input_rel->boundinfo;
+               grouped_rel->part_rels = part_rels;
+       }

It's not obvious to me why this should be done only when
!isPartialAgg.  The comments claim that the partially grouped child
rels can't be considered partitions of the top-level partitially
grouped rel, but it seems to me that we could consider them that way.
Maybe I'm missing something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Julian Markwort
Date:
Subject: Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)
Next
From: Masahiko Sawada
Date:
Subject: Re: [HACKERS] Subscription code improvements