Re: [HACKERS] PoC: Grouped base relation - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: [HACKERS] PoC: Grouped base relation
Date
Msg-id 24177.1484826929@localhost
Whole thread Raw
In response to Re: [HACKERS] PoC: Grouped base relation  (Antonin Houska <ah@cybertec.at>)
List pgsql-hackers
Antonin Houska <ah@cybertec.at> wrote:

Well, the following one does not seem to be a typical example. I could
generate the plan, but now I think that the aggregation push down does not in
general decrease the number of groups the final aggregation has to
process. Maybe I just hit planner limitation to estimate the number of groups
within append relation.

> For this query
>
> SELECT
>         p.id, sum(price)
> FROM
>         products AS p
>         JOIN sales AS s ON s.product_id = p.id
> GROUP BY
>         p.id
>
> I get this plan at "normal circumstances"
>
>  HashAggregate
>    Group Key: p.id
>    ->  Hash Join
>          Hash Cond: (s.product_id = p.id)
>          ->  Gather
>                Workers Planned: 2
>                ->  Append
>                      ->  Parallel Seq Scan on sales s
>                      ->  Parallel Seq Scan on sales_2015 s_1
>                      ->  Parallel Seq Scan on sales_2016 s_2
>                      ->  Parallel Seq Scan on sales_2017 s_3
>          ->  Hash
>                ->  Gather
>                      Workers Planned: 2
>                      ->  Append
>                            ->  Parallel Seq Scan on products p
>                            ->  Parallel Seq Scan on products_01 p_1
>                            ->  Parallel Seq Scan on products_02 p_2
>                            ->  Parallel Seq Scan on products_03 p_3
>                            ->  Parallel Seq Scan on products_04 p_4
>
>
> but if work_mem is sufficiently low for the hash join to be efficient, the
> aggregation can be moved to individual partitions.
>
>  Gather
>    Workers Planned: 1
>    Single Copy: true
>    ->  Finalize HashAggregate
>          Group Key: p.id
>          ->  Hash Join
>                Hash Cond: (p.id = s.product_id)
>                ->  Append
>                      ->  Partial HashAggregate
>                            Group Key: p.id
>                            ->  Seq Scan on products p
>                      ->  Partial HashAggregate
>                            Group Key: p_1.id
>                            ->  Seq Scan on products_01 p_1
>                      ->  Partial HashAggregate
>                            Group Key: p_2.id
>                            ->  Seq Scan on products_02 p_2
>                      ->  Partial HashAggregate
>                            Group Key: p_3.id
>                            ->  Seq Scan on products_03 p_3
>                      ->  Partial HashAggregate
>                            Group Key: p_4.id
>                            ->  Seq Scan on products_04 p_4
>                ->  Hash
>                      ->  Append
>                            ->  Partial HashAggregate
>                                  Group Key: s.product_id
>                                  ->  Seq Scan on sales s
>                            ->  Partial HashAggregate
>                                  Group Key: s_1.product_id
>                                  ->  Seq Scan on sales_2015 s_1
>                            ->  Partial HashAggregate
>                                  Group Key: s_2.product_id
>                                  ->  Seq Scan on sales_2016 s_2
>                            ->  Partial HashAggregate
>                                  Group Key: s_3.product_id
>                                  ->  Seq Scan on sales_2017 s_3

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



pgsql-hackers by date:

Previous
From: Kuntal Ghosh
Date:
Subject: Re: [HACKERS] parallelize queries containing subplans
Next
From: Rafia Sabih
Date:
Subject: Re: [HACKERS] Parallel Index-only scan