Re: [HACKERS] Aggregates push-down to partitions - Mailing list pgsql-hackers

From Maksim Milyutin
Subject Re: [HACKERS] Aggregates push-down to partitions
Date
Msg-id 73ca1970-d182-0889-4d63-ec369a68d788@gmail.com
Whole thread Raw
In response to [HACKERS] Aggregates push-down to partitions  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: [HACKERS] Aggregates push-down to partitions  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
Hi Konstantin!


09.11.17 20:14, Konstantin Knizhnik wrote:
> It is still far from ideal plan because each worker is working with 
> all partitions, instead of spitting partitions between workers and 
> calculate partial aggregates for each partition.
>
> But if we add FDW as a child of parent table, then parallel scan can 
> not be used and we get the worst possible plan:
>
> postgres=# create foreign table derived_fdw() inherits(base) server 
> pg_fdw options (table_name 'derived1');CREATE FOREIGN TABLE
> postgres=# explain select sum(x) from base;
>                                     QUERY PLAN
> ---------------------------------------------------------------------------------- 
>
>  Aggregate  (cost=34055.07..34055.08 rows=1 width=8)
>    ->  Append  (cost=0.00..29047.75 rows=2002926 width=4)
>          ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
>          ->  Seq Scan on derived1  (cost=0.00..14425.00 rows=1000000 
> width=4)
>          ->  Seq Scan on derived2  (cost=0.00..14425.00 rows=1000000 
> width=4)
>          ->  Foreign Scan on derived_fdw  (cost=100.00..197.75 
> rows=2925 width=4)
> (6 rows)
>
> So we sequentially pull all data to this node and compute aggregates 
> locally.
> Ideal plan will calculate in parallel partial aggregates at all nodes 
> and then combine partial results.
> It requires two changes:
> 1. Replace Aggregate->Append with 
> Finalize_Aggregate->Append->Partial_Aggregate
> 2. Concurrent execution of Append. It also can be done in two 
> different ways: we can try to use existed parallel workers 
> infrastructure and
> replace Append with Gather. It seems to be the best approach for local 
> partitioning. In case of remote (FDW) partitions, it is enough
> to split starting of execution (PQsendQuery in postgres_fdw) and 
> getting results. So it requires some changes in FDW protocol.
>
>
> I wonder if somebody already investigate this problem or working in 
> this direction.
> May be there are already some patches proposed?
> I have searched hackers archive, but didn't find something relevant...
> Are there any suggestions about the best approach to implement this 
> feature?
>

Maybe in this thread[1] your described problem are solved through 
introducing Parallel Append node?

1. 
https://www.postgresql.org/message-id/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1%2BS%2BvRuUQ%40mail.gmail.com

-- 
Regards,
Maksim Milyutin



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Simplify ACL handling for large objects and removal ofsuperuser() checks
Next
From: Stephen Frost
Date:
Subject: Re: [HACKERS] Simplify ACL handling for large objects and removal ofsuperuser() checks