Re: [HACKERS] Partition-wise aggregation/grouping - Mailing list pgsql-hackers

From Jeevan Chalke
Subject Re: [HACKERS] Partition-wise aggregation/grouping
Date
Msg-id CAM2+6=VR-CVW7i6KCq1bxXf7PniGyi_+xTJ36ar9xXr8=kGjkA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Partition-wise aggregation/grouping  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: [HACKERS] Partition-wise aggregation/grouping  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-hackers


On Sun, Nov 12, 2017 at 1:59 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
On 10/27/2017 02:01 PM, Jeevan Chalke wrote:
Hi,

Attached new patch-set here. Changes include:

1. Added separate patch for costing Append node as discussed up-front in the
patch-set.
2. Since we now cost Append node, we don't need partition_wise_agg_cost_factor
GUC. So removed that. The remaining patch hence merged into main implementation
patch.
3. Updated rows in test-cases so that we will get partition-wise plans.

Thanks

I applied partition-wise-agg-v6.tar.gz patch to  the master and use shard.sh example from https://www.postgresql.org/message-id/14577.1509723225%40localhost
Plan for count(*) is the following:

shard=# explain select count(*) from orders;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=100415.29..100415.30 rows=1 width=8)
   ->  Append  (cost=50207.63..100415.29 rows=2 width=8)
         ->  Partial Aggregate  (cost=50207.63..50207.64 rows=1 width=8)
               ->  Foreign Scan on orders_0 (cost=101.00..50195.13 rows=5000 width=0)
         ->  Partial Aggregate  (cost=50207.63..50207.64 rows=1 width=8)
               ->  Foreign Scan on orders_1 (cost=101.00..50195.13 rows=5000 width=0)


We really calculate partial aggregate for each partition, but to do we still have to fetch all data from remote host.
So for foreign partitions such plans is absolutely inefficient.
Amy be it should be combined with some other patch?
For example, with  agg_pushdown_v4.tgz patch https://www.postgresql.org/message-id/14577.1509723225%40localhost ?
But it is not applied after partition-wise-agg-v6.tar.gz patch.
Also postgres_fdw in 11dev is able to push down aggregates without agg_pushdown_v4.tgz patch.

In 0009-Teach-postgres_fdw-to-push-aggregates-for-child-rela.patch
there is the following check:

 /* Partial aggregates are not supported. */
+       if (extra->isPartial)
+           return;

If we just comment this line then produced plan will be the following:

shard=# explain select sum(product_id) from orders;
                           QUERY PLAN
----------------------------------------------------------------
 Finalize Aggregate  (cost=308.41..308.42 rows=1 width=8)
   ->  Append  (cost=144.18..308.41 rows=2 width=8)
         ->  Foreign Scan  (cost=144.18..154.20 rows=1 width=8)
               Relations: Aggregate on (public.orders_0 orders)
         ->  Foreign Scan  (cost=144.18..154.20 rows=1 width=8)
               Relations: Aggregate on (public.orders_1 orders)
(6 rows)

And it is actually desired plan!
Obviously such approach will not always work. FDW really doesn't support partial aggregates now.
But for most frequently used aggregates: sum, min, max, count aggtype==aggtranstype and there is no difference
between partial and normal aggregate calculation.
So instead of (extra->isPartial) condition we can add more complex check which will traverse pathtarget expressions and
check if it can be evaluated in this way. Or... extend FDW API to support partial aggregation.

As explained by Ashutosh Bapat in reply
https://www.postgresql.org/message-id/CAFjFpRdpeMTd8kYbM_x0769V-aEKst5Nkg3+coG=8ki7s8Zqjw@mail.gmail.com
we cannot rely on just aggtype==aggtranstype.

However, I have tried pushing partial aggregation over remote server and also
submitted a PoC patch here:
https://www.postgresql.org/message-id/CAM2+6=UakP9+TSJuh2fbhHWNJc7OYFL1_gvu7mt2fXtVt6GY3g@mail.gmail.com

I have later removed these patches from Partition-wise-Aggregation patch set
as it is altogether a different issue than this mail thread. We might need to
discuss on it separately.
 

But even the last plan is not ideal: it will calculate predicates at each remote node sequentially.
There is parallel append patch:
https://www.postgresql.org/message-id/CAJ3gD9ctEcrVUmpY6fq_JUB6WDKGXAGd70EY68jVFA4kxMbKeQ%40mail.gmail.com
but ... FDW doesn't support parallel scan, so parallel append can not be applied in this case.
And we actually do not need parallel append with all its dynamic workers here.
We just need to start commands at all remote servers and only after it fetch results (which can be done sequentially).

I am investigating problem of efficient execution of OLAP queries on sharded tables (tables with remote partitions).
After reading all this threads and corresponding  patches, it seems to me
that we already have most of parts of the puzzle, what we need is to put them on right places and may be add missed ones.
I wonder if somebody is busy with it and can I somehow help here?

Also I am not quite sure about the best approach with parallel execution of distributed query at all nodes.
Should we make postgres_fdw parallel safe and use parallel append? How difficult it will be?
Or in addition to parallel append we should also have "asynchronous append" which will be able to initiate execution at all nodes?
It seems to be close to merge append, because it should simultaneously traverse all cursors.

Looks like second approach is easier for implementation. But in case of sharded table, distributed query may need to traverse both remote
and local shards and this approach doesn't allow to processed several local shards in parallel.


Interesting idea of "asynchronous append". However, IMHO it deserves its own
email-chain.
 
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Thanks
--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

pgsql-hackers by date:

Previous
From: Aleksander Alekseev
Date:
Subject: Re: [HACKERS] Transform for pl/perl
Next
From: Andreas Joseph Krogh
Date:
Subject: Sv: pspg - psql pager