Re: Performance Issue on Query 18 of TPC-H Benchmark - Mailing list pgsql-bugs
From | Alena Rybakina |
---|---|
Subject | Re: Performance Issue on Query 18 of TPC-H Benchmark |
Date | |
Msg-id | 955eefa0-2158-4604-b79b-68a087646e91@postgrespro.ru Whole thread Raw |
In response to | Re: Performance Issue on Query 18 of TPC-H Benchmark (Ba Jinsheng <bajinsheng@u.nus.edu>) |
Responses |
Re: Performance Issue on Query 18 of TPC-H Benchmark
|
List | pgsql-bugs |
On 16.10.2024 17:56, Ba Jinsheng wrote:
To be honest, I don't quite understand what you are showing. I believe you are not allowing the optimizer to generate a different aggregation path (Group Aggregate) because it requires a sort operation. So I think this is not correct.P {margin-top:0;margin-bottom:0;} >I would like to know if you can improve that case by switching from the sorted group to a hashed one.I used this patch to enable the first HashAggregate only in the query plan:diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.cindex 0c7273b9cc..b410452df1 100644--- a/src/backend/optimizer/plan/planner.c+++ b/src/backend/optimizer/plan/planner.c@@ -6983,8 +6983,9 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,bool can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;List *havingQual = (List *) extra->havingQual;AggClauseCosts *agg_final_costs = &extra->agg_final_costs;-- if (can_sort)+ static int call_count = 0;+ call_count++;+ if (can_sort && call_count != 2){/** Use any available suitably-sorted path as input, and also considerAnd got this query plan:QUERY PLAN-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=693083.48..693083.73 rows=100 width=71) (actual time=2624.282..2629.392 rows=9 loops=1)-> Sort (cost=693083.48..694352.29 rows=507522 width=71) (actual time=2624.281..2629.390 rows=9 loops=1)Sort Key: orders.o_totalprice DESC, orders.o_orderdateSort Method: quicksort Memory: 25kB-> HashAggregate (cost=658421.05..673686.36 rows=507522 width=71) (actual time=2624.162..2629.346 rows=9 loops=1)Group Key: customer.c_custkey, orders.o_orderkeyPlanned Partitions: 32 Batches: 1 Memory Usage: 793kB-> Gather (cost=459569.18..608779.05 rows=507522 width=44) (actual time=2623.805..2629.229 rows=63 loops=1)Workers Planned: 2Workers Launched: 2-> Nested Loop (cost=458569.18..557026.85 rows=211468 width=44) (actual time=2581.717..2620.494 rows=21 loops=3)-> Parallel Hash Join (cost=458568.75..492734.09 rows=52844 width=43) (actual time=2581.704..2620.448 rows=3 loops=3)Hash Cond: (orders.o_custkey = customer.c_custkey)-> Hash Join (cost=453562.50..487589.13 rows=52844 width=24) (actual time=2541.024..2579.759 rows=3 loops=3)Hash Cond: (orders.o_orderkey = lineitem_1.l_orderkey)-> Parallel Seq Scan on orders (cost=0.00..32386.00 rows=625000 width=20) (actual time=0.028..32.135 rows=500000 loops=3)-> Hash (cost=451977.19..451977.19 rows=126825 width=4) (actual time=2515.787..2515.788 rows=9 loops=3)Buckets: 131072 Batches: 1 Memory Usage: 1025kB-> GroupAggregate (cost=0.43..451977.19 rows=126825 width=4) (actual time=608.052..2515.758 rows=9 loops=3)Group Key: lineitem_1.l_orderkeyFilter: (sum(lineitem_1.l_quantity) > '314'::numeric)Rows Removed by Filter: 1499991-> Index Scan using lineitem_pkey on lineitem lineitem_1 (cost=0.43..416256.96 rows=6002623 width=9) (actual time=0.043..1399.708 rows=6001215 loops=3)-> Parallel Hash (cost=4225.00..4225.00 rows=62500 width=23) (actual time=39.601..39.602 rows=50000 loops=3)Buckets: 262144 Batches: 1 Memory Usage: 10304kB-> Parallel Seq Scan on customer (cost=0.00..4225.00 rows=62500 width=23) (actual time=0.032..15.561 rows=50000 loops=3)-> Index Scan using lineitem_pkey on lineitem (cost=0.43..1.06 rows=16 width=9) (actual time=0.012..0.014 rows=7 loops=9)Index Cond: (l_orderkey = orders.o_orderkey)Planning Time: 1.850 msExecution Time: 2630.023 ms(30 rows)
Compared to the query plan with GroupAggregate, both estimated cost and execution time are similar and have no significant difference.
You may notice that disabling parallelism results in improved cardinality estimation and therefore a better query plan, since the optimizer selects paths based on their cost. If parallelism is disabled, query plan have become more correct.>I think it was more interesting when I turned off parallelism and tried to build a query plan without AQO, and the execution time there was significantly reduced:Turning off parallelism only brings this significant performance improvement?
-- Regards, Alena Rybakina Postgres Professional
pgsql-bugs by date: