Re: Performance Issue on Query 18 of TPC-H Benchmark - Mailing list pgsql-bugs

From Alena Rybakina
Subject Re: Performance Issue on Query 18 of TPC-H Benchmark
Date
Msg-id 955eefa0-2158-4604-b79b-68a087646e91@postgrespro.ru
In response to Re: Performance Issue on Query 18 of TPC-H Benchmark  (Ba Jinsheng <bajinsheng@u.nus.edu>)
Responses Re: Performance Issue on Query 18 of TPC-H Benchmark
List pgsql-bugs


On 16.10.2024 17:56, Ba Jinsheng wrote:
> I would like to know if you can improve that case by switching from the sorted grouping to a hashed one.

I used this patch to enable a HashAggregate for only the first aggregation in the query plan:
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 0c7273b9cc..b410452df1 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6983,8 +6983,9 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
        bool            can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
        List       *havingQual = (List *) extra->havingQual;
        AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
-
-       if (can_sort)
+       static int call_count = 0;
+       call_count++;
+       if (can_sort && call_count != 2)
        {
                /*
                 * Use any available suitably-sorted path as input, and also consider

And got this query plan:
                                                                                                 QUERY PLAN                                                                                                  
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=693083.48..693083.73 rows=100 width=71) (actual time=2624.282..2629.392 rows=9 loops=1)
   ->  Sort  (cost=693083.48..694352.29 rows=507522 width=71) (actual time=2624.281..2629.390 rows=9 loops=1)
         Sort Key: orders.o_totalprice DESC, orders.o_orderdate
         Sort Method: quicksort  Memory: 25kB
         ->  HashAggregate  (cost=658421.05..673686.36 rows=507522 width=71) (actual time=2624.162..2629.346 rows=9 loops=1)
               Group Key: customer.c_custkey, orders.o_orderkey
               Planned Partitions: 32  Batches: 1  Memory Usage: 793kB
               ->  Gather  (cost=459569.18..608779.05 rows=507522 width=44) (actual time=2623.805..2629.229 rows=63 loops=1)
                     Workers Planned: 2
                     Workers Launched: 2
                     ->  Nested Loop  (cost=458569.18..557026.85 rows=211468 width=44) (actual time=2581.717..2620.494 rows=21 loops=3)
                           ->  Parallel Hash Join  (cost=458568.75..492734.09 rows=52844 width=43) (actual time=2581.704..2620.448 rows=3 loops=3)
                                 Hash Cond: (orders.o_custkey = customer.c_custkey)
                                 ->  Hash Join  (cost=453562.50..487589.13 rows=52844 width=24) (actual time=2541.024..2579.759 rows=3 loops=3)
                                       Hash Cond: (orders.o_orderkey = lineitem_1.l_orderkey)
                                       ->  Parallel Seq Scan on orders  (cost=0.00..32386.00 rows=625000 width=20) (actual time=0.028..32.135 rows=500000 loops=3)
                                       ->  Hash  (cost=451977.19..451977.19 rows=126825 width=4) (actual time=2515.787..2515.788 rows=9 loops=3)
                                             Buckets: 131072  Batches: 1  Memory Usage: 1025kB
                                             ->  GroupAggregate  (cost=0.43..451977.19 rows=126825 width=4) (actual time=608.052..2515.758 rows=9 loops=3)
                                                   Group Key: lineitem_1.l_orderkey
                                                   Filter: (sum(lineitem_1.l_quantity) > '314'::numeric)
                                                   Rows Removed by Filter: 1499991
                                                   ->  Index Scan using lineitem_pkey on lineitem lineitem_1  (cost=0.43..416256.96 rows=6002623 width=9) (actual time=0.043..1399.708 rows=6001215 loops=3)
                                 ->  Parallel Hash  (cost=4225.00..4225.00 rows=62500 width=23) (actual time=39.601..39.602 rows=50000 loops=3)
                                       Buckets: 262144  Batches: 1  Memory Usage: 10304kB
                                       ->  Parallel Seq Scan on customer  (cost=0.00..4225.00 rows=62500 width=23) (actual time=0.032..15.561 rows=50000 loops=3)
                           ->  Index Scan using lineitem_pkey on lineitem  (cost=0.43..1.06 rows=16 width=9) (actual time=0.012..0.014 rows=7 loops=9)
                                 Index Cond: (l_orderkey = orders.o_orderkey)
 Planning Time: 1.850 ms
 Execution Time: 2630.023 ms
(30 rows)
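
For reference, this is TPC-H Q18 ("Large Volume Customer"). In its standard form, with the quantity threshold substituted as in this run (314), it reads roughly as follows:

-- TPC-H Q18; the HAVING threshold is a substitution parameter (314 in this run).
SELECT c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice,
       sum(l_quantity)
FROM customer, orders, lineitem
WHERE o_orderkey IN (SELECT l_orderkey
                     FROM lineitem
                     GROUP BY l_orderkey
                     HAVING sum(l_quantity) > 314)
  AND c_custkey = o_custkey
  AND o_orderkey = l_orderkey
GROUP BY c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
ORDER BY o_totalprice DESC, o_orderdate
LIMIT 100;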

Compared to the query plan with GroupAggregate, both the estimated cost and the execution time are similar; there is no significant difference.


To be honest, I don't quite understand what you are showing. I believe you are simply not allowing the optimizer to generate the alternative aggregation path (GroupAggregate), because that path requires a sort operation, so I don't think this approach is correct.
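
For what it's worth, the usual way to make such a comparison without patching the planner is the enable_* cost settings, which penalize a strategy but still let the planner cost every path. A minimal psql sketch, assuming q18.sql is a hypothetical file holding EXPLAIN (ANALYZE) followed by the Q18 text quoted above:

-- enable_* GUCs bias path costs; they do not remove paths outright.
SET enable_hashagg = off;   -- steer toward sort-based grouping (GroupAggregate)
\i q18.sql
RESET enable_hashagg;

SET enable_sort = off;      -- discourage explicit sorts, favoring HashAggregate
\i q18.sql
RESET enable_sort;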

> I think it was more interesting when I turned off parallelism and tried to build a query plan without AQO, and the execution time there was significantly reduced:
Does turning off parallelism alone bring this significant performance improvement?


You may notice that disabling parallelism results in improved cardinality estimation and therefore a better query plan, since the optimizer selects paths based on their cost. With parallelism disabled, the query plan becomes more accurate.
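
One can observe this without any patch by comparing per-node estimates with actuals in the serial plan; a sketch, reusing the hypothetical q18.sql from above:

-- max_parallel_workers_per_gather = 0 forces a serial plan (standard GUC).
SET max_parallel_workers_per_gather = 0;
\i q18.sql
RESET max_parallel_workers_per_gather;
-- In the output, compare each node's estimated rows= with its actual rows=;
-- e.g. the HashAggregate above was estimated at 507522 rows but produced 9.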
-- 
Regards,
Alena Rybakina
Postgres Professional
