Re: Performance Issue on Query 18 of TPC-H Benchmark - Mailing list pgsql-bugs

From Ba Jinsheng
Subject Re: Performance Issue on Query 18 of TPC-H Benchmark
Date
Msg-id SEZPR06MB649463AD78819D69D1C060C28A462@SEZPR06MB6494.apcprd06.prod.outlook.com
In response to Re: Performance Issue on Query 18 of TPC-H Benchmark  (Alena Rybakina <a.rybakina@postgrespro.ru>)
Responses Re: Performance Issue on Query 18 of TPC-H Benchmark
List pgsql-bugs
>I would like to know if you can improve that case by switching from the sorted group to a hashed one.

I used this patch to enable only the first HashAggregate in the query plan:
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 0c7273b9cc..b410452df1 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6983,8 +6983,9 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
        bool            can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
        List       *havingQual = (List *) extra->havingQual;
        AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
-
-       if (can_sort)
+       static int call_count = 0;
+       call_count++;
+       if (can_sort && call_count != 2)
        {
                /*
                 * Use any available suitably-sorted path as input, and also consider

And got this query plan:
                                                                                                 QUERY PLAN                                                                                                  
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=693083.48..693083.73 rows=100 width=71) (actual time=2624.282..2629.392 rows=9 loops=1)
   ->  Sort  (cost=693083.48..694352.29 rows=507522 width=71) (actual time=2624.281..2629.390 rows=9 loops=1)
         Sort Key: orders.o_totalprice DESC, orders.o_orderdate
         Sort Method: quicksort  Memory: 25kB
         ->  HashAggregate  (cost=658421.05..673686.36 rows=507522 width=71) (actual time=2624.162..2629.346 rows=9 loops=1)
               Group Key: customer.c_custkey, orders.o_orderkey
               Planned Partitions: 32  Batches: 1  Memory Usage: 793kB
               ->  Gather  (cost=459569.18..608779.05 rows=507522 width=44) (actual time=2623.805..2629.229 rows=63 loops=1)
                     Workers Planned: 2
                     Workers Launched: 2
                     ->  Nested Loop  (cost=458569.18..557026.85 rows=211468 width=44) (actual time=2581.717..2620.494 rows=21 loops=3)
                           ->  Parallel Hash Join  (cost=458568.75..492734.09 rows=52844 width=43) (actual time=2581.704..2620.448 rows=3 loops=3)
                                 Hash Cond: (orders.o_custkey = customer.c_custkey)
                                 ->  Hash Join  (cost=453562.50..487589.13 rows=52844 width=24) (actual time=2541.024..2579.759 rows=3 loops=3)
                                       Hash Cond: (orders.o_orderkey = lineitem_1.l_orderkey)
                                       ->  Parallel Seq Scan on orders  (cost=0.00..32386.00 rows=625000 width=20) (actual time=0.028..32.135 rows=500000 loops=3)
                                       ->  Hash  (cost=451977.19..451977.19 rows=126825 width=4) (actual time=2515.787..2515.788 rows=9 loops=3)
                                             Buckets: 131072  Batches: 1  Memory Usage: 1025kB
                                             ->  GroupAggregate  (cost=0.43..451977.19 rows=126825 width=4) (actual time=608.052..2515.758 rows=9 loops=3)
                                                   Group Key: lineitem_1.l_orderkey
                                                   Filter: (sum(lineitem_1.l_quantity) > '314'::numeric)
                                                   Rows Removed by Filter: 1499991
                                                   ->  Index Scan using lineitem_pkey on lineitem lineitem_1  (cost=0.43..416256.96 rows=6002623 width=9) (actual time=0.043..1399.708 rows=6001215 loops=3)
                                 ->  Parallel Hash  (cost=4225.00..4225.00 rows=62500 width=23) (actual time=39.601..39.602 rows=50000 loops=3)
                                       Buckets: 262144  Batches: 1  Memory Usage: 10304kB
                                       ->  Parallel Seq Scan on customer  (cost=0.00..4225.00 rows=62500 width=23) (actual time=0.032..15.561 rows=50000 loops=3)
                           ->  Index Scan using lineitem_pkey on lineitem  (cost=0.43..1.06 rows=16 width=9) (actual time=0.012..0.014 rows=7 loops=9)
                                 Index Cond: (l_orderkey = orders.o_orderkey)
 Planning Time: 1.850 ms
 Execution Time: 2630.023 ms
(30 rows)
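
For reference, the plan above is for TPC-H Q18 (Large Volume Customer). A sketch of the query's usual form, taken from the TPC-H specification, is below; the quantity threshold of 314 matches the Filter shown in the plan (the official substitution parameter varies between runs):

select
        c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice,
        sum(l_quantity)
from customer, orders, lineitem
where
        o_orderkey in (
                select l_orderkey
                from lineitem
                group by l_orderkey
                having sum(l_quantity) > 314)
        and c_custkey = o_custkey
        and o_orderkey = l_orderkey
group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
order by o_totalprice desc, o_orderdate
limit 100;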

Compared to the query plan with GroupAggregate, both the estimated cost and the execution time are similar; there is no significant difference.



>I think it was more interesting when I turned off parallelism and tried to build a query plan without AQO, and the execution time there was significantly reduced:
Does turning off parallelism alone bring this significant performance improvement?
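
If so, I can try the same comparison here with parallel query disabled for the session, e.g.:

SET max_parallel_workers_per_gather = 0;  -- disable parallel plans (no Gather nodes) for this session
-- then re-run EXPLAIN (ANALYZE) on Q18 and compare the plans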

Best regards,

Jinsheng Ba

 

Notice: This email is generated from the account of an NUS alumnus. Contents, views, and opinions therein are solely those of the sender.
