Performance Issue on Query 18 of TPC-H Benchmark - Mailing list pgsql-bugs

From Ba Jinsheng
Subject Performance Issue on Query 18 of TPC-H Benchmark
Date
Msg-id SEZPR06MB6494B9CADF2FB15CDBCC7BBF8A452@SEZPR06MB6494.apcprd06.prod.outlook.com
Whole thread Raw
Responses Re: Performance Issue on Query 18 of TPC-H Benchmark
Re: Performance Issue on Query 18 of TPC-H Benchmark
Re: Performance Issue on Query 18 of TPC-H Benchmark
List pgsql-bugs
For this query 18 of TPC-H benchmark:
select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from CUSTOMER, ORDERS, LINEITEM where o_orderkey in ( select l_orderkey from LINEITEM group by l_orderkey having sum(l_quantity) > 314 ) and c_custkey = o_custkey and o_orderkey = l_orderkey group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice order by o_totalprice desc, o_orderdate limit 100;


Based on 1GB data (https://drive.google.com/file/d/1ZBLHanIRwxbaMQIhRUSPv4I7y8g_0AWi/view?usp=drive_link), its query plan (EXPLAIN ANALYZE) is as follows:
                                                                                                       QUERY PLAN                                                                                                        
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=666783.65..666783.90 rows=100 width=71) (actual time=2511.074..2517.146 rows=9 loops=1)
   ->  Sort  (cost=666783.65..668052.46 rows=507522 width=71) (actual time=2511.073..2517.144 rows=9 loops=1)
         Sort Key: orders.o_totalprice DESC, orders.o_orderdate
         Sort Method: quicksort  Memory: 25kB
         ->  Finalize GroupAggregate  (cost=583237.80..647386.53 rows=507522 width=71) (actual time=2511.057..2517.137 rows=9 loops=1)
               Group Key: customer.c_custkey, orders.o_orderkey
               ->  Gather Merge  (cost=583237.80..636813.14 rows=422936 width=71) (actual time=2511.053..2517.128 rows=9 loops=1)
                     Workers Planned: 2
                     Workers Launched: 2
                     ->  Partial GroupAggregate  (cost=582237.78..586995.81 rows=211468 width=71) (actual time=2507.789..2507.795 rows=3 loops=3)
                           Group Key: customer.c_custkey, orders.o_orderkey
                           ->  Sort  (cost=582237.78..582766.45 rows=211468 width=44) (actual time=2507.783..2507.786 rows=21 loops=3)
                                 Sort Key: customer.c_custkey, orders.o_orderkey
                                 Sort Method: quicksort  Memory: 26kB
                                 Worker 0:  Sort Method: quicksort  Memory: 26kB
                                 Worker 1:  Sort Method: quicksort  Memory: 25kB
                                 ->  Nested Loop  (cost=458569.18..557026.85 rows=211468 width=44) (actual time=2464.821..2507.757 rows=21 loops=3)
                                       ->  Parallel Hash Join  (cost=458568.75..492734.09 rows=52844 width=43) (actual time=2464.793..2507.699 rows=3 loops=3)
                                             Hash Cond: (orders.o_custkey = customer.c_custkey)
                                             ->  Hash Join  (cost=453562.50..487589.13 rows=52844 width=24) (actual time=2433.507..2476.356 rows=3 loops=3)
                                                   Hash Cond: (orders.o_orderkey = lineitem_1.l_orderkey)
                                                   ->  Parallel Seq Scan on orders  (cost=0.00..32386.00 rows=625000 width=20) (actual time=0.024..33.467 rows=500000 loops=3)
                                                   ->  Hash  (cost=451977.19..451977.19 rows=126825 width=4) (actual time=2412.836..2412.836 rows=9 loops=3)
                                                         Buckets: 131072  Batches: 1  Memory Usage: 1025kB
                                                         ->  GroupAggregate  (cost=0.43..451977.19 rows=126825 width=4) (actual time=708.272..2412.797 rows=9 loops=3)
                                                               Group Key: lineitem_1.l_orderkey
                                                               Filter: (sum(lineitem_1.l_quantity) > '314'::numeric)
                                                               Rows Removed by Filter: 1499991
                                                               ->  Index Scan using lineitem_pkey on lineitem lineitem_1  (cost=0.43..416256.96 rows=6002623 width=9) (actual time=0.052..1331.820 rows=6001215 loops=3)
                                             ->  Parallel Hash  (cost=4225.00..4225.00 rows=62500 width=23) (actual time=30.683..30.683 rows=50000 loops=3)
                                                   Buckets: 262144  Batches: 1  Memory Usage: 10304kB
                                                   ->  Parallel Seq Scan on customer  (cost=0.00..4225.00 rows=62500 width=23) (actual time=0.019..11.368 rows=50000 loops=3)
                                       ->  Index Scan using lineitem_pkey on lineitem  (cost=0.43..1.06 rows=16 width=9) (actual time=0.016..0.017 rows=7 loops=9)
                                             Index Cond: (l_orderkey = orders.o_orderkey)
 Planning Time: 0.833 ms
 Execution Time: 2517.189 ms
(36 rows)


While I found that if we disable the path sorting here:
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 0c7273b9cc..91320f473a 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6928,7 +6928,7 @@ make_ordered_path(PlannerInfo *root, RelOptInfo *rel, Path *path,
                                                                                        path->pathkeys,
                                                                                        &presorted_keys);
 
-       if (!is_sorted)
+       if (false)
        {
                /*
                 * Try at least sorting the cheapest path and also try incrementally

Both the performance and estimated cost are reduced around 40%:
                                                                                         QUERY PLAN                                                                                        
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=390889.59..390889.84 rows=100 width=71) (actual time=1652.572..1655.298 rows=9 loops=1)
   ->  Sort  (cost=390889.59..392158.39 rows=507522 width=71) (actual time=1652.571..1655.296 rows=9 loops=1)
         Sort Key: orders.o_totalprice DESC, orders.o_orderdate
         Sort Method: quicksort  Memory: 25kB
         ->  Finalize GroupAggregate  (cost=215938.45..371492.46 rows=507522 width=71) (actual time=1651.864..1655.256 rows=9 loops=1)
               Group Key: customer.c_custkey, orders.o_orderkey
               ->  Gather  (cost=215938.45..360919.08 rows=422936 width=71) (actual time=1651.670..1655.245 rows=9 loops=1)
                     Workers Planned: 2
                     Workers Launched: 2
                     ->  Partial GroupAggregate  (cost=214938.45..317625.48 rows=211468 width=71) (actual time=1616.827..1648.580 rows=3 loops=3)
                           Group Key: customer.c_custkey, orders.o_orderkey
                           ->  Nested Loop  (cost=214938.45..313396.12 rows=211468 width=44) (actual time=1609.796..1648.571 rows=21 loops=3)
                                 ->  Parallel Hash Join  (cost=214938.02..249103.36 rows=52844 width=43) (actual time=1609.777..1648.532 rows=3 loops=3)
                                       Hash Cond: (orders.o_custkey = customer.c_custkey)
                                       ->  Hash Join  (cost=209931.77..243958.39 rows=52844 width=24) (actual time=1573.950..1612.634 rows=3 loops=3)
                                             Hash Cond: (orders.o_orderkey = lineitem_1.l_orderkey)
                                             ->  Parallel Seq Scan on orders  (cost=0.00..32386.00 rows=625000 width=20) (actual time=0.016..32.211 rows=500000 loops=3)
                                             ->  Hash  (cost=208346.45..208346.45 rows=126825 width=4) (actual time=1549.849..1549.849 rows=9 loops=3)
                                                   Buckets: 131072  Batches: 1  Memory Usage: 1025kB
                                                   ->  GroupAggregate  (cost=0.00..208346.45 rows=126825 width=4) (actual time=334.397..1549.837 rows=9 loops=3)
                                                         Group Key: lineitem_1.l_orderkey
                                                         Filter: (sum(lineitem_1.l_quantity) > '314'::numeric)
                                                         Rows Removed by Filter: 1500388
                                                         ->  Seq Scan on lineitem lineitem_1  (cost=0.00..172626.23 rows=6002623 width=9) (actual time=0.051..438.226 rows=6001215 loops=3)
                                       ->  Parallel Hash  (cost=4225.00..4225.00 rows=62500 width=23) (actual time=34.762..34.763 rows=50000 loops=3)
                                             Buckets: 262144  Batches: 1  Memory Usage: 10304kB
                                             ->  Parallel Seq Scan on customer  (cost=0.00..4225.00 rows=62500 width=23) (actual time=0.011..10.406 rows=50000 loops=3)
                                 ->  Index Scan using lineitem_pkey on lineitem  (cost=0.43..1.06 rows=16 width=9) (actual time=0.009..0.011 rows=7 loops=9)
                                       Index Cond: (l_orderkey = orders.o_orderkey)
 Planning Time: 1.644 ms
 Execution Time: 1655.490 ms
(31 rows)

The major differerence between both query plans is the first one has additional SORT. I believe the second query plan is more efficient. Similar to my last report, perhaps we can optimize code to enable it.

I also tried 10 GB data, in which the execution time is reduced from 30s to 16s, but the estimated cost is increased. I can provide more info if you need.


Best regards,

Jinsheng Ba

 

Notice: This email is generated from the account of an NUS alumnus. Contents, views, and opinions therein are solely those of the sender.

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #18656: "STABLE" function sometimes does not see changes
Next
From: David Rowley
Date:
Subject: Re: Performance Issue on Query 18 of TPC-H Benchmark