Home > mailing lists

Performance Issue on Query 18 of TPC-H Benchmark - Mailing list pgsql-bugs

From	Ba Jinsheng
Subject	Performance Issue on Query 18 of TPC-H Benchmark
Date	October 16 00:28:47
Msg-id	SEZPR06MB6494B9CADF2FB15CDBCC7BBF8A452@SEZPR06MB6494.apcprd06.prod.outlook.com Whole thread Raw
Responses	Re: Performance Issue on Query 18 of TPC-H Benchmark Re: Performance Issue on Query 18 of TPC-H Benchmark Re: Performance Issue on Query 18 of TPC-H Benchmark
List	pgsql-bugs

Tree view

For this query 18 of TPC-H benchmark:

select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from CUSTOMER, ORDERS, LINEITEM where o_orderkey in ( select l_orderkey from LINEITEM group by l_orderkey having sum(l_quantity) > 314 ) and c_custkey = o_custkey and o_orderkey = l_orderkey group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice order by o_totalprice desc, o_orderdate limit 100;

Based on 1GB data (https://drive.google.com/file/d/1ZBLHanIRwxbaMQIhRUSPv4I7y8g_0AWi/view?usp=drive_link), its query plan (EXPLAIN ANALYZE) is as follows:

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=666783.65..666783.90 rows=100 width=71) (actual time=2511.074..2517.146 rows=9 loops=1)

-> Sort (cost=666783.65..668052.46 rows=507522 width=71) (actual time=2511.073..2517.144 rows=9 loops=1)

Sort Key: orders.o_totalprice DESC, orders.o_orderdate

Sort Method: quicksort Memory: 25kB

-> Finalize GroupAggregate (cost=583237.80..647386.53 rows=507522 width=71) (actual time=2511.057..2517.137 rows=9 loops=1)

Group Key: customer.c_custkey, orders.o_orderkey

-> Gather Merge (cost=583237.80..636813.14 rows=422936 width=71) (actual time=2511.053..2517.128 rows=9 loops=1)

Workers Planned: 2

Workers Launched: 2

-> Partial GroupAggregate (cost=582237.78..586995.81 rows=211468 width=71) (actual time=2507.789..2507.795 rows=3 loops=3)

Group Key: customer.c_custkey, orders.o_orderkey

-> Sort (cost=582237.78..582766.45 rows=211468 width=44) (actual time=2507.783..2507.786 rows=21 loops=3)

Sort Key: customer.c_custkey, orders.o_orderkey

Sort Method: quicksort Memory: 26kB

Worker 0: Sort Method: quicksort Memory: 26kB

Worker 1: Sort Method: quicksort Memory: 25kB

-> Nested Loop (cost=458569.18..557026.85 rows=211468 width=44) (actual time=2464.821..2507.757 rows=21 loops=3)

-> Parallel Hash Join (cost=458568.75..492734.09 rows=52844 width=43) (actual time=2464.793..2507.699 rows=3 loops=3)

Hash Cond: (orders.o_custkey = customer.c_custkey)

-> Hash Join (cost=453562.50..487589.13 rows=52844 width=24) (actual time=2433.507..2476.356 rows=3 loops=3)

Hash Cond: (orders.o_orderkey = lineitem_1.l_orderkey)

-> Parallel Seq Scan on orders (cost=0.00..32386.00 rows=625000 width=20) (actual time=0.024..33.467 rows=500000 loops=3)

-> Hash (cost=451977.19..451977.19 rows=126825 width=4) (actual time=2412.836..2412.836 rows=9 loops=3)

Buckets: 131072 Batches: 1 Memory Usage: 1025kB

-> GroupAggregate (cost=0.43..451977.19 rows=126825 width=4) (actual time=708.272..2412.797 rows=9 loops=3)

Group Key: lineitem_1.l_orderkey

Filter: (sum(lineitem_1.l_quantity) > '314'::numeric)

Rows Removed by Filter: 1499991

-> Index Scan using lineitem_pkey on lineitem lineitem_1 (cost=0.43..416256.96 rows=6002623 width=9) (actual time=0.052..1331.820 rows=6001215 loops=3)

-> Parallel Hash (cost=4225.00..4225.00 rows=62500 width=23) (actual time=30.683..30.683 rows=50000 loops=3)

Buckets: 262144 Batches: 1 Memory Usage: 10304kB

-> Parallel Seq Scan on customer (cost=0.00..4225.00 rows=62500 width=23) (actual time=0.019..11.368 rows=50000 loops=3)

-> Index Scan using lineitem_pkey on lineitem (cost=0.43..1.06 rows=16 width=9) (actual time=0.016..0.017 rows=7 loops=9)

Index Cond: (l_orderkey = orders.o_orderkey)

Planning Time: 0.833 ms

Execution Time: 2517.189 ms

(36 rows)

While I found that if we disable the path sorting here:

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c

index 0c7273b9cc..91320f473a 100644

--- a/src/backend/optimizer/plan/planner.c

+++ b/src/backend/optimizer/plan/planner.c

@@ -6928,7 +6928,7 @@ make_ordered_path(PlannerInfo *root, RelOptInfo *rel, Path *path,

path->pathkeys,

&presorted_keys);

- if (!is_sorted)

+ if (false)

{

* Try at least sorting the cheapest path and also try incrementally

Both the performance and estimated cost are reduced around 40%:

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=390889.59..390889.84 rows=100 width=71) (actual time=1652.572..1655.298 rows=9 loops=1)

-> Sort (cost=390889.59..392158.39 rows=507522 width=71) (actual time=1652.571..1655.296 rows=9 loops=1)

Sort Key: orders.o_totalprice DESC, orders.o_orderdate

Sort Method: quicksort Memory: 25kB

-> Finalize GroupAggregate (cost=215938.45..371492.46 rows=507522 width=71) (actual time=1651.864..1655.256 rows=9 loops=1)

Group Key: customer.c_custkey, orders.o_orderkey

-> Gather (cost=215938.45..360919.08 rows=422936 width=71) (actual time=1651.670..1655.245 rows=9 loops=1)

Workers Planned: 2

Workers Launched: 2

-> Partial GroupAggregate (cost=214938.45..317625.48 rows=211468 width=71) (actual time=1616.827..1648.580 rows=3 loops=3)

Group Key: customer.c_custkey, orders.o_orderkey

-> Nested Loop (cost=214938.45..313396.12 rows=211468 width=44) (actual time=1609.796..1648.571 rows=21 loops=3)

-> Parallel Hash Join (cost=214938.02..249103.36 rows=52844 width=43) (actual time=1609.777..1648.532 rows=3 loops=3)

Hash Cond: (orders.o_custkey = customer.c_custkey)

-> Hash Join (cost=209931.77..243958.39 rows=52844 width=24) (actual time=1573.950..1612.634 rows=3 loops=3)

Hash Cond: (orders.o_orderkey = lineitem_1.l_orderkey)

-> Parallel Seq Scan on orders (cost=0.00..32386.00 rows=625000 width=20) (actual time=0.016..32.211 rows=500000 loops=3)

-> Hash (cost=208346.45..208346.45 rows=126825 width=4) (actual time=1549.849..1549.849 rows=9 loops=3)

Buckets: 131072 Batches: 1 Memory Usage: 1025kB

-> GroupAggregate (cost=0.00..208346.45 rows=126825 width=4) (actual time=334.397..1549.837 rows=9 loops=3)

Group Key: lineitem_1.l_orderkey

Filter: (sum(lineitem_1.l_quantity) > '314'::numeric)

Rows Removed by Filter: 1500388

-> Seq Scan on lineitem lineitem_1 (cost=0.00..172626.23 rows=6002623 width=9) (actual time=0.051..438.226 rows=6001215 loops=3)

-> Parallel Hash (cost=4225.00..4225.00 rows=62500 width=23) (actual time=34.762..34.763 rows=50000 loops=3)

Buckets: 262144 Batches: 1 Memory Usage: 10304kB

-> Parallel Seq Scan on customer (cost=0.00..4225.00 rows=62500 width=23) (actual time=0.011..10.406 rows=50000 loops=3)

-> Index Scan using lineitem_pkey on lineitem (cost=0.43..1.06 rows=16 width=9) (actual time=0.009..0.011 rows=7 loops=9)

Index Cond: (l_orderkey = orders.o_orderkey)

Planning Time: 1.644 ms

Execution Time: 1655.490 ms

(31 rows)

The major differerence between both query plans is the first one has additional SORT. I believe the second query plan is more efficient. Similar to my last report, perhaps we can optimize code to enable it.

I also tried 10 GB data, in which the execution time is reduced from 30s to 16s, but the estimated cost is increased. I can provide more info if you need.

Best regards,

Jinsheng Ba

Notice: This email is generated from the account of an NUS alumnus. Contents, views, and opinions therein are solely those of the sender.

pgsql-bugs by date:

From: Tom Lane
Date: 15 October, 23:51:15
Subject: Re: BUG #18656: "STABLE" function sometimes does not see changes

From: David Rowley
Date: 16 October, 03:56:08
Subject: Re: Performance Issue on Query 18 of TPC-H Benchmark

Performance Issue on Query 18 of TPC-H Benchmark - Mailing list pgsql-bugs

Previous

Next