Thread: excessive performance difference
We're running a query on psql7.2b3, in which planner is unable to find correct way to solve. This is only informative message If we run query19.old it takes more than 7 hours with this plan Aggregate (cost=17310310.04..17310310.04 rows=1 width=116) -> Nested Loop (cost=0.00..17310310.00 rows=15 width=116) -> Seq Scan on part (cost=0.00..140.00 rows=2000 width=36) -> Seq Scan on lineitem (cost=0.00..2325.79 rows=60279 width=80) If we use query19.sql it takes 6 secs and plan is Nested Loop (cost=9770.00..9770.02 rows=1 width=96) (actual time=6538.94..6539.09 rows=1 loops=1) -> Nested Loop (cost=6513.40..6513.41 rows=1 width=64) (actual time=4472.32..4472.41 rows=1 loops=1) -> Subquery Scan resultado (cost=3256.82..3256.82 rows=1 width=30) (actual time=2283.08..2283.11 rows=1 loops=1) -> Aggregate (cost=3256.82..3256.82 rows=1 width=30) (actual time=2283.03..2283.04 rows=1 loops=1) -> Hash Join (cost=175.00..3256.81 rows=1 width=30) (actual time=688.18..2282.70 rows=1 loops=1) -> Seq Scan on lineitem (cost=0.00..3079.28 rows=506 width=26) (actual time=6.29..2217.74 rows=492 loops=1) -> Hash (cost=175.00..175.00 rows=1 width=4) (actual time=55.20..55.20 rows=0 loops=1) -> Seq Scan on part (cost=0.00..175.00 rows=1 width=4) (actual time=42.12..55.11 rows=2 loops=1) -> Subquery Scan resultado2 (cost=3256.59..3256.59 rows=1 width=30) (actual time=2189.15..2189.18 rows=1 loops=1) -> Aggregate (cost=3256.59..3256.59 rows=1 width=30) (actual time=2189.10..2189.11 rows=1 loops=1) -> Hash Join (cost=175.00..3256.58 rows=1 width=30) (actual time=2188.99..2188.99 rows=0 loops=1) -> Seq Scan on lineitem (cost=0.00..3079.28 rows=460 width=26) (actual time=2.53..2124.05 rows=519 loops=1) -> Hash (cost=175.00..175.00 rows=1 width=4) (actual time=54.97..54.97 rows=0 loops=1) -> Seq Scan on part (cost=0.00..175.00 rows=1 width=4) (actual time=21.05..54.85 rows=3 loops=1) -> Subquery Scan resultado3 (cost=3256.59..3256.59 rows=1 width=30) (actual time=2066.41..2066.44 rows=1 loops=1) -> Aggregate (cost=3256.59..3256.59 rows=1 width=30) (actual time=2066.36..2066.37 rows=1 loops=1) -> Hash Join (cost=175.01..3256.59 rows=1 width=30) (actual time=2066.24..2066.24 rows=0 loops=1) -> Seq Scan on lineitem (cost=0.00..3079.28 rows=460 width=26) (actual time=2.25..2001.36 rows=483 loops=1) -> Hash (cost=175.00..175.00 rows=2 width=4) (actual time=55.48..55.48 rows=0 loops=1) -> Seq Scan on part (cost=0.00..175.00 rows=2 width=4) (actual time=17.21..55.34 rows=3 loops=1) explain SELECT sum(lineitem.extendedprice*(1-lineitem.discount))AS revenue FROM part, lineitem WHERE ( part.partkey=lineitem.partkey AND part.brand='Brand#12' AND part.container IN('SM CASE','SM BOX','SM PACK','SM PKG') AND lineitem.quantity>=1 AND lineitem.quantity<=1+10 AND part.size BETWEEN 1 AND 5 AND lineitem.shipmode IN ('AIR','AIR REG') AND lineitem.shipinstruct='DELIVER IN PERSON' ) OR ( part.partkey=lineitem.partkey AND part.brand='Brand#23' AND part.container IN('MED BAG','MED BOX','MED PACK','MED PKG') AND lineitem.quantity>=10 AND lineitem.quantity<=10+10 AND part.size BETWEEN 1 AND 10 AND lineitem.shipmode IN ('AIR','AIR REG') AND lineitem.shipinstruct='DELIVER IN PERSON' ) OR ( part.partkey=lineitem.partkey AND part.brand='Brand#34' AND part.container IN('LG CASE','LG BOX','LG PACK','LG PKG') AND lineitem.quantity>=20 AND lineitem.quantity<=20+10 AND part.size BETWEEN 1 AND 15 AND lineitem.shipmode IN ('AIR','AIR REG') AND lineitem.shipinstruct='DELIVER IN PERSON' ); SELECT (suma+suma1+suma2)AS revenue FROM (SELECT CASE WHEN sum(extendedprice*(1-discount))>0 THEN sum(extendedprice*(1-discount)) ELSE 0 END AS suma FROM part, lineitem WHERE ( part.partkey=lineitem.partkey AND part.brand='Brand#12' AND part.container IN('SM CASE','SM BOX','SM PACK','SM PKG') AND lineitem.quantity>=1 AND lineitem.quantity<=1+10 AND part.size BETWEEN 1 AND 5 AND lineitem.shipmode IN ('AIR','AIR REG') AND lineitem.shipinstruct='DELIVER IN PERSON' ))AS resultado, (SELECT CASE WHEN sum(extendedprice*(1-discount))>0 THEN sum(extendedprice*(1-discount)) ELSE 0 END AS suma1 FROM part, lineitem WHERE ( part.partkey=lineitem.partkey AND part.brand='Brand#23' AND part.container IN('MED BAG','MED BOX','MED PACK','MED PKG') AND lineitem.quantity>=10 AND lineitem.quantity<=20 AND part.size BETWEEN 1 AND 10 AND lineitem.shipmode IN ('AIR','AIR REG') AND lineitem.shipinstruct='DELIVER IN PERSON' ))AS resultado2, (SELECT CASE WHEN sum(extendedprice*(1-discount))>0 THEN sum(extendedprice*(1-discount)) ELSE 0 END AS suma2 FROM part, lineitem WHERE ( part.partkey=lineitem.partkey AND part.brand='Brand#34' AND part.container IN('LG CASE','LG BOX','LG PACK','LG PKG') AND lineitem.quantity>=20 AND lineitem.quantity<=30 AND part.size BETWEEN 1 AND 15 AND lineitem.shipmode IN ('AIR','AIR REG') AND lineitem.shipinstruct='DELIVER IN PERSON' ))AS resultado3;
On Thu, 13 Dec 2001 13:35:16 +0100 Luis Amigo <lamigo@atc.unican.es> wrote: > We're running a query on psql7.2b3, in which planner is unable to find > correct way to solve. > This is only informative message > If we run query19.old it takes more than 7 hours with this plan > > Aggregate (cost=17310310.04..17310310.04 rows=1 width=116) > -> Nested Loop (cost=0.00..17310310.00 rows=15 width=116) > -> Seq Scan on part (cost=0.00..140.00 rows=2000 width=36) > -> Seq Scan on lineitem (cost=0.00..2325.79 rows=60279 > width=80) > > If we use query19.sql it takes 6 secs and plan is > > Nested Loop (cost=9770.00..9770.02 rows=1 width=96) (actual > time=6538.94..6539.09 rows=1 loops=1) I found that the query, query19.old, has no hints to run the planner effectively. I would think you need to change a bit like this: explain SELECT sum(lineitem.extendedprice*(1-lineitem.discount))AS revenue FROM part, lineitem WHERE part.partkey=lineitem.partkey -- change AND (( part.brand='Brand#12' AND part.container IN('SM CASE','SM BOX','SM PACK','SM PKG') AND lineitem.quantity>=1 AND lineitem.quantity<=1+10 AND part.size BETWEEN 1 AND 5 AND lineitem.shipmode IN ('AIR','AIR REG') AND lineitem.shipinstruct='DELIVER IN PERSON' ) OR ( part.brand='Brand#23' AND part.container IN('MED BAG','MED BOX','MED PACK','MED PKG') AND lineitem.quantity>=10 AND lineitem.quantity<=10+10 AND part.size BETWEEN 1 AND 10 AND lineitem.shipmode IN ('AIR','AIR REG') AND lineitem.shipinstruct='DELIVER IN PERSON' ) OR ( part.brand='Brand#34' AND part.container IN('LG CASE','LG BOX','LG PACK','LG PKG') AND lineitem.quantity>=20 AND lineitem.quantity<=20+10 AND part.size BETWEEN 1 AND 15 AND lineitem.shipmode IN ('AIR','AIR REG') AND lineitem.shipinstruct='DELIVER IN PERSON' )); NOTICE: QUERY PLAN: Aggregate (cost=344.66..344.66 rows=1 width=72) -> Merge Join (cost=139.66..344.66 rows=1 width=72) -> Sort (cost=69.83..69.83 rows=1000 width=32) -> Seq Scan on part (cost=0.00..20.00 rows=1000 width=32) -> Sort (cost=69.83..69.83 rows=1000 width=40) -> Seq Scan on lineitem (cost=0.00..20.00 rows=1000 width=40) AFAICT, you probably might use indices to remove the two sorts from the previous result of the query plan. create index idx_part_partkey on part (partkey); create index idx_lineitem_partkey on lineitem (partkey); NOTICE: QUERY PLAN: Aggregate (cost=323.01..323.01 rows=1 width=72) -> Merge Join (cost=0.00..323.01 rows=1 width=72) -> Index Scan using idx_part_partkey on part (cost=0.00..59.00 rows=1000 width=32) -> Index Scan using idx_lineitem_partkey on lineitem (cost=0.00..59.00 rows=1000 width=40) Regards, Masaru Sugawara
Masaru Sugawara wrote: > > I found that the query, query19.old, has no hints to run the planner > effectively. I would think you need to change a bit like this: Thank you for your answer Masaru, the only thing I wanted to communicate is that query19.old works perfectly with pay bases, and someone could get stucked with the same problem. What you are getting outside is same condition that is inside of all, partkey is primary key, so it is not neccesary to create index. query19.sql gives better performance than yours, it is pseudo intra-paralellism, if anyone is interested here are the results tpch=# \e /disco3/lamigo/tpch/consultas/query19.old NOTICE: QUERY PLAN: Aggregate (cost=6073.03..6073.03 rows=1 width=116) (actual time=6865.81..6865.81 rows=1 loops=1) -> Hash Join (cost=145.00..6073.03 rows=1 width=116) (actual time=2057.06..6865.58 rows=1 loops=1) -> Seq Scan on lineitem (cost=0.00..2326.05 rows=60305 width=80) (actual time=0.38..3244.77 rows=60279 loops=1) -> Hash (cost=140.00..140.00 rows=2000 width=36) (actual time=132.55..132.55 rows=0 loops=1) -> Seq Scan on part (cost=0.00..140.00 rows=2000 width=36) (actual time=13.66..98.65 rows=2000 loops=1) Total runtime: 6867.08 msec EXPLAIN tpch=# \e /disco3/lamigo/tpch/consultas/query19.sql NOTICE: QUERY PLAN: Nested Loop (cost=9771.17..9771.19 rows=1 width=96) (actual time=6539.58..6539.74 rows=1 loops=1) -> Nested Loop (cost=6513.97..6513.98 rows=1 width=64) (actual time=4254.88..4254.98 rows=1 loops=1) -> Subquery Scan resultado2 (cost=3257.16..3257.16 rows=1 width=30) (actual time=2185.23..2185.27 rows=1 loops=1) -> Aggregate (cost=3257.16..3257.16 rows=1 width=30) (actual time=2185.19..2185.20 rows=1 loops=1) -> Hash Join (cost=175.00..3257.16 rows=1 width=30) (actual time=2185.07..2185.07 rows=0 loops=1) -> Seq Scan on lineitem (cost=0.00..3079.86 rows=458 width=26) (actual time=2.50..2119.31 rows=519 loops=1) -> Hash (cost=175.00..175.00 rows=1 width=4) (actual time=55.40..55.40 rows=0 loops=1) -> Seq Scan on part (cost=0.00..175.00 rows=1 width=4) (actual time=21.11..55.27 rows=3 loops=1) -> Subquery Scan resultado3 (cost=3256.81..3256.81 rows=1 width=30) (actual time=2069.55..2069.59 rows=1 loops=1) -> Aggregate (cost=3256.81..3256.81 rows=1 width=30) (actual time=2069.51..2069.51 rows=1 loops=1) -> Hash Join (cost=175.01..3256.81 rows=1 width=30) (actual time=2069.39..2069.39 rows=0 loops=1) -> Seq Scan on lineitem (cost=0.00..3079.86 rows=387 width=26) (actual time=2.24..2004.59 rows=483 loops=1) -> Hash (cost=175.00..175.00 rows=2 width=4) (actual time=55.00..55.00 rows=0 loops=1) -> Seq Scan on part (cost=0.00..175.00 rows=2 width=4) (actual time=17.20..54.87 rows=3 loops=1) -> Subquery Scan resultado (cost=3257.19..3257.19 rows=1 width=30) (actual time=2284.52..2284.55 rows=1 loops=1) -> Aggregate (cost=3257.19..3257.19 rows=1 width=30) (actual time=2284.47..2284.48 rows=1 loops=1) -> Hash Join (cost=175.00..3257.19 rows=1 width=30) (actual time=688.98..2284.13 rows=1 loops=1) -> Seq Scan on lineitem (cost=0.00..3079.86 rows=464 width=26) (actual time=6.31..2219.45 rows=492 loops=1) -> Hash (cost=175.00..175.00 rows=1 width=4) (actual time=54.99..54.99 rows=0 loops=1) -> Seq Scan on part (cost=0.00..175.00 rows=1 width=4) (actual time=42.02..54.89 rows=2 loops=1) Total runtime: 6541.95 msec EXPLAIN tpch=# create index index_partkey on lineitem (partkey int4_ops); CREATE tpch=# analyze; ANALYZE tpch=# \e /disco3/lamigo/tpch/consultas/query19.old ANALYZE tpch=# \i /disco3/lamigo/tpch/consultas/query19.old psql:/disco3/lamigo/tpch/consultas/query19.old:35: NOTICE: QUERY PLAN: Aggregate (cost=6183.50..6183.50 rows=1 width=116) (actual time=6816.22..6816.22 rows=1 loops=1) -> Hash Join (cost=145.00..6183.50 rows=1 width=116) (actual time=2013.52..6815.98 rows=1 loops=1) -> Seq Scan on lineitem (cost=0.00..2326.05 rows=60305 width=80) (actual time=0.38..3220.05 rows=60279 loops=1) -> Hash (cost=140.00..140.00 rows=2000 width=36) (actual time=108.44..108.44 rows=0 loops=1) -> Seq Scan on part (cost=0.00..140.00 rows=2000 width=36) (actual time=1.02..76.31 rows=2000 loops=1) Total runtime: 6817.15 msec Thank you and regards. Luis Amigo Universidad de Cantabria