Re: TPC-R benchmarks - Mailing list pgsql-performance
From | scott.marlowe |
---|---|
Subject | Re: TPC-R benchmarks |
Date | |
Msg-id | Pine.LNX.4.33.0310021027500.24503-100000@css120.ihs.com Whole thread Raw |
In response to | Re: TPC-R benchmarks (Oleg Lebedev <oleg.lebedev@waterford.org>) |
List | pgsql-performance |
Have you tried increasing the statistics target for those columns that are getting bad estimates yet and then turning back on enable_nestloop and rerunning analyze and seeing how the query does? The idea being to try and get a good enough estimate of your statistics so the planner stops using nestloops on its own rather than forcing it to with enable_nestloop = false. On Thu, 2 Oct 2003, Oleg Lebedev wrote: > As Scott recommended, I did the following: > # set enable_nestloop = false; > # vacuum full analyze; > > After this I re-ran the query and its execution time went down from 2 > hours to 2 minutes. I attached the new query plan to this posting. > Is there any way to optimize it even further? > What should I do to make this query run fast without hurting the > performance of the other queries? > Thanks. > > Oleg > > -----Original Message----- > From: scott.marlowe [mailto:scott.marlowe@ihs.com] > Sent: Wednesday, October 01, 2003 4:00 PM > To: Oleg Lebedev > Cc: Josh Berkus; pgsql-performance@postgresql.org > Subject: Re: [PERFORM] TPC-R benchmarks > > > For troubleshooting, can you try it with "set enable_nestloop = false" > and > rerun the query and see how long it takes? > > It looks like the estimates of rows returned is WAY off (estimate is too > > low compared to what really comes back.) > > Also, you might try to alter the table.column to have a higher target on > > the rows p_partkey and ps_partkey and any others where the estimate is > so > far off of the reality. > > On Wed, 1 Oct 2003, Oleg Lebedev wrote: > > > All right, my query just finished running with EXPLAIN ANALYZE. I show > > > the plan below and also attached it as a file. Any ideas? > > > > -> Sort (cost=54597.49..54597.50 rows=1 width=121) (actual > > time=6674562.03..6674562.15 rows=175 loops=1) > > Sort Key: nation.n_name, date_part('year'::text, > > orders.o_orderdate) > > -> Aggregate (cost=54597.45..54597.48 rows=1 width=121) > > (actual time=6668919.41..6674522.48 rows=175 loops=1) > > -> Group (cost=54597.45..54597.47 rows=3 width=121) > > (actual time=6668872.68..6672136.96 rows=348760 loops=1) > > -> Sort (cost=54597.45..54597.46 rows=3 > > width=121) (actual time=6668872.65..6669499.95 rows=348760 loops=1) > > Sort Key: nation.n_name, > > date_part('year'::text, orders.o_orderdate) > > -> Hash Join (cost=54596.00..54597.42 > > rows=3 > > width=121) (actual time=6632768.89..6650192.67 rows=348760 loops=1) > > Hash Cond: ("outer".n_nationkey = > > "inner".s_nationkey) > > -> Seq Scan on nation > > (cost=0.00..1.25 rows=25 width=33) (actual time=6.75..7.13 rows=25 > > loops=1) > > -> Hash (cost=54596.00..54596.00 > > rows=3 > > width=88) (actual time=6632671.96..6632671.96 rows=0 loops=1) > > -> Nested Loop > > (cost=0.00..54596.00 rows=3 width=88) (actual time=482.41..6630601.46 > > rows=348760 loops=1) > > Join Filter: > > ("inner".s_suppkey = "outer".l_suppkey) > > -> Nested Loop > > (cost=0.00..54586.18 rows=3 width=80) (actual time=383.87..6594984.40 > > rows=348760 loops=1) > > -> Nested Loop > > (cost=0.00..54575.47 rows=4 width=68) (actual time=199.95..3580882.07 > > rows=348760 loops=1) > > Join Filter: > > ("outer".p_partkey = "inner".ps_partkey) > > -> Nested > > Loop (cost=0.00..22753.33 rows=9343 width=49) (actual > > time=146.85..3541433.10 rows=348760 loops=1) > > -> Seq > > > Scan on part (cost=0.00..7868.00 rows=320 width=4) (actual > > time=33.64..15651.90 rows=11637 loops=1) > > > > Filter: (p_name ~~ '%green%'::text) > > -> > > Index Scan using i_l_partkey on lineitem (cost=0.00..46.15 rows=29 > > width=45) (actual time=10.71..302.67 rows=30 loops=11637) > > > > Index > > Cond: ("outer".p_partkey = lineitem.l_partkey) > > -> Index > > Scan using pk_partsupp on partsupp (cost=0.00..3.39 rows=1 width=19) > > (actual time=0.09..0.09 rows=1 loops=348760) > > Index > > Cond: ((partsupp.ps_partkey = "outer".l_partkey) AND > > (partsupp.ps_suppkey = > > "outer".l_suppkey)) > > -> Index Scan > > using pk_orders on orders (cost=0.00..3.01 rows=1 width=12) (actual > > time=8.62..8.62 rows=1 loops=348760) > > Index Cond: > > (orders.o_orderkey = "outer".l_orderkey) > > -> Index Scan using > > pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual > > time=0.08..0.08 rows=1 loops=348760) > > Index Cond: > > ("outer".ps_suppkey = supplier.s_suppkey) Total runtime: 6674724.23 > > msec (28 rows) > > > > > > -----Original Message----- > > From: Oleg Lebedev > > Sent: Wednesday, October 01, 2003 12:00 PM > > To: Josh Berkus; scott.marlowe > > Cc: pgsql-performance@postgresql.org > > Subject: Re: [PERFORM] TPC-R benchmarks > > Importance: Low > > > > > > Sure, below is the query. I attached the plan to this posting. > > > > select > > nation, > > o_year, > > sum(amount) as sum_profit > > from > > ( > > select > > n_name as nation, > > extract(year from o_orderdate) as o_year, > > l_extendedprice * (1 - l_discount) - > > ps_supplycost * l_quantity as amount > > from > > part, > > supplier, > > lineitem, > > partsupp, > > orders, > > nation > > where > > s_suppkey = l_suppkey > > and ps_suppkey = l_suppkey > > and ps_partkey = l_partkey > > and p_partkey = l_partkey > > and o_orderkey = l_orderkey > > and s_nationkey = n_nationkey > > and p_name like '%green%' > > ) as profit > > group by > > nation, > > o_year > > order by > > nation, > > o_year desc; > > > > > > -----Original Message----- > > From: Josh Berkus [mailto:josh@agliodbs.com] > > Sent: Wednesday, October 01, 2003 11:42 AM > > To: Oleg Lebedev; scott.marlowe > > Cc: pgsql-performance@postgresql.org > > Subject: Re: [PERFORM] TPC-R benchmarks > > > > > > Oleg, > > > > > The output of the query should contain about 200 rows. So, I guess > > > the > > > > > planer is off assuming that the query should return 1 row. > > > > Oh, also did you post the query before? Can you re-post it with the > > planner > > results? > > > > > > ************************************* > > This e-mail may contain privileged or confidential material intended for the named recipient only. > If you are not the named recipient, delete this message and all attachments. > Unauthorized reviewing, copying, printing, disclosing, or otherwise using information in this e-mail is prohibited. > We reserve the right to monitor e-mail sent through our network. > > ************************************* >
pgsql-performance by date: