Re: Query plan changes after pg_dump / pg_restore - Mailing list pgsql-performance
From: Jona
Subject: Re: Query plan changes after pg_dump / pg_restore
Date:
Msg-id: 42A86B08.4070005@oismail.com
In response to: Re: Query plan changes after pg_dump / pg_restore (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-performance
Hi Tom,
Thank you for the input, you're absolutely right.
Have just executed some 10 VACUUM ANALYZE runs on Price_Tbl in both databases, and now both queries use the same plan... the bad one, GREAT!
Who said ignorance is bliss?? ;-)
Have just messed around with ALTER TABLE ... ALTER COLUMN ... SET STATISTICS ... for both tables, to no effect.
Have tried both high values (100 and 200) and a low value (1), running several VACUUM ANALYZEs afterwards each time.
It still insists on the bad plan...
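For reference, the statements tried were of this form (a sketch only: affid and subcattpid are taken from the index conditions in the plans below, but the exact columns and targets used aren't stated in the thread):

    -- Raise the per-column statistics targets (200 was one of the values tried),
    -- then re-collect statistics so the planner sees them.
    ALTER TABLE Price_Tbl ALTER COLUMN affid SET STATISTICS 200;
    ALTER TABLE SCT2SubCatType_Tbl ALTER COLUMN subcattpid SET STATISTICS 200;
    VACUUM ANALYZE Price_Tbl;
    VACUUM ANALYZE SCT2SubCatType_Tbl;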
Furthermore, I've played around with the random_page_cost runtime parameter.
It seems that when I set it to 2.2 the planner switches to using the aff_price_uq index on Price_Tbl; however, it has to go down to 0.7 before it uses the subcat_uq index on SCT2SubCatType_Tbl.
Whether the statistics target is set to 1 or 100 makes no difference to this behaviour.
The overall plan remains the same though, and even when it uses both indexes the total estimated cost is roughly 5.5 times that of the good plan.
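In sketch form, the experiment was (standard PostgreSQL session settings; one re-runs EXPLAIN ANALYZE on the query between each step):

    SET random_page_cost = 2.2;  -- planner switches to aff_price_uq on Price_Tbl
    SET random_page_cost = 0.7;  -- needed before it also uses subcat_uq on SCT2SubCatType_Tbl
    RESET random_page_cost;      -- restore the configured default for the session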
New plan:
Unique  (cost=612.29..612.65 rows=3 width=75) (actual time=255.88..255.89 rows=3 loops=1)
  ->  Hash Join  (cost=158.26..596.22 rows=288 width=75) (actual time=60.91..99.69 rows=2477 loops=1)
        Hash Cond: ("outer".sctid = "inner".sctid)
        ->  Index Scan using aff_price_uq on price_tbl  (cost=0.00..409.24 rows=5025 width=4) (actual time=0.03..17.81 rows=5157 loops=1)
              Index Cond: (affid = 8)
        ->  Hash  (cost=157.37..157.37 rows=355 width=71) (actual time=60.77..60.77 rows=0 loops=1)
              ->  Merge Join  (cost=10.26..157.37 rows=355 width=71) (actual time=14.42..53.79 rows=2493 loops=1)
                    Merge Cond: ("outer".subcattpid = "inner".id)
                    ->  Index Scan using subcat_uq on sct2subcattype_tbl  (cost=0.00..126.28 rows=6536 width=8) (actual time=0.03..23.25 rows=6527 loops=1)
                    ->  Sort  (cost=10.26..10.28 rows=9 width=63) (actual time=2.46..5.66 rows=2507 loops=1)
Total runtime: 257.49 msec
Old "good" plan:
Unique (cost=117.18..117.20 rows=1 width=147) (actual time=224.62..224.63 rows=3 loops=1)
Total runtime: 225.14 msec
It seems that the more it knows about the data, the worse the plan it picks.
Could you provide some input on how to make it realise that the plan it selects is not optimal?
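One common diagnostic for this question (a standard PostgreSQL technique, not something tried in the thread) is to switch off the join method the planner is wrongly preferring and compare the actual runtimes; if the forced plan really is faster, the cost estimates are what need fixing:

    SET enable_hashjoin = off;  -- discourage the hash join used in the bad plan
    -- ... re-run EXPLAIN ANALYZE on the query here and compare actual times ...
    RESET enable_hashjoin;      -- restore the default join choices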
Cheers
Jona
Tom Lane wrote:
Jona <jonanews@oismail.com> writes:
> What worries me is that the plan is different,

Given that the estimated costs are close to the same, this is probably just the result of small differences in the ANALYZE statistics leading to small differences in cost estimates and thus choice of different plans. I'll bet if you re-ANALYZE a few times on the source database you'll see it flipping between plan choices too. This is normal because ANALYZE takes a random sample of rows rather than being exhaustive.

So the interesting question is not "why are the plan choices different", it is "how do I get the cost estimates closer to reality". That's the only way in the long run to ensure the planner makes the right choice. Increasing the statistics targets or fooling with planner cost parameters are the basic tools you have available here.

regards, tom lane
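Tom's two suggestions map to concrete settings. The per-column statistics target was sketched above; its global counterpart is the default_statistics_target parameter (a standard PostgreSQL setting; the value 100 here is illustrative, not from the thread):

    SET default_statistics_target = 100;  -- used by columns without an explicit per-column target
    ANALYZE;                              -- re-sample every table in the current database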