Re: Query plan changes after pg_dump / pg_restore - Mailing list pgsql-performance

From Jona
Subject Re: Query plan changes after pg_dump / pg_restore
Date
Msg-id 42A86B08.4070005@oismail.com
Whole thread Raw
In response to Re: Query plan changes after pg_dump / pg_restore  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Hi Tom,
Thank you for the input, you're absolutely right.
Have just executed like 10 VACUUM ANALYZE on the Price_Tbl in both databases and now both queries use the same plan.... the bad one, GREAT!
Who said ignorance is bliss?? ;-)

Have just messed around with ALTER TABLE ... ALTER .... SET STATISTICS .... for both tables to no effect.
Have tried setting both high number (100 and 200) and a low number (1) and run several VACUUM ANALYZE afterwards.
It still insists on the bad plan...

Furthermore I've played around with the RANDOM_PAGE_COST runtime parameter.
Seems that when I set it to 2.2 it switch to using the aff_price_uq index on Price_Tbl, however  it needs to be set to 0.7 before it uses the subcat_uq index on SCT2SubCatType_Tbl.
Has no effect wether the statistics is set to 1 or a 100 for this behaviour.
The overall plan remains the same though, and even when it uses both indexes the total cost is roughly 5.5 times higher than the good plan.

New plan:
Unique  (cost=612.29..612.65 rows=3 width=75) (actual time=255.88..255.89 rows=3 loops=1)
   ->  Hash Join  (cost=158.26..596.22 rows=288 width=75) (actual time=60.91..99.69 rows=2477 loops=1)
         Hash Cond: ("outer".sctid = "inner".sctid)
             ->  Index Scan using aff_price_uq on price_tbl  (cost=0.00..409.24 rows=5025 width=4) (actual time=0.03..17.81 rows=5157 loops=1)
                   Index Cond: (affid = 8)
                    ->  Hash  (cost=157.37..157.37 rows=355 width=71) (actual time=60.77..60.77 rows=0 loops=1)
                          ->  Merge Join  (cost=10.26..157.37 rows=355 width=71) (actual time=14.42..53.79 rows=2493 loops=1)
                                 Merge Cond: ("outer".subcattpid = "inner".id)
                                 ->  Index Scan using subcat_uq on sct2subcattype_tbl  (cost=0.00..126.28 rows=6536 width=8) (actual time=0.03..23.25 rows=6527 loops=1)
                                 ->  Sort  (cost=10.26..10.28 rows=9 width=63) (actual time=2.46..5.66 rows=2507 loops=1)

"Total runtime: 257.49 msec"

Old "good" plan:
Unique  (cost=117.18..117.20 rows=1 width=147) (actual time=224.62..224.63 rows=3 loops=1)
    ->  Index Scan using subcat_uq on sct2subcattype_tbl  (cost=0.00..100.47 rows=33 width=8) (actual time=0.01..0.20 rows=46 loops=54)
          Index Cond: ("outer".id = sct2subcattype_tbl.subcattpid)

          ->  Index Scan using aff_price_uq on price_tbl  (cost=0.00..7.11 rows=1 width=4) (actual time=0.01..0.01 rows=1 loops=2493)
                 Index Cond: ((price_tbl.affid = 8) AND ("outer".sctid = price_tbl.sctid))
Total runtime: 225.14 msec

It seems that the more it knows about

Could you provide some input on how to make it realise that the plan it selects is not the optimal?

Cheers
Jona

Tom Lane wrote:
Jona <jonanews@oismail.com> writes: 
What worries me is that the plan is different,   
Given that the estimated costs are close to the same, this is probably
just the result of small differences in the ANALYZE statistics leading
to small differences in cost estimates and thus choice of different
plans.  I'll bet if you re-ANALYZE a few times on the source database
you'll see it flipping between plan choices too.  This is normal because
ANALYZE takes a random sample of rows rather than being exhaustive.

So the interesting question is not "why are the plan choices different"
it is "how do I get the cost estimates closer to reality".  That's the
only way in the long run to ensure the planner makes the right choice.
Increasing the statistics targets or fooling with planner cost
parameters are the basic tools you have available here.
		regards, tom lane 

pgsql-performance by date:

Previous
From: Christopher Kings-Lynne
Date:
Subject: Re: Recommendations for configuring a 200 GB
Next
From: Alex Stapleton
Date:
Subject: random_page_cost = 1?