Re: Optimizer Not using the Right plan - Mailing list pgsql-performance

From galy lee
Subject Re: Optimizer Not using the Right plan
Date
Msg-id 58acf4b70712050449h5be72805la82fcc5905f0a133@mail.gmail.com
Whole thread Raw
In response to Optimizer Not using the Right plan  (Pallav Kalva <pkalva@livedatagroup.com>)
List pgsql-performance
 
I think you may increase the row number that you want to limit, like LIMIT 50.
 
LIMIT can change the cost of a plan dramatically. Looking in your SQL:
 
   where this_.fkaddressid= 6664161
   order by this_.addressvaluationid desc limit 1;
 
Planner may use either index1(this_.fkaddressid) or index2(this_.addressvaluationid) to scan the table. Although it is obvious that using index2 is very expensive, but because you are trying to limit one row from 1304 row, so the cost of using index2 changes to  

   883328/1304=677.69

The cost of using index1 should be lager than 1304, so planner chooses index2.

Planner tends to choose a plan which has small startup cost when you are trying to LIMIT a small portion of data over a large data set. It seems that the following issue also comes from the same root.

  http://archives.postgresql.org/pgsql-performance/2007-11/msg00395.php

Best Regards
Galy Lee
 
Tom Lane wrote:
Pallav Kalva <pkalva ( at ) livedatagroup ( dot ) com> writes:
   why does it have different plans for different values
Because the values occur different numbers of times (or so it thinks
anyway).  If the rowcount estimates are far from reality, perhaps
increasing the statistics target would help.  However, since you
only showed EXPLAIN and not EXPLAIN ANALYZE output, no one can
really tell whether the optimizer did anything wrong here.
		regards, tom lane
Hi Tom,

Thanks! for your reply, here is an another example of the same query with different addressid now. This time I got the explain analyze on the query, this query also uses the Index Scan Backwards, it says it took 28 seconds but I can say that after looking at the postgres logs it took more than 2 min when the query first ran. I ran this one again to get the explain analyze.
 The statistics set to "default_statistics_target = 100"

I am sure if it uses index on addressid it would be quicker but for some reason it using index backward scan on addressvaluationid and that is taking too long.
 Not only this one there are some other queries which use index scan backwards scan and it takes too long. Index scan backwards most of the time is not doing good for me is there any way to avoid it ? 
explain analyze
select this_.addressvaluationid as addressv1_150_1_, this_.sourcereference as sourcere2_150_1_, this_.createdate as createdate150_1_, this_.valuationdate as valuatio4_150_1_, this_.valuationamount as valuatio5_150_1_, this_.valuationhigh as valuatio6_150_1_, this_.valuationlow as valuatio7_150_1_, this_.valuationconfidence as valuatio8_150_1_, this_.valuationchange as valuatio9_150_1_, this_.historycharturl as history10_150_1_, this_.regionhistorycharturl as regionh11_150_1_, this_.fkaddressid as fkaddre12_150_1_, this_.fkaddressvaluationsourceid as fkaddre13_150_1_, this_.fkvaluationchangeperiodid as fkvalua14_150_1_, valuationc2_.valuationchangeperiodid as valuatio1_197_0_,
         valuationc2_.name as name197_0_
from listing.addressvaluation this_ left outer join listing.valuationchangeperiod valuationc2_ on this_.fkvaluationchangeperiodid=valuationc2_.valuationchangeperiodid
where this_.fkaddressid= 6664161
order by this_.addressvaluationid desc limit 1;
QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=0.00..677.69 rows=1 width=494) (actual time=28454.708..28454.712 rows=1 loops=1) -> Nested Loop Left Join (cost=0.00..883705.44 rows=1304 width=494) (actual time=28454.700..28454.700 rows=1 loops=1) -> Index Scan Backward using pk_addressvaluation_addressvaluationid on addressvaluation this_ (cost=0.00..883328.22 rows=1304 width=482) (actual time=28441.236..28441.236 rows=1 loops=1)
              Filter: (fkaddressid = 6664161)
-> Index Scan using pk_valuationchangeperiod_valuationchangeperiodid on valuationchangeperiod valuationc2_ (cost=0.00..0.28 rows=1 width=12) (actual time=13.447..13.447 rows=1 loops=1) Index Cond: (this_.fkvaluationchangeperiodid = valuationc2_.valuationchangeperiodid)
Total runtime: 28454.789 ms
(7 rows)


pgsql-performance by date:

Previous
From: Julian Mehnle
Date:
Subject: Re: Bad query plans for queries on partitioned table
Next
From: Bill Moran
Date:
Subject: Re: Evaluation of PG performance vs MSDE/MSSQL 2000 (not 2005)