Re: Planner problem - Mailing list pgsql-performance

From Tom Lane
Subject Re: Planner problem
Date
Msg-id 27262.1086360110@sss.pgh.pa.us
Whole thread Raw
In response to Planner problem  (Mikael Kjellström <mikael.kjellstrom@mksoft.nu>)
List pgsql-performance
=?ISO-8859-1?Q?Mikael_Kjellstr=F6m?= <mikael.kjellstrom@mksoft.nu> writes:
> I am having a bit of problem with the plan that the planner produces.

Actually, your problem is with the row-count estimates.  Some of them
are pretty wildly off, which inevitably leads to bad plan choices.
In particular the price row estimate is off by a factor of 200 in all
three plans:

>   ->  Index Scan using priceix2 on price  (cost=0.00..3.41 rows=23 width=20) (actual time=0.042..25.811 rows=4402
loops=111)
>         Index Cond: ('2004-06-01'::date <= validstopdate)
>         Filter: (('2004-06-01'::date >= validstartdate) AND (deletiondate IS NULL) AND (organizationid = 1))

>   ->  Index Scan using priceix2 on price  (cost=0.00..3.41 rows=23 width=20) (actual time=0.053..26.225 rows=4402
loops=1)
>         Index Cond: ('2004-06-01'::date <= validstopdate)
>         Filter: (('2004-06-01'::date >= validstartdate) AND (deletiondate IS NULL) AND (organizationid = 1))

>         ->  Index Scan using priceix2 on price (cost=0.00..3.41 rows=23 width=20) (actual time=0.050..26.475
rows=4402loops=1) 
>               Index Cond: ('2004-06-01'::date <= validstopdate)
>               Filter: (('2004-06-01'::date >= validstartdate) AND (deletiondate IS NULL) AND (organizationid = 1))

and priceavailable is off by a factor of 100:

>   ->  Index Scan using priceavailableix1 on priceavailable pa  (cost=0.00..5.91 rows=1 width=16) (actual
time=0.067..4.182rows=111 loops=1) 
>         Index Cond: (pa.locationconnectionid = "outer".locationconnectionid)
>         Filter: (((direction = 'D'::text) OR (direction = 'B'::text)) AND (deletiondate IS NULL))

>         ->  Index Scan using priceavailableix1 on priceavailable pa  (cost=0.00..141.57 rows=43 width=16) (actual
time=0.048..48.739rows=6525 loops=1) 
>               Filter: (((direction = 'D'::text) OR (direction = 'B'::text)) AND (deletiondate IS NULL))

>       ->  Index Scan using priceavailableix1 on priceavailable pa  (cost=0.00..5.91 rows=1 width=16) (actual
time=0.070..0.965rows=111 loops=1) 
>             Index Cond: (pa.locationconnectionid = "outer".locationconnectionid)
>             Filter: (((direction = 'D'::text) OR (direction = 'B'::text)) AND (deletiondate IS NULL))

Are you sure you've vacuum analyzed these two tables recently?  If so,
what may be needed is to increase ANALYZE's statistics target for
the columns used in the conditions.  (See ALTER TABLE SET STATISTICS)

I suspect that part of the story here has to do with cross-column
correlations, which the present planner will never figure out since it
has no cross-column statistics.  But it's hard to believe that that's
the problem for cases as simple as

>               Filter: (((direction = 'D'::text) OR (direction = 'B'::text)) AND (deletiondate IS NULL))

            regards, tom lane

pgsql-performance by date:

Previous
From: Mikael Kjellström
Date:
Subject: Planner problem
Next
From: CH
Date:
Subject: Re: filesystem option tuning