Re: strange estimate for number of rows - Mailing list pgsql-performance

From Tom Lane
Subject Re: strange estimate for number of rows
Date
Msg-id 5882.1068749786@sss.pgh.pa.us
Whole thread Raw
In response to strange estimate for number of rows  (Andrew Sullivan <andrew@libertyrms.info>)
Responses Re: strange estimate for number of rows
List pgsql-performance
Andrew Sullivan <andrew@libertyrms.info> writes:
> The statistics on transaction_date and product_id are set to 1000.
> Everything is all analysed nicely.  But I'm getting a poor plan,
> because of an estimate that the number of rows to be returned is
> about double how many actually are:

> explain analyse select * from transactions_posted where
> transaction_date >= '2003-9-1' and transaction_date < '2003-10-1' and
> product_id = 2;

Are the estimates accurate for queries on the two columns individually,
ie
... where transaction_date >= '2003-9-1' and transaction_date < '2003-10-1'
... where product_id = 2

If so, the problem is that there's a correlation between
transaction_date and product_id, which the system cannot model because
it has no multi-column statistics.

However, given that the estimate is only off by about a factor of 2,
you'd still be getting the wrong plan even if the estimate were perfect,
because the estimated costs differ by nearly a factor of 3.

Given the actual runtimes, I'm thinking maybe you want to reduce
random_page_cost.  What are you using for that now?

            regards, tom lane

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Union+group by planner estimates way off?
Next
From: "Arthur Ward"
Date:
Subject: Re: Union+group by planner estimates way off?