On Wed, 9 Mar 2005, Michael McFarland wrote:
> I continue to be stumped by this. You are right that I should have
> listed the estimates provided by explain... basically for the select where
> bar = 41, it's estimating there will be 40,000 rows instead of 7, out of
> what's actually 5 million records in the table.
>
> So far I've tried increasing statistics for the bar column from the
> default 10 to 100 (vacuum analyzing after) and the explain plan hasn't
> changed. I also notice that afterward, the pg_stats record for the bar
Did the estimates change at all?
> column still only lists the top 5 values of bar (out of 68 unique values
> in the table). Are there any other settings I could try to improve the
> detail of the statistics?
Well, I'd first try moving up to a statistics target of 1000 in
order to sample a greater number of rows. I wonder whether the other
values are infrequent enough that the sample simply isn't hitting any
rows with them. I'm not sure that it'll help that much, though;
hopefully someone else will have an idea.
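For reference, the per-column target can be raised and re-analyzed like
this; a sketch, assuming the table and column are named foo and bar as in
the thread:

```sql
-- Raise the per-column statistics target (names assumed from the thread).
ALTER TABLE foo ALTER COLUMN bar SET STATISTICS 1000;
ANALYZE foo;

-- Inspect what the planner now knows about bar. If the larger sample
-- helped, n_distinct should be near 68 and most_common_vals should list
-- more than the 5 values seen with the default target.
SELECT n_distinct, most_common_vals, most_common_freqs
  FROM pg_stats
 WHERE tablename = 'foo' AND attname = 'bar';
```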
> By the way, I think I do have a workaround for this particular query:
> select * from (select * from foo where barId = 412 order by id
> desc) as tempview limit 25;
> This query uses the bar index and completes instantly. However, I feel
> like I should find the heart of the problem, since bad statistics could
> end up affecting other plans, right?
Yeah, it's best to get the estimates somewhat reasonable before relying
on workarounds.
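One way to track whether the statistics are improving is to compare the
planner's estimate against the real row count with EXPLAIN ANALYZE; a
sketch, using the predicate from the first message:

```sql
-- The output shows "rows=" (planner estimate) alongside
-- "actual ... rows="; a large gap (40000 vs 7 here) means
-- the statistics still misrepresent bar = 41.
EXPLAIN ANALYZE SELECT * FROM foo WHERE bar = 41;
```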