Re: Index Scans become Seq Scans after VACUUM ANALYSE - Mailing list pgsql-hackers

From mlw
Subject Re: Index Scans become Seq Scans after VACUUM ANALYSE
Date
Msg-id 3CBDA441.5F490463@mohawksoft.com
Whole thread Raw
In response to Re: Index Scans become Seq Scans after VACUUM ANALYSE  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Index Scans become Seq Scans after VACUUM ANALYSE  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Index Scans become Seq Scans after VACUUM ANALYSE  (Andrew Sullivan <andrew@libertyrms.info>)
List pgsql-hackers
Tom Lane wrote:
> 
> mlw <markw@mohawksoft.com> writes:
> > On borderline conditions, wrongly using an index does not result in as bad
> > performance as wrongly not using an index,
> 
> You're arguing from a completely false premise.  It might be true on the
> particular cases you've looked at, but in general an indexscan-based
> plan can be many times worse than a seqscan plan.

OK, I'll grant you that, but I am talking about the space between when it is
clear that an index is useful and when it is clear that it is not. For some
reason you seem to think I am saying "always use an index," when, in fact, I am
saying more preference should be given to using an index than it currently has.

> As for "borderline conditions", how is the planner supposed to know what
> is borderline?

It need not know about borderline conditions.

> 
> I cannot see any rational justification for putting a thumb on the
> scales on the side of indexscan (or any other specific plan type)
> as you've proposed.  Thomas correctly points out that you'll just move
> the planner failures from one area to another.

I don't think this is true, and you yourself had said you are not too worried
about a 10 vs 8 second difference. I have seen many instances of when
PostgreSQL refuses to use an index because the data distribution is uneven.
Making it more difficult for the planer to ignore an index would solve
practically all the problems I have seen, and I bet the range of instances
where it would incorrectly use an index would not impact performance as badly
as those instances where it doesn't.

> 
> If we can identify a reason why the planner tends to overestimate the
> costs of indexscan vs seqscan, by all means let's fix that.  But let's
> not derive cost estimates that are the best we know how to make and
> then ignore them.

I don't think you can solve this with statistics. It is a far more complex
problem than that. There are too many variables, there is no way a standardized
summation will accurately characterize all possible tables. There must be a way
to add heuristics to the cost based analyzer.


pgsql-hackers by date:

Previous
From: Hannu Krosing
Date:
Subject: Re: Index Scans become Seq Scans after VACUUM ANALYSE
Next
From: Tom Lane
Date:
Subject: Re: Index Scans become Seq Scans after VACUUM ANALYSE