Re: Delaying the planning of unnamed statements until Bind - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Delaying the planning of unnamed statements until Bind
Date
Msg-id 877jv5qa1l.fsf@stark.xeocode.com
Whole thread Raw
In response to Re: Delaying the planning of unnamed statements until Bind  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Delaying the planning of unnamed statements until Bind
List pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > If the planner had the expected value as well as the variance of the cost
> > distribution then it might realize that in this case for instance that the
> > penalty for guessing wrong with an index scan is only going to be a small
> > slowdown factor, perhaps 2-4x slower. Whereas the penalty for guessing wrong
> > with a sequential scan could be a factor in the thousands or more.
> 
> Au contraire --- a full-table index scan can be vastly slower than a
> full-table seqscan.  

Sure, but how much slower? 10x? 20? 100x? 1,000x? The old rule of thumb was
that the break-even point was somewhere around 15% selectivity which means it
would be about 7x slower. But whatever it is, it's going to be some
substantial but still linear slowdown. And the query will be slow but still
usable.

A mistaken sequential scan used when an index scan was needed could be
millions of times slower. The sequential scan time would have no relationship
at all with the index scan time with no upper bound on the slowdown at all.
In an OLTP environment the consequence of guessing wrong in this direction
would make the difference between the query working and it effectively failing
to work at all.

> I think it's wishful thinking to assume that picking an indexscan is the
> right thing when we don't know any better.

If we don't know any better then any solution is going to be wishful thinking.
It's kind of like a bridge game. If you don't know where the card is but you
need it somewhere in order to win, then you have to assume it's there and hope
for the best. If the index is wrong then the query was going to be slow either
way. If the index was right and you chose not to use it you're risking making
it slow unnecessarily and potentially when it was absoluetely required to be
fast.

As further evidence I'll mention Oracle falls back to the Rules-Based
optimizer if it has no statistics. The Rules-Based optimizer -- which was the
ONLY optimizer it used at all for many years -- always uses an index if it
can.


-- 
greg



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Delaying the planning of unnamed statements until Bind
Next
From: Greg Stark
Date:
Subject: Re: Delaying the planning of unnamed statements until Bind