Re: proposal: be smarter about i/o patterns in index scan - Mailing list pgsql-hackers

From Tom Lane
Subject Re: proposal: be smarter about i/o patterns in index scan
Msg-id 19980.1084996517@sss.pgh.pa.us
In response to Re: proposal: be smarter about i/o patterns in index scan  ("Glen Parker" <glenebob@nwlink.com>)
Responses Re: proposal: be smarter about i/o patterns in index scan  (Greg Stark <gsstark@mit.edu>)
Re: proposal: be smarter about i/o patterns in index scan  ("Jeffrey W. Baker" <jwb@gghcwest.com>)
List pgsql-hackers
"Glen Parker" <glenebob@nwlink.com> writes:
> What am I missing?  Why is a performance bottleneck of this magnitude not
> on the same list of priorities as PITR, replication, and Win32?

It's higher on my personal to-do list than most of those ;-).  But those
things are getting done because there are other developers with other
priorities.  I suspect also that the set of people competent to make
this change is much smaller than the set of people able to work on the
other points.  In my mind, most of the issue is in the planner (figuring
out what to do with an unsorted-indexscan option) and relatively few
people have wanted to touch the planner.

> Here's one answer: If you had to sort every result set, even when an index
> could have been used, overall performance would still improve by a very
> large margin.  I'd bet money on it.

For a counterexample I refer you to our standard solution for
MAX-using-an-index:

    SELECT ... FROM table ORDER BY foo DESC LIMIT 1;

which would become truly spectacularly bad without an ordered index
scan.  A more general point is that for any indexscan that returns only
a small number of index entries (e.g., any unique-key search), worrying
about physical-order access will be wasted effort.  The best you could
hope for is not to be significantly worse than the existing code in such
cases, and I'm unconvinced you could achieve even that.
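
To make the counterexample concrete, here is a minimal Python sketch of why ORDER BY foo DESC LIMIT 1 depends on an ordered scan. It is a toy model, not PostgreSQL code: the index is just a list of (key, tid) pairs, and both function names are hypothetical. It only counts how many index entries each strategy has to visit.

```python
def max_via_ordered_scan(index_entries_desc):
    """Walk an index already sorted by key, descending, and stop at the first entry.

    Models ORDER BY foo DESC LIMIT 1 served by an ordered index scan.
    Returns (key, entries_visited).
    """
    visited = 0
    for key, _tid in index_entries_desc:
        visited += 1
        return key, visited  # LIMIT 1: the first entry is the answer
    return None, visited


def max_via_unordered_scan(index_entries):
    """Find the maximum key when entries arrive in no useful order.

    Models the same query if only an unordered scan were available:
    every entry must be visited before the maximum is known.
    Returns (key, entries_visited).
    """
    visited = 0
    best = None
    for key, _tid in index_entries:
        visited += 1
        if best is None or key > best:
            best = key
    return best, visited


if __name__ == "__main__":
    # Toy index: keys 0..999, with the tid equal to the key for simplicity.
    entries = [(k, k) for k in range(1000)]
    print(max_via_ordered_scan(sorted(entries, reverse=True)))
    print(max_via_unordered_scan(entries))
```

The ordered scan visits one entry regardless of table size, while the unordered variant visits all of them, which is the gap the counterexample points at.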

I can assure you that any patch that completely removes the existing
behavior will be rejected, because there are plenty of cases where it's
the right thing.

The main thing that unordered indexscan could do for us is extend the
usefulness of indexscan plans into relatively-poor-selectivity cases
where we presently tend to drop back to seqscans.  There would still be
a selectivity threshold above which you might as well use seqscan, but
it ought to be higher than the fraction-of-a-percent that we currently
see for indexscans.  What is unknown, and will be unknown until someone
tries it, is just what range of selectivity this technique might win
for.  I think such a range exists; I am not real certain that it is wide
enough to justify a lot of effort in making the idea a reality.
        regards, tom lane
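
The selectivity question in the last paragraph can be explored with a rough simulation. The sketch below is only an illustration under stated assumptions, not a model of PostgreSQL's executor or cost estimator: it assumes 100 rows per page (ROWS_PER_PAGE is made up), matching rows scattered uniformly at random, a one-page cache, and equal cost for every page read (so it ignores the sequential-versus-random I/O difference). The names page_reads and simulate are hypothetical.

```python
import random

ROWS_PER_PAGE = 100  # assumed value, chosen only for illustration


def page_reads(tids):
    """Count page reads with a one-page cache.

    A read is charged whenever the block number differs from the
    previously fetched block.
    """
    reads = 0
    last_block = None
    for block, _offset in tids:
        if block != last_block:
            reads += 1
            last_block = block
    return reads


def simulate(n_pages, selectivity, seed=0):
    """Compare page reads for three strategies at a given selectivity.

    index_order:    fetch heap rows in the order the index returns them
                    (modeled here as random physical order).
    physical_order: collect the matching tids first, sort them by
                    (block, offset), then fetch.
    seqscan:        read every page once.
    """
    rng = random.Random(seed)
    n_rows = n_pages * ROWS_PER_PAGE
    n_match = int(n_rows * selectivity)
    row_ids = rng.sample(range(n_rows), n_match)
    tids = [divmod(r, ROWS_PER_PAGE) for r in row_ids]  # (block, offset)
    return {
        "index_order": page_reads(tids),
        "physical_order": page_reads(sorted(tids)),
        "seqscan": n_pages,
    }


if __name__ == "__main__":
    for sel in (0.001, 0.01, 0.05, 0.2):
        print(sel, simulate(1000, sel))
```

In this toy model the physical-order fetch never reads more pages than the seqscan, while the index-order fetch overtakes the seqscan once the number of matching rows exceeds the number of pages. Where the real crossover lies depends on factors the model leaves out, which is the open question raised above.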

