Re: We need index-only scans - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: We need index-only scans |
Date | |
Msg-id | AANLkTinvg6abZKbxz1dVfewyuv+No4VrD3h6Mt72O+_D@mail.gmail.com Whole thread Raw |
In response to | Re: We need index-only scans (Greg Stark <gsstark@mit.edu>) |
List | pgsql-hackers |
On Fri, Nov 12, 2010 at 8:33 AM, Greg Stark <gsstark@mit.edu> wrote: > On Wed, Nov 10, 2010 at 4:04 AM, Bruce Momjian <bruce@momjian.us> wrote: >> We last researched index-only scans, also called covering indexes, in >> September of 2008, but have made little progress on it since. Many have >> been waiting for Heikki to implement this but I talked to him and he >> doesn't have time. >> >> I believe it is time for the community to move forward and I would like >> to assemble a team to work on this feature. We might not be able to >> implement it for Postgres 9.1, but hopefully we can make some progress >> on this. > > Just so everyone is on the same page.... Even once we have index-only > scans they won't be anywhere near as useful with Postgres as they are > with Oracle and other databases. At least not unless we find a > solution for a different problem -- our inability to scan btree > indexes sequentially. I have very little doubt that our first attempts to chip away at this problem are going to be a bit rough around the edges. Here's another problem to mull over: a large insert-only table will never be vacuumed; therefore, the visibility map bits will never become set; therefore, the index-only scan optimization won't apply (and the user may not realize it or understand why it's happening). But the journey of a thousand miles begins with the first step. I think we need to focus our first effort on making the visibility map crash-safe. Then we can implement the basic feature, which I would characterize this way: if performing an index-scan, and all the attributes we need are available from the index tuple, then skip the heap fetch when the visibility map bit is set. This requires minimal planner support - just an adjustment of the costing model for index scans; although to do it right I think we're going to need statistics on what fraction of pages in the heap have the visibility map bit set.Then, we can work on refinements, of which I thinkthere will be many, including the one you listed. Another is to bubble up heap fetches in the plan tree - so for example if you eventually need to return some attributes that aren't in the index tuple, you could consider performing some other join based on the index columns and then do the heap fetches for the remaining attributes (and visibility checks) later. I am not confident that we can get even a basic implementation of index-only scans into 9.1 at this point, and we're certainly not going to get all the kinks worked out. So I agree with you that we shouldn't set expectations above the level at which they can be met, but, I'd be happy if we can make a start on it. ...Robert
pgsql-hackers by date: