On 11/12/2010 09:17 AM, Bruce Momjian wrote:
> Greg Stark wrote:
>> On Wed, Nov 10, 2010 at 4:04 AM, Bruce Momjian<bruce@momjian.us> wrote:
>>> We last researched index-only scans, also called covering indexes, in
>>> September of 2008, but have made little progress on it since. ?Many have
>>> been waiting for Heikki to implement this but I talked to him and he
>>> doesn't have time.
>>>
>>> I believe it is time for the community to move forward and I would like
>>> to assemble a team to work on this feature. ?We might not be able to
>>> implement it for Postgres 9.1, but hopefully we can make some progress
>>> on this.
>> Just so everyone is on the same page.... Even once we have index-only
>> scans they won't be anywhere near as useful with Postgres as they are
>> with Oracle and other databases. At least not unless we find a
>> solution for a different problem -- our inability to scan btree
>> indexes sequentially.
>>
>> In Oracle "Fast Full Index" scans are particularly useful for things
>> like unconstrained select count(*). Since the scan can scan through
>> the index sequentially and the index is much smaller than the table it
>> can count all the values fairly quickly even on a very wide table.
>>
>> In Postgres, aside from the visibility issues we have a separate
>> problem. In order to achieve high concurrency we allow splits to occur
>> without locking the index. And the new pages can be found anywhere in
>> the index, even to the left of the existing page. So a sequential scan
>> could miss some data if the page it's on is split and some of the data
>> is moved to be to the left of where our scan is.
>>
>> It's possible this is a non-issue in the future due to large RAM sizes
>> and SSDs. Large amounts of RAM mean perhaps indexes will be in memory
>> much of the time and SSDs might mean that scanning the btree in index
>> order might not really be that bad.
> Agreed. I updated the index-only scans wiki for this:
>
> http://wiki.postgresql.org/wiki/Index-only_scans
>
> test speed improvement for scans of the entire index (this involves
> random I/O)
> * we can't scan the index in physical order like vacuum does
For unconstrained select count(*), why does scanning in index order matter?
cheers
andrew