Re: Bitmapscan changes - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: Bitmapscan changes
Date
Msg-id 45F591B7.2080600@enterprisedb.com
Whole thread Raw
In response to Re: Bitmapscan changes  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Bitmapscan changes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
Tom Lane wrote:
> I'm really dubious that this is an intelligent way to go.  In the first
> place, how will you keep the index sorted if you can't determine the
> values of all the keys?  It certainly seems that this would break the
> ability to have a simple indexscan return sorted data, even if the index
> itself doesn't get corrupted.

That's indeed a very fundamental thing with the current design. The
index doesn't retain the complete order within heap pages. That
information is lost, again in favor of a smaller index size. It incurs a
significant CPU overhead, but on an I/O bound system that's a tradeoff
you want to make.

At the moment, I'm storing the offsets within the heap in a bitmap
attached to the index tuple. btgettuple fetches all the heap tuples
represented by the grouped index tuple, checks their visibility, sorts
them into index order, and returns them to the caller one at a time.
Thats ugly, API-wise, because it makes the indexam to actually go look
at the heap, which it shouldn't have to deal with.

Another approach I've been thinking of is to store a list of offsets, in
index order. That would avoid the problem of returning sorted data, and
reduce the CPU overhead incurred by sorting and scanning, at the cost of
much larger (but still much smaller than what we have now) index.

 > In the second place, this seems to
> forever kill the idea of indexscans that don't visit the heap --- not
> that we have any near-term prospect of doing that, but I know a lot of
> people remain interested in the idea.

I'm certainly interested in that. It's not really needed for clustered
indexes, though. A well-clustered index is roughly one level shallower,
and the heap effectively is the leaf-level, therefore the amount of I/O
you need to fetch the index tuple + heap tuple, is roughly the same that
as fetching just the index tuple from a normal b-tree index.

On non-clustered indexes, index-only scans would of course still be useful.

> The reason this catches me by surprise is that you've said several times
> that you intended GIT to be something that could just be enabled
> universally.  If it's lossy then there's a much larger argument that not
> everyone would want it.

Yeah, we can't just always enable it by default. While a clustered index
would degrade to a normal b-tree when the heap isn't clustered, you
would still not want to always enable the index clustering because of
the extra CPU overhead. That has become clear in the CPU bound tests
I've run.

I think we could still come up with some safe condiitions when we could
enable it by default, though. In particular, I've been thinking that if
you run CLUSTER on a table, you'd definitely want to use a clustered
index as well.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

pgsql-patches by date:

Previous
From: Gregory Stark
Date:
Subject: Packed varlena patch update
Next
From: Tom Lane
Date:
Subject: Re: Bitmapscan changes