Thread: Question about indexes

Question about indexes

From
Greg Stark
Date:
How feasible would it be to have a btree index on ctid? I'm thinking it ought
to work simply enough for the normal case of insert/delet/update, but I'm not
completely certain how vacuum, vacuum full, and cluster would interact.

You may think this would be utterly useless, but I have a cunning plan.

-- 
greg



Re: Question about indexes

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> How feasible would it be to have a btree index on ctid?

Why would you want one?  Direct access by ctid beats out an index lookup
every time.  In any case, vacuum and friends would break such an index
entirely.
        regards, tom lane


Re: Question about indexes

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
>
> > How feasible would it be to have a btree index on ctid?
> 
> Why would you want one?  Direct access by ctid beats out an index lookup
> every time.  

Of course. But as I mentioned, I have a cunning plan.

If you have two indexes (a,ctid) and (b,ctid) and do a query where a=1 and b=2
then it would be particularly easy to combine the two efficiently. 

If specially marked btree indexes -- or even all btree indexes -- implicitly
had ctid as a final sort order after all the index column, then it would
esentially obviate the need for bitmap indexes. They wouldn't have the space
advantage, but they would be possible to combine using arbitrary boolean
expressions without looking at the actual tuples.

This is essentially what is in the TODO about using bitmaps, but without
having to do any extra sorts.

This would only really be an advantage for particularly wide tables where the
combination of boolean clauses narrows the result set down a lot more than any
one clause.

> In any case, vacuum and friends would break such an index entirely.

That was what I was afraid of.

-- 
greg



Re: Question about indexes

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> If you have two indexes (a,ctid) and (b,ctid) and do a query where a=1 and b=2
> then it would be particularly easy to combine the two efficiently. 

> If specially marked btree indexes -- or even all btree indexes -- implicitly
> had ctid as a final sort order after all the index column, then it would
> esentially obviate the need for bitmap indexes.

I don't think so.  You are thinking only of exact-equality queries ---
as soon as the WHERE clause describes a range of index entries, the
readout wouldn't be sorted by ctid anyway.

Combining indexes via a bitmap intermediate step (which is not really
the same thing as bitmap indexes, IIUC) seems like a more robust
approach than relying on the index entries to be in ctid order.

But if we did want to sort indexes that way, we could do it today,
I think.  The ctid is already stored in index entries (it is the
"payload" remember...) and we could use it as a tiebreaker when
determining insertion position.  This doesn't have the problems that
putting ctid into the user columns would do, because the system knows
about that ctid as being special; the difficulty with ctid in the user
columns is the code not knowing that it'd need to change on a tuple move.
        regards, tom lane


Re: Question about indexes

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> I don't think so.  You are thinking only of exact-equality queries ---
> as soon as the WHERE clause describes a range of index entries, the
> readout wouldn't be sorted by ctid anyway.

But then even bitmap indexes would fail in that way too, or at least have a
lot of extra cost that would have to be taken into account based on the number
of values in the range.

> Combining indexes via a bitmap intermediate step (which is not really
> the same thing as bitmap indexes, IIUC) seems like a more robust
> approach than relying on the index entries to be in ctid order.

I would see that as the next step, But it seems to me it would be only a small
set of queries where it would really help enough to outweigh the extra work of
the sort. Whereas if the ctid is already pre-sorted then the extra cost is
fairly low. Sort of like the difference in cost between a merge join where
both sides have to be sorted and a merge join where both sides are pre-sorted.

> But if we did want to sort indexes that way, we could do it today,
> I think.  The ctid is already stored in index entries (it is the
> "payload" remember...) and we could use it as a tiebreaker when
> determining insertion position. This doesn't have the problems that
> putting ctid into the user columns would do, because the system knows
> about that ctid as being special; the difficulty with ctid in the user
> columns is the code not knowing that it'd need to change on a tuple move.

That's exactly what I was thinking. I just don't know how badly it would
complicate the vacuum{,full}/cluster code and whether those are the only cases
to worry about.


Note that the space saving of bitmap indexes is still a substantial factor.
Using btree indexes the i/o costs of doing multiple index scans plus a table
scan of the relevant pages would still be quite substantial. So this doesn't
completely obviate the need for bitmap indexes, but I think it would remove a
lot of the pressure from people who just need them to handle a few select
queries.

-- 
greg



Re: Question about indexes

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
>> Combining indexes via a bitmap intermediate step (which is not really
>> the same thing as bitmap indexes, IIUC) seems like a more robust
>> approach than relying on the index entries to be in ctid order.

> I would see that as the next step, But it seems to me it would be only a small
> set of queries where it would really help enough to outweigh the extra work of
> the sort.

What sort?  The whole point of a bitmap is that it makes it easy to
visit the tuples in heap order.  You scan the index, you set the
appropriate bits in the bitmap, and then you scan the bitmap and go to
the heap tuples that have their bits set.  If you are using multiple
indexes you can AND or OR their results at the bitmap phase before you
go to the heap.

An implementation of this kind would not produce tuples in index order,
so if you have an ORDER BY to satisfy then you end up doing an explicit
sort after you have the tuples.  It would be up to the planner to
consider this cost versus the advantages of being able to use multiple
indexes; we'd certainly want to keep the existing scan mechanism as an
available alternative.  But if the query is suited to multiple indexes
I suspect it'd be a win pretty often.

> Note that the space saving of bitmap indexes is still a substantial factor.

I think you are still confusing what I'm talking about with a bitmap
index, ie, a persistent structure on-disk.  It's not that at all, but
a transient structure built in-memory during an index scan.

I'm a little dubious that true bitmap indexes would be worth building
for Postgres.  Seems like partial indexes cover the same sorts of
applications and are more flexible.
        regards, tom lane


Re: Question about indexes

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> In any case, this discussion is predicated on the assumption that the
> operations involving the bitmap are a significant fraction of the total
> time, which I think is quite uncertain.  Until we build it and profile
> it, we won't know that.

The other thought I had was that it would be difficult to tell when to follow
this path. Since the main case where it wins is when the individual indexes
aren't very selective but the combination is very selective, and we don't have
inter-column correlation statistics ...

-- 
greg



Re: Question about indexes

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> >
> > I would see that as the next step, But it seems to me it would be only a small
> > set of queries where it would really help enough to outweigh the extra work of
> > the sort.
> 
> What sort?  

To build the in-memory bitmap you effectively have to do a sort. If the tuples
come out of the index in heap order then you can combine them without having
to go through that step.

> I'm a little dubious that true bitmap indexes would be worth building
> for Postgres.  Seems like partial indexes cover the same sorts of
> applications and are more flexible.

I'm clear on the distinction. I think bitmap indexes still have a place, but
if regular btree indexes could be combined efficiently then that would be an
even narrower niche.

Partial indexes are very handy, and they're useful in corner cases where
bitmap indexes are useful, such as flags for special types of records.

But I think bitmap indexes are specifically wanted by certain types of data
warehousing applications where you have an index on virtually every column and
then want to do arbitrary boolean combinations of all of them. btree indexes
would generate more i/o scanning all the indexes than just doing a sequential
scan would. Whereas bitmap indexes are much denser on disk.

However my experience leans more towards the OLTP side and I very rarely saw
applications like this.



-- 
greg



Re: Question about indexes

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> What sort?  

> To build the in-memory bitmap you effectively have to do a sort.

Hm, you're thinking that the operation of inserting a bit into a bitmap
has to be at least O(log N).  Seems to me that that depends on the data
structure you use.  In principle it could be O(1), if you use a true
bitmap (linear array) -- just index and set the bit.  You might be right
that practical data structures would be O(log N), but I'm not totally
convinced.

> If the tuples come out of the index in heap order then you can combine
> them without having to go through that step.

But considering the restrictions implied by that assumption --- no range
scans, no non-btree indexes --- I doubt we will take the trouble to
implement that variant.  We'll want to do the generalized bitmap code
anyway.

In any case, this discussion is predicated on the assumption that the
operations involving the bitmap are a significant fraction of the total
time, which I think is quite uncertain.  Until we build it and profile
it, we won't know that.
        regards, tom lane


Re: Question about indexes

From
"Simon Riggs"
Date:
Some potentially helpful background comments on the discussion so far...

>Tom Lane writes
>>Greg Stark writes
>> Note that the space saving of bitmap indexes is still a substantial 
>> factor.
>I think you are still confusing what I'm talking about with a bitmap
index, >ie, a persistent structure on-disk.  It's not that at all, but a
transient >structure built in-memory during an index scan.

Oracle allows the creation of bitmap indices as persistent data
structures. 

The "space saving" of bitmap indices is only a saving when compared with
btree indices. If you don't have them at all because they are built
dynamically when required, as Tom is suggesting, then you "save" even
more space. 

Maintaining the bitmap index is a costly operation. You tend to want to
build them on "characteristic" columns, of which there tends to be more
of in a database than "partial/full identity" columns on which you build
btrees (forgive the vagueness of that comment), so you end up with loads
of the damn things, so the space soon adds up. It can be hard to judge
which ones are the important ones, especially when each is used by a
different user/group. Building them dynamically is a good way of solving
the question "which ones are needed?". Ever seen 58 indices on a table?
Don't go there.

My vote would be implement the dynamic building capability, then return
to implement a persisted structure later if that seems like it would be
a further improvement. [The option would be nice]

If we do it dynamically, as Tom suggests, then we don't have to code the
index maintenance logic at all and the functionality will be with us all
the sooner. Go Tom!

>Tom Lane writes
> In any case, this discussion is predicated on the assumption that the
> operations involving the bitmap are a significant fraction of the
total
> time, which I think is quite uncertain.  Until we build it and profile
> it, we won't know that.

Dynamically building the bitmaps has been the strategy in use by
Teradata for nearly a decade on many large datawarehouses. I can
personally vouch for the effectiveness of this approach - I was
surprised when Oracle went for the persistent option. Certainly in that
case building the bitmaps adds much less time than is saved overall by
the better total query strategy.

>Greg Stark writes
> > To build the in-memory bitmap you effectively have to do a sort.

Not sure on this latter point: I think I agree with Greg on that point,
but want to believe Tom because requiring a sort will definitely add
time. 

To shed some light in this area, some other major implementations are:

In Teradata, tables are stored based upon a primary index, which is
effectively an index-organised table. The index pointers are stored in
sorted order lock step with the blocks of the associated table - No sort
required. (The ordering is based upon a hashed index, but that doesn't
change the technique).

Oracle's tables/indexes use heaps/btrees also, though they do provide an
index-organised table feature similar to Teradata. Maybe the lack of
heap/btree consistent ordering in Oracle and their subsequent design
choice of persistent bitmap indices is an indication for PostgreSQL too?

In Oracle, bitmap indices are an important precursor to the star join
technique. AFAICS it is still possible to have a star join plan without
having persistent bitmap indices. IMHO, the longer term goal of a good
star join plan is an important one - that may influence the design
selection for this discussion.

Hope some of that helps,

Best regards, Simon Riggs



Re: Question about indexes

From
Date:
A small comment on Oracle's implementation of persistent bitmap indexes:

Oracle's bitmap index is concurently locked by DML, i.e. it suites for OLAP
(basically read only data warehouses) but in no way for OLTP.

IMHO,
Laimis

> Maybe the lack of heap/btree consistent ordering in Oracle
> and their subsequent design choice of persistent bitmap
> indices is an indication for PostgreSQL too?



Re: Question about indexes

From
Greg Stark
Date:
<lnd@hnit.is> writes:

> A small comment on Oracle's implementation of persistent bitmap indexes:
> 
> Oracle's bitmap index is concurently locked by DML, i.e. it suites for OLAP
> (basically read only data warehouses) but in no way for OLTP. 

I knew this. I think they figured that was ok because bitmap indexes were
mainly intended to solve data warehouse problems anyways.

Thinking out loud here, I wonder whether this would be less of a problem for
postgres. Since tuples are never updated in place there would never be a need
to lock the entire bitmap until a transaction completes.

There would never be as much concurrency as btrees, assuming there was any
kind of compression on the bitmap, but I don't see any reason why a long-term
lock would have to be held for updates.

Even regular vacuum might not have to lock anything for long, just long enough
to clear the bits. and vacuum full/cluster already take table locks anyways.

I think the problem Oracle ran into was that storing rollback ids in the
bitmap is untenable. The whole point of persistent bitmap indexes is to store
a very dense representation that represents thousands of records per page.
Allocating space to store thousands of pending transaction ids and having
thousands of old versions of the page in the rollback segment would defeat the
purpose.

-- 
greg



Re: Question about indexes

From
Bruce Momjian
Date:
Greg Stark wrote:
> 
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> 
> > In any case, this discussion is predicated on the assumption that the
> > operations involving the bitmap are a significant fraction of the total
> > time, which I think is quite uncertain.  Until we build it and profile
> > it, we won't know that.
> 
> The other thought I had was that it would be difficult to tell when to follow
> this path. Since the main case where it wins is when the individual indexes
> aren't very selective but the combination is very selective, and we don't have
> inter-column correlation statistics ...

I like the idea of building in-memory bitmapped indexes.

In your example, if you are restricting on A and B, and have no A,B
index but an A index and B index, why wouldn't you always create an
in-memory bitmapped index from indexes A and B, unless index A hits only
a few rows.  In fact, from the optimizer statistics, you can guess on
how many bits you will hit from index A and index B, so we only have to
decide if it is better to take the more restrictive index and do heap
lookups for those, or scan the second index and then hit the heap.  The
only thing A,B combined statistics would tell you is how many heap
matches you will find.  The time to scan A and B indexes and create the
bitmap is already guessable from the single column statistics.

Also, what does an in-memory bitmapped index look like?  Is it:
value:  bitmap...value:  bitmap...

with the values organized in a btree fashion?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Question about indexes

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Also, what does an in-memory bitmapped index look like?

One idea that might work: a binary search tree in which each node
represents a single page of the table, and contains a bit array with
one bit for each possible item number on the page.  You could not need
more than BLCKSZ/(sizeof(HeapTupleHeaderData)+sizeof(ItemIdData)) bits
in a node, or about 36 bytes at default BLCKSZ --- for most tables you
could probably prove it would be a great deal less.  You only allocate
nodes for pages that have at least one interesting row.

I think this would represent a reasonable compromise between size and
insertion speed.  It would only get large if the indexscan output
demanded visiting many different pages --- but at some point you could
abandon index usage and do a sequential scan, so I think that property
is okay.

A variant is to make the per-page bit arrays be entries in a hash table
with page number as hash key.  This would reduce insertion to a nearly
constant-time operation, but the drawback is that you'd need an explicit
sort at the end to put the per-page entries into page number order
before you scan 'em.  You might come out ahead anyway, not sure.

Or we could try a true linear bitmap (indexed by page number times
max-items-per-page plus item number) that's compressed in some fashion,
probably just by eliminating large runs of zeroes.  The difficulty here
is that inserting a new one-bit could be pretty expensive, and we need
it to be cheap.

Perhaps someone can come up with other better ideas ...
        regards, tom lane


Re: Question about indexes

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Also, what does an in-memory bitmapped index look like?
> 
> One idea that might work: a binary search tree in which each node
> represents a single page of the table, and contains a bit array with
> one bit for each possible item number on the page.  You could not need
> more than BLCKSZ/(sizeof(HeapTupleHeaderData)+sizeof(ItemIdData)) bits
> in a node, or about 36 bytes at default BLCKSZ --- for most tables you
> could probably prove it would be a great deal less.  You only allocate
> nodes for pages that have at least one interesting row.

Actually, I think I made a mistake.  I was wondering what on-disk
bitmapped indexes look like.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Question about indexes

From
Bruce Momjian
Date:
Greg Stark wrote:
> 
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> 
> > In any case, this discussion is predicated on the assumption that the
> > operations involving the bitmap are a significant fraction of the total
> > time, which I think is quite uncertain.  Until we build it and profile
> > it, we won't know that.
> 
> The other thought I had was that it would be difficult to tell when to follow
> this path. Since the main case where it wins is when the individual indexes
> aren't very selective but the combination is very selective, and we don't have
> inter-column correlation statistics ...

We actually have heap access cost and index access cost.  You could
compare costs of looking at all of index A's heap vs. looking at index
B and then hopefully fewer heap rows.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Question about indexes

From
Alvaro Herrera
Date:
On Fri, Jan 30, 2004 at 09:48:19AM -0500, Tom Lane wrote:

> A variant is to make the per-page bit arrays be entries in a hash table
> with page number as hash key.  This would reduce insertion to a nearly
> constant-time operation, but the drawback is that you'd need an explicit
> sort at the end to put the per-page entries into page number order
> before you scan 'em.  You might come out ahead anyway, not sure.

Is there a reason sort the pages before scanning them?  The result won't
come out sorted one way or the other.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Para tener más hay que desear menos"


Re: Question about indexes

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> On Fri, Jan 30, 2004 at 09:48:19AM -0500, Tom Lane wrote:
> 
> > A variant is to make the per-page bit arrays be entries in a hash table
> > with page number as hash key.  This would reduce insertion to a nearly
> > constant-time operation, but the drawback is that you'd need an explicit
> > sort at the end to put the per-page entries into page number order
> > before you scan 'em.  You might come out ahead anyway, not sure.
> 
> Is there a reason sort the pages before scanning them?  The result won't
> come out sorted one way or the other.

I think the goal would be to hit the heap in sequential order as much as
possible.  When we are doing reading right from the index, we haven't
collected all the heap values in one place, but since we have them in
memory, we might as well sort them, though I don't think that is a
requirement, just a performance enhancement, or at least that is my
guess.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Question about indexes

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Alvaro Herrera wrote:
>> Is there a reason sort the pages before scanning them?  The result won't
>> come out sorted one way or the other.

> I think the goal would be to hit the heap in sequential order as much as
> possible.

Exactly.  Also, it'll be harder to AND or OR two bitmaps together if you
don't presort their pages.  (I think the hash representation could do OR
without presort, but not AND.)
        regards, tom lane


Re: Question about indexes

From
Eric Ridge
Date:
On Jan 30, 2004, at 6:31 AM, Bruce Momjian wrote:

> I like the idea of building in-memory bitmapped indexes.

Me too (FWIW)!  This thread is really interesting as the whole idea 
would help to solve the biggest issue with my (currently stalled) 
project to integrate Xapian as a full-text search engine.  Getting 
index scans used in all situations would drastically improve 
performance.

eric



Re: Question about indexes

From
Hannu Krosing
Date:
Tom Lane kirjutas R, 30.01.2004 kell 16:48:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Also, what does an in-memory bitmapped index look like?
>
> One idea that might work: a binary search tree in which each node
> represents a single page of the table, and contains a bit array with
> one bit for each possible item number on the page.  You could not need
> more than BLCKSZ/(sizeof(HeapTupleHeaderData)+sizeof(ItemIdData)) bits
> in a node, or about 36 bytes at default BLCKSZ --- for most tables you
> could probably prove it would be a great deal less.  You only allocate
> nodes for pages that have at least one interesting row.

Another idea would be using bitmaps where we have just one bit per
database page and do a seq scan but just over marked pages.

Even when allocating them in full such indexes would occupy just
1/(8k*8bit) of the amount they describe, so index for 1GB table would be
1G/(8k*8bit) = 16 kilobytes (2 pages)

Also, such indexes, if persistent, could also be used (together with
FSM) when deciding placement of new tuples, so they provide a form of
clustering.

This would of course be most useful for data-warehouse type operations,
where database is significantöy bigger than memory.

And the seqscan over bitmap should not be done in simple page order, but
rather in two passes -1. over those pages which are already in cache (either postgresqls    or systems (if we find a
wayto get such info from the system))2. in sequential order over the rest. 

> I think this would represent a reasonable compromise between size and
> insertion speed.  It would only get large if the indexscan output
> demanded visiting many different pages --- but at some point you could
> abandon index usage and do a sequential scan, so I think that property
> is okay.

One case where almost full intermediate bitmap could be needed is when
doing a star join or just AND of several conditions, where each single
index spans a significant part of the table, but the result does not.

> A variant is to make the per-page bit arrays be entries in a hash table
> with page number as hash key.  This would reduce insertion to a nearly
> constant-time operation, but the drawback is that you'd need an explicit
> sort at the end to put the per-page entries into page number order
> before you scan 'em.  You might come out ahead anyway, not sure.
>
> Or we could try a true linear bitmap (indexed by page number times
> max-items-per-page plus item number) that's compressed in some fashion,
> probably just by eliminating large runs of zeroes.  The difficulty here
> is that inserting a new one-bit could be pretty expensive, and we need
> it to be cheap.
>
> Perhaps someone can come up with other better ideas ...

I have also contemplated a scenario, where we could use some
not-quite-max power-of-2 bits-per-page linear bitmap and mark intra-page
wraps (when we tried to mark a point past that not-quite-max number in a
page) in high bit (or another bitmap) making info for that page folded.
AN example would be setting bit 40 in 32-bits/page index - this would
set bit 40&31 and mark the page folded.

When combining such indexes using AND or OR, we need some spcial
handling of folded pages, but could still get non-folded (0) results out
from AND of 2 folded pages if the bits are distributed nicely.

--------------
Hannu















Re: Question about indexes

From
Tom Lane
Date:
Hannu Krosing <hannu@tm.ee> writes:
> Another idea would be using bitmaps where we have just one bit per
> database page and do a seq scan but just over marked pages.

That seems a bit too lossy for me, but I really like your later idea
about folding.  Generalizing that a little, we can choose any fold point
we like.  We could allocate, say, one 32-bit word per page and set the
(i mod 32) bit when item i is fingered by the index.  After retrieving
the heap page, we'd need to test all the valid rows that have item
numbers matching a set bit mod 32.  On typical tables (with circa 100
items per page) this would require testing only about 3 rows per page.
ORing and ANDing of such bitmaps still works, with the understanding
that it's lossy and you have to double check each retrieved tuple.

If the fold point is above about 100, your idea of keeping track of
whether we actually set any wrapped-around bits would become useful,
but below that I think we'd just be wasting a bit.
        regards, tom lane


Re: Question about indexes

From
Hannu Krosing
Date:
Tom Lane kirjutas L, 31.01.2004 kell 01:02:
> Hannu Krosing <hannu@tm.ee> writes:
> > Another idea would be using bitmaps where we have just one bit per
> > database page and do a seq scan but just over marked pages.
> 
> That seems a bit too lossy for me,

I originally thought of it in context of data-warehousing and persistent
bitmap indexes. there the use of these same bitmaps for clustering would
un-lossify this approach.

>  but I really like your later idea
> about folding.  Generalizing that a little, we can choose any fold point
> we like.  We could allocate, say, one 32-bit word per page and set the
> (i mod 32) bit when item i is fingered by the index.  After retrieving
> the heap page, we'd need to test all the valid rows that have item
> numbers matching a set bit mod 32.  On typical tables (with circa 100
> items per page) this would require testing only about 3 rows per page.
> ORing and ANDing of such bitmaps still works, with the understanding
> that it's lossy and you have to double check each retrieved tuple.
> 
> If the fold point is above about 100, your idea of keeping track of
> whether we actually set any wrapped-around bits would become useful,
> but below that I think we'd just be wasting a bit.

Not only wasting bits, but also making the code hairier - we can't just
do simple ANDs and ORs.

--------------
Hannu



Re: Question about indexes

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> That seems a bit too lossy for me, but I really like your later idea
> about folding.  Generalizing that a little, we can choose any fold point
> we like.  We could allocate, say, one 32-bit word per page and set the
> (i mod 32) bit when item i is fingered by the index.  After retrieving
> the heap page, we'd need to test all the valid rows that have item
> numbers matching a set bit mod 32.  On typical tables (with circa 100
> items per page) this would require testing only about 3 rows per page.
> ORing and ANDing of such bitmaps still works, with the understanding
> that it's lossy and you have to double check each retrieved tuple.

That would make it really hard to ever clear the bits. What do you do when you
vacuum and one of the tuples is no longer needed. You can't be sure you can
clear the bit in the index because there could be multiple tuples represented
by the bit being set. You would have to test the condition on the other tuples
covered by the bit to see if it can be cleared.

-- 
greg



Re: Question about indexes

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> ORing and ANDing of such bitmaps still works, with the understanding
>> that it's lossy and you have to double check each retrieved tuple.

> That would make it really hard to ever clear the bits.

We're speaking of in-memory bitmaps constructed on-the-fly here.  You're
right that it wouldn't work for persistent indexes, but I'm not very
interested in that case at the moment ...
        regards, tom lane