Thread: Bitmap index status

Bitmap index status

From

Heikki Linnakangas

Date:

12 September 2006, 09:44:26

Hi,

What's the status of the bitmap index patch? Have you worked on it since 
the last posted patch 
(http://archives.postgresql.org/pgsql-patches/2006-08/msg00003.php)?

I've started to review it, to get it into CVS early in the 8.3 cycle. I 
just want to make sure that I'm working on the latest version.

Beside the issues already discussed, I found two minor bugs:
* pg_am says that bitmap am supports unique indexes, while it doesn't. 
Second,
* race condition in _bitmap_inserttuple if two backends try to insert 
the same, new value. If they both find that there's no lov item for the 
key, and try to create one, one backend will get a duplicate key error 
on the lov index.

Also, vacuum actually does a reindex, which seems awfully wasteful. That 
needs to be looked at.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

Re: Bitmap index status

From

Tom Lane

Date:

12 September 2006, 14:12:03

Heikki Linnakangas <heikki@enterprisedb.com> writes:
> What's the status of the bitmap index patch? Have you worked on it since 
> the last posted patch 
> (http://archives.postgresql.org/pgsql-patches/2006-08/msg00003.php)?

Gavin and Jie have made major changes since that version (or at least
they'd better have something to show for the month since then ;-)).
I wouldn't recommend reviewing the patch until they post something
current ...

> Also, vacuum actually does a reindex, which seems awfully wasteful. That 
> needs to be looked at.

Yikes.  I imagine they've not tried to do anything about that; if you
want to help, maybe you could take that subproblem?
        regards, tom lane

Re: Bitmap index status

From

"Jie Zhang"

Date:

12 September 2006, 16:24:04

Hi Heikki,

Gavin and I are trying to merge our changes together this week. We will post
a new patch by the end of this week. This patch will include some style
fixes, bug fixes, and the stream bitmap implementation.

I will look into the problems you have mentioned in this email. Yes, vacuum
currently does a reindex now. Gavin and I just talked about this yesterday.
We are looking into ways to improve this. One way is not to do reindex for
each vacuum. We maintain a list of updated tids along with the bitmap index.
Only when this list goes to a certain point, vacuum will re-build the index.

Thanks,
Jie

On 9/12/06 2:43 AM, "Heikki Linnakangas" <heikki@enterprisedb.com> wrote:

> Hi,
> 
> What's the status of the bitmap index patch? Have you worked on it since
> the last posted patch
> (http://archives.postgresql.org/pgsql-patches/2006-08/msg00003.php)?
> 
> I've started to review it, to get it into CVS early in the 8.3 cycle. I
> just want to make sure that I'm working on the latest version.
> 
> Beside the issues already discussed, I found two minor bugs:
> * pg_am says that bitmap am supports unique indexes, while it doesn't.
> Second,
> * race condition in _bitmap_inserttuple if two backends try to insert
> the same, new value. If they both find that there's no lov item for the
> key, and try to create one, one backend will get a duplicate key error
> on the lov index.
> 
> Also, vacuum actually does a reindex, which seems awfully wasteful. That
> needs to be looked at.

Re: Bitmap index status

From

"Jie Zhang"

Date:

17 September 2006, 08:57:36

Hi Heikki and all,

I just sent the latest bitmap index patch to the list. I am not sure if
there is any size limit for this mailing list. If you have received my
previous email, please let me know.

Thanks,
Jie

On 9/12/06 2:43 AM, "Heikki Linnakangas" <heikki@enterprisedb.com> wrote:

> Hi,
> 
> What's the status of the bitmap index patch? Have you worked on it since
> the last posted patch
> (http://archives.postgresql.org/pgsql-patches/2006-08/msg00003.php)?
> 
> I've started to review it, to get it into CVS early in the 8.3 cycle. I
> just want to make sure that I'm working on the latest version.
> 
> Beside the issues already discussed, I found two minor bugs:
> * pg_am says that bitmap am supports unique indexes, while it doesn't.
> Second,
> * race condition in _bitmap_inserttuple if two backends try to insert
> the same, new value. If they both find that there's no lov item for the
> key, and try to create one, one backend will get a duplicate key error
> on the lov index.
> 
> Also, vacuum actually does a reindex, which seems awfully wasteful. That
> needs to be looked at.

Re: Bitmap index status

From

Heikki Linnakangas

Date:

19 September 2006, 11:55:02

Jie Zhang wrote:
> Hi Heikki and all,
>
> Please find the latest bitmap index patch in the attachment. This patch is
> generated against the postgresql cvs head.
>   

Thanks.

The handling of stream and hash bitmaps looks pretty complicated to me. 
All the bitmap-related nodes have logic to handle both types slightly 
differently. It all seems to come down to that if a subnode (or 
amgetbitmap in a bitmap index scan node) returns a StreamBitmap, the 
caller needs to call the subnode many times, until it returns a NULL. 
With a HashBitmap, the caller only calls the subnode once.

I think amgetbitmap should be called just once per index scan, and it 
should return either a hash bitmap or a stream bitmap. The same applies 
to all the executor nodes that return bitmaps, they would only return a 
single HashBitmap or StreamBitmap, and the upper node would call 
tbm_iterate repeatedly on that.

StreamBitmap would contain a callback (filled by the indexam) that 
tbm_iterate would call to fill the next TBMIterateResult.

I would also move the code to AND and OR together different types of 
bitmaps to tidbitmap.c, so that BitmapAnd and BitmapOr nodes wouldn't 
need to care about different types either.

tbm_intersect and tbm_union with two HashBitmaps would work like they do 
now. Calling tbm_intersect or tbm_union with two StreamBitmaps would 
return a new StreamBitmap object that would pull results from the two 
argument StreamBitmaps page at a time and AND or OR them together. 
Combining a HashBitmap and a StreamBitmap would wrap the HashBitmap with 
a new StreamBitmap that pulls the entries from the HashBitmap page at a 
time.

What do you think? Would you like me to help implementing the above?

> This patch includes code style changes, bug fixes, and the stream bitmap
> implementation. I have also fixed the problems mentioned in Heikki's email.
>   

It seems you fixed the race condition in inserttuple by holding a write 
lock on the metapage while you find/create the lov item. While correct, 
isn't that pretty bad for concurrency? I realize that the main use case 
for bitmap indexes is large data warehouses where you don't do 
concurrent updates, but still...

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

Re: Bitmap index status

From

Gavin Sherry

Date:

19 September 2006, 12:16:52

On Tue, 19 Sep 2006, Heikki Linnakangas wrote:

> Jie Zhang wrote:
> > Hi Heikki and all,
> >
> > Please find the latest bitmap index patch in the attachment. This patch is
> > generated against the postgresql cvs head.
> >
>
> Thanks.
>
> The handling of stream and hash bitmaps looks pretty complicated to me.
> All the bitmap-related nodes have logic to handle both types slightly
> differently. It all seems to come down to that if a subnode (or
> amgetbitmap in a bitmap index scan node) returns a StreamBitmap, the
> caller needs to call the subnode many times, until it returns a NULL.
> With a HashBitmap, the caller only calls the subnode once.
>
> I think amgetbitmap should be called just once per index scan, and it
> should return either a hash bitmap or a stream bitmap. The same applies
> to all the executor nodes that return bitmaps, they would only return a
> single HashBitmap or StreamBitmap, and the upper node would call
> tbm_iterate repeatedly on that.
>
> StreamBitmap would contain a callback (filled by the indexam) that
> tbm_iterate would call to fill the next TBMIterateResult.

Right, this was the approach taken by an earlier version of the patch I
had worked on. It was significantly uglified by the need to keep the index
state around and to be careful about what amrescan might do behind out
back. I will, however, introduce the idea again because it makes the code
much cleaner and more logical, as you seem to suggest.

Thanks,

Gavin

Re: Bitmap index status

From

"Jie Zhang"

Date:

20 September 2006, 16:04:36

On 9/19/06 5:15 AM, "Gavin Sherry" <swm@linuxworld.com.au> wrote:

> On Tue, 19 Sep 2006, Heikki Linnakangas wrote:
> 
>> Jie Zhang wrote:
>>> Hi Heikki and all,
>>> 
>>> Please find the latest bitmap index patch in the attachment. This patch is
>>> generated against the postgresql cvs head.
>>> 
>> 
>> Thanks.
>> 
>> The handling of stream and hash bitmaps looks pretty complicated to me.
>> All the bitmap-related nodes have logic to handle both types slightly
>> differently. It all seems to come down to that if a subnode (or
>> amgetbitmap in a bitmap index scan node) returns a StreamBitmap, the
>> caller needs to call the subnode many times, until it returns a NULL.
>> With a HashBitmap, the caller only calls the subnode once.
>> 
>> I think amgetbitmap should be called just once per index scan, and it
>> should return either a hash bitmap or a stream bitmap. The same applies
>> to all the executor nodes that return bitmaps, they would only return a
>> single HashBitmap or StreamBitmap, and the upper node would call
>> tbm_iterate repeatedly on that.
>> 
>> StreamBitmap would contain a callback (filled by the indexam) that
>> tbm_iterate would call to fill the next TBMIterateResult.
> 
> Right, this was the approach taken by an earlier version of the patch I
> had worked on. It was significantly uglified by the need to keep the index
> state around and to be careful about what amrescan might do behind out
> back. I will, however, introduce the idea again because it makes the code
> much cleaner and more logical, as you seem to suggest.

I will think about this approach. But I am not quite convinced that this
approach will be simpler and cleaner than the above approach. :-)

Thanks,
Jie

Re: Bitmap index status

From

Mark Wong

Date:

22 September 2006, 22:45:42

Jie Zhang wrote:
> Hi Heikki and all,
> 
> I just sent the latest bitmap index patch to the list. I am not sure if
> there is any size limit for this mailing list. If you have received my
> previous email, please let me know.

Hi Jie,

I know I said I was going to get testing on this months ago but I've 
been juggling between 3 systems due to disk failures and other hardware 
configuration issues.  Anyways, I've take a baseline run of only the 
power test using a 1GB database with the patch 09-17 patch against a 
snapshot of pgsql from 2006-09-17:

http://dbt.osdl.org/dbt/dbt3testing/results/dev8-007/2/

Do you think the 1GB scale factor will be sufficient for testing as it 
will certainly be faster?  Do you think testing with just a power test 
will be sufficient for now?  I really don't have a good reason why I 
didn't run a throughput test other than to save time. :)  I also wanted 
to get your opinion again on which indexes we will want to try first.

Thanks,
Mark

Re: Bitmap index status

From

"Jie Zhang"

Date:

24 September 2006, 03:26:13

Gavin & Heikki,

>> 
>> The handling of stream and hash bitmaps looks pretty complicated to me.
>> All the bitmap-related nodes have logic to handle both types slightly
>> differently. It all seems to come down to that if a subnode (or
>> amgetbitmap in a bitmap index scan node) returns a StreamBitmap, the
>> caller needs to call the subnode many times, until it returns a NULL.
>> With a HashBitmap, the caller only calls the subnode once.
>> 
>> I think amgetbitmap should be called just once per index scan, and it
>> should return either a hash bitmap or a stream bitmap. The same applies
>> to all the executor nodes that return bitmaps, they would only return a
>> single HashBitmap or StreamBitmap, and the upper node would call
>> tbm_iterate repeatedly on that.
>> 
>> StreamBitmap would contain a callback (filled by the indexam) that
>> tbm_iterate would call to fill the next TBMIterateResult.
> 
> Right, this was the approach taken by an earlier version of the patch I
> had worked on. It was significantly uglified by the need to keep the index
> state around and to be careful about what amrescan might do behind out
> back. I will, however, introduce the idea again because it makes the code
> much cleaner and more logical, as you seem to suggest.
> 

I have been thinking about this approach some more. I do agree that this is
more attractive now. The following includes some more detailed design.
Please let me know if you have any comments. (My apologies to Gavin. You
talked to me about this approach before. But you introduced some on-disk
bitmap specific code into the tidbitmap.c, which prevented me from looking
more in this direction.)

Essentially, we want to have a stream bitmap object that has an iterator,
which will be able to iterate through the bitmaps. The BitmapIndexscan,
BitmapAnd, BitmapOr will be executed once and return a streamp bitmap or a
hash bitmap. The BitmapHeapscan then calls tbm_iterate() to iterate through
the bitmaps.

The StreamBitmap structure will look like below.

struct StreamBitmap {   NodeTag           type;       /* to make it a valid Node */   PagetableEntry    entry;      /*
apage of tids in this stream bitmap */
 
   /* the iterator function */   void    (*next)(StreamBitmap*);   Node*           state;        /* store how this
streambitmap generated,                                    and all necessary information to
      obtain the next stream bitmap. */
 
};

Two new state objects will look like below. At the same time, we introduce
three new node types: T_StreamBitmapAND, T_StreamBitmapOR,
And T_StreamBitmapIndex, to define different states.

/** Stores the necessary information for iterating through the stream bitmaps* generated by nodeBitmapAnd or
nodeBitmapOr.*/
struct StreamBitmapOp {   NodeTag     type;  /* handles T_StreamBitmapAND and T_StreamBitmapOR */   List*
bitmaps;
};

/** Stores some necessary information for iterating through the stream* bitmaps generated by nodeBitmapIndexscan.*/
struct StreamBitmapIndex {   NodeTag     type; /* handle T_StreamBitmapIndex */   IndexScanDesc    scan;   BlockNumber
     nextBlockNo;/* next block no to be read */
 
};

Then we will have the iterator functions like the following:

void StreamBitmapAndNext(StreamBitmap* node) { tbm_intersect_stream(((StreampBitmapOp*) node->state)->bitmaps, node);
}

void StreamBitmapOrNext(StreamBitmap* node) { tbm_union_stream(((StreampBitmapOp*) node->state)->bitmaps, node);
}

void StreamBitmapIndexNext(StreamBitmap* node) { StreamBitmapIndex* sbi = (StreamBitmapIndex*) node->state;
amgetbitmap(sbi->scan,NULL, sbi->nextBlockNo);
 
}

What do you think?

Thanks,
Jie

Re: Bitmap index status

From

"Jie Zhang"

Date:

25 September 2006, 18:20:47

Hi Mark,

Thanks for doing the test. I checked out the link you provided below. I am a
little confused about the goal of these tests. Do you plan to test the
overall performance of postgreSQL on handling TPC-H queries?

Thanks,
Jie

On 9/22/06 3:45 PM, "Mark Wong" <markw@osdl.org> wrote:

> Jie Zhang wrote:
>> Hi Heikki and all,
>> 
>> I just sent the latest bitmap index patch to the list. I am not sure if
>> there is any size limit for this mailing list. If you have received my
>> previous email, please let me know.
> 
> Hi Jie,
> 
> I know I said I was going to get testing on this months ago but I've
> been juggling between 3 systems due to disk failures and other hardware
> configuration issues.  Anyways, I've take a baseline run of only the
> power test using a 1GB database with the patch 09-17 patch against a
> snapshot of pgsql from 2006-09-17:
> 
> http://dbt.osdl.org/dbt/dbt3testing/results/dev8-007/2/
> 
> Do you think the 1GB scale factor will be sufficient for testing as it
> will certainly be faster?  Do you think testing with just a power test
> will be sufficient for now?  I really don't have a good reason why I
> didn't run a throughput test other than to save time. :)  I also wanted
> to get your opinion again on which indexes we will want to try first.
> 
> Thanks,
> Mark
>

Re: Bitmap index status

From

Mark Wong

Date:

25 September 2006, 18:33:06

Hi Jie,

Yeah, basically gather as many stats as I can to accurately profile the 
overall system performance.  I thought it would be appropriate to use a 
TPC-H based workload as one measuring stick to use for bitmap indexes.

Mark

Jie Zhang wrote:
> Hi Mark,
> 
> Thanks for doing the test. I checked out the link you provided below. I am a
> little confused about the goal of these tests. Do you plan to test the
> overall performance of postgreSQL on handling TPC-H queries?
> 
> Thanks,
> Jie
> 
> On 9/22/06 3:45 PM, "Mark Wong" <markw@osdl.org> wrote:
> 
>> Jie Zhang wrote:
>>> Hi Heikki and all,
>>>
>>> I just sent the latest bitmap index patch to the list. I am not sure if
>>> there is any size limit for this mailing list. If you have received my
>>> previous email, please let me know.
>> Hi Jie,
>>
>> I know I said I was going to get testing on this months ago but I've
>> been juggling between 3 systems due to disk failures and other hardware
>> configuration issues.  Anyways, I've take a baseline run of only the
>> power test using a 1GB database with the patch 09-17 patch against a
>> snapshot of pgsql from 2006-09-17:
>>
>> http://dbt.osdl.org/dbt/dbt3testing/results/dev8-007/2/
>>
>> Do you think the 1GB scale factor will be sufficient for testing as it
>> will certainly be faster?  Do you think testing with just a power test
>> will be sufficient for now?  I really don't have a good reason why I
>> didn't run a throughput test other than to save time. :)  I also wanted
>> to get your opinion again on which indexes we will want to try first.
>>
>> Thanks,
>> Mark
>>

Re: Bitmap index status

From

"Luke Lonergan"

Date:

25 September 2006, 20:43:54

Mark,

On 9/25/06 11:32 AM, "Mark Wong" <markw@osdl.org> wrote:

> Yeah, basically gather as many stats as I can to accurately profile the
> overall system performance.  I thought it would be appropriate to use a
> TPC-H based workload as one measuring stick to use for bitmap indexes.

Note that the TPC-H queries don't follow the typical good use case for
bitmap indexes.  You'd like to see queries that use multiple AND and OR
clauses, otherwise there may be no benefit.

Also, DBT-3/TPC-H on Postgres right now does not benefit from indices
overall.  The planner has limitations WRT selectivity estimates and other
limitations that cause it to choose index access poorly for the query
workload.  We have two new features coming (for 8.3) that fix this, but for
now we find that indexes are a net loss, in some queries a huge loss.

If you look at the whitepaper that Ayush Parashar published, he uses the
TPC-H data with some targeted queries that showcase the best use-cases for
bitmap index.

- Luke

Re: Bitmap index status

From

Heikki Linnakangas

Date:

26 September 2006, 09:17:37

Looks a bit better now, though I think you need to think more about the 
encapsulation of the structs. More detailed comments below.

Jie Zhang wrote:
> Essentially, we want to have a stream bitmap object that has an iterator,
> which will be able to iterate through the bitmaps. The BitmapIndexscan,
> BitmapAnd, BitmapOr will be executed once and return a streamp bitmap or a
> hash bitmap. The BitmapHeapscan then calls tbm_iterate() to iterate 
> through
> the bitmaps.
>
> The StreamBitmap structure will look like below.
>
> struct StreamBitmap {
> NodeTag type; /* to make it a valid Node */
> PagetableEntry entry; /* a page of tids in this stream bitmap */
>
> /* the iterator function */
> void (*next)(StreamBitmap*);
> Node* state; /* store how this stream bitmap generated,
> and all necessary information to
> obtain the next stream bitmap. */
> };

I'd suggest making state just a (void *). It's private to the producer 
of the bitmap, and I don't see a reason to expose it. I assume that the 
next-function fills in the PageTableEntry with the next set of tids.

> Two new state objects will look like below. At the same time, we introduce
> three new node types: T_StreamBitmapAND, T_StreamBitmapOR,
> And T_StreamBitmapIndex, to define different states.
>
> /*
> * Stores the necessary information for iterating through the stream 
> bitmaps
> * generated by nodeBitmapAnd or nodeBitmapOr.
> */
> struct StreamBitmapOp {
> NodeTag type; /* handles T_StreamBitmapAND and T_StreamBitmapOR */
> List* bitmaps;
> };

AFAICS, this struct is private to tidbitmap.c, where the new 
stream-enabled tbm_intersect etc. functions are defined. Also, I don't 
see a reason why it needs to by a valid Node.

> /*
> * Stores some necessary information for iterating through the stream
> * bitmaps generated by nodeBitmapIndexscan.
> */
> struct StreamBitmapIndex {
> NodeTag type; /* handle T_StreamBitmapIndex */
> IndexScanDesc scan;
> BlockNumber nextBlockNo;/* next block no to be read */
> };

Where would this struct be defined? I think different index access 
methods might want to have different kind of states. The struct above 
assumes that the position of an index scan is always represented by the 
nextBlockNo. That seems to be the right thing for the bitmap indexam, so 
this struct is fine for StreamBitmaps returned by bmgetbitmap, but not 
necessary for others.

> Then we will have the iterator functions like the following:
>
> ...
>
> void StreamBitmapIndexNext(StreamBitmap* node) {
> StreamBitmapIndex* sbi = (StreamBitmapIndex*) node->state;
> amgetbitmap(sbi->scan, NULL, sbi->nextBlockNo);
> }

This means that the amgetbitmap function would still be called many 
times in each scan.  What would amgetbitmap return? A new StreamBitmap 
on each call?

I'd like to see just one call to amgetbitmap. It would a) fill in the 
hash bitmap passed to it, b) return a new hash bitmap, or c) return a 
new StreamBitmap, with a indexam specific next-function that returns the 
tids one page at a time. I think we'll also need some kind of a 
destructor in StreamBitmap that's called by the consumer of the bitmap 
after it's done with it.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Re: Bitmap index status

From

Gavin Sherry

Date:

26 September 2006, 11:45:47

On Tue, 26 Sep 2006, Heikki Linnakangas wrote:

> Looks a bit better now, though I think you need to think more about the
> encapsulation of the structs. More detailed comments below.
>
> Jie Zhang wrote:
> > Essentially, we want to have a stream bitmap object that has an iterator,
> > which will be able to iterate through the bitmaps. The BitmapIndexscan,
> > BitmapAnd, BitmapOr will be executed once and return a streamp bitmap or a
> > hash bitmap. The BitmapHeapscan then calls tbm_iterate() to iterate
> > through
> > the bitmaps.
> >
> > The StreamBitmap structure will look like below.
> >
> > struct StreamBitmap {
> > NodeTag type; /* to make it a valid Node */
> > PagetableEntry entry; /* a page of tids in this stream bitmap */
> >
> > /* the iterator function */
> > void (*next)(StreamBitmap*);
> > Node* state; /* store how this stream bitmap generated,
> > and all necessary information to
> > obtain the next stream bitmap. */
> > };
>
> I'd suggest making state just a (void *). It's private to the producer
> of the bitmap, and I don't see a reason to expose it. I assume that the
> next-function fills in the PageTableEntry with the next set of tids.
>
> > Two new state objects will look like below. At the same time, we introduce
> > three new node types: T_StreamBitmapAND, T_StreamBitmapOR,
> > And T_StreamBitmapIndex, to define different states.
> >
> > /*
> > * Stores the necessary information for iterating through the stream
> > bitmaps
> > * generated by nodeBitmapAnd or nodeBitmapOr.
> > */
> > struct StreamBitmapOp {
> > NodeTag type; /* handles T_StreamBitmapAND and T_StreamBitmapOR */
> > List* bitmaps;
> > };
>
> AFAICS, this struct is private to tidbitmap.c, where the new
> stream-enabled tbm_intersect etc. functions are defined. Also, I don't
> see a reason why it needs to by a valid Node.

Well, making it a valid nodes makes it easy to identify (IsA) and gives us
access to copy/equal frameworks. I do think that it is best to bury this
in the bitmap code however.

> > * Stores some necessary information for iterating through the stream
> > * bitmaps generated by nodeBitmapIndexscan.
> > */
> > struct StreamBitmapIndex {
> > NodeTag type; /* handle T_StreamBitmapIndex */
> > IndexScanDesc scan;
> > BlockNumber nextBlockNo;/* next block no to be read */
> > };
>
> Where would this struct be defined? I think different index access
> methods might want to have different kind of states. The struct above
> assumes that the position of an index scan is always represented by the
> nextBlockNo. That seems to be the right thing for the bitmap indexam, so
> this struct is fine for StreamBitmaps returned by bmgetbitmap, but not
> necessary for others.

right.

>
> > Then we will have the iterator functions like the following:
> >
> > ...
> >
> > void StreamBitmapIndexNext(StreamBitmap* node) {
> > StreamBitmapIndex* sbi = (StreamBitmapIndex*) node->state;
> > amgetbitmap(sbi->scan, NULL, sbi->nextBlockNo);
> > }
>
> This means that the amgetbitmap function would still be called many
> times in each scan.  What would amgetbitmap return? A new StreamBitmap
> on each call?
>
> I'd like to see just one call to amgetbitmap. It would a) fill in the
> hash bitmap passed to it, b) return a new hash bitmap, or c) return a
> new StreamBitmap, with a indexam specific next-function that returns the
> tids one page at a time. I think we'll also need some kind of a
> destructor in StreamBitmap that's called by the consumer of the bitmap
> after it's done with it.

Right, I agree. I am working on this now.

Thanks,

gavin

Re: Bitmap index status

From

Mark Wong

Date:

28 September 2006, 16:50:17

Luke Lonergan wrote:
> Mark,
> 
> On 9/25/06 11:32 AM, "Mark Wong" <markw@osdl.org> wrote:
> 
>> Yeah, basically gather as many stats as I can to accurately profile the
>> overall system performance.  I thought it would be appropriate to use a
>> TPC-H based workload as one measuring stick to use for bitmap indexes.
> 
> Note that the TPC-H queries don't follow the typical good use case for
> bitmap indexes.  You'd like to see queries that use multiple AND and OR
> clauses, otherwise there may be no benefit.

Oh right, people keep telling me that and it keeps going in one ear and 
out the other...

> Also, DBT-3/TPC-H on Postgres right now does not benefit from indices
> overall.  The planner has limitations WRT selectivity estimates and other
> limitations that cause it to choose index access poorly for the query
> workload.  We have two new features coming (for 8.3) that fix this, but for
> now we find that indexes are a net loss, in some queries a huge loss.

Great, I'll keep my eye for those. :)

> If you look at the whitepaper that Ayush Parashar published, he uses the
> TPC-H data with some targeted queries that showcase the best use-cases for
> bitmap index.

Mark