Re: Bitmap index thoughts - Mailing list pgsql-hackers

From Gavin Sherry
Subject Re: Bitmap index thoughts
Date
Msg-id Pine.LNX.4.58.0702022301520.32019@linuxworld.com.au
In response to Re: Bitmap index thoughts  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Bitmap index thoughts  (Heikki Linnakangas <heikki@enterprisedb.com>)
Re: Bitmap index thoughts  (Heikki Linnakangas <heikki@enterprisedb.com>)
List pgsql-hackers
On Thu, 1 Feb 2007, Bruce Momjian wrote:

>
> Where are we on this patch?  Does it have performance tests to show
> where it is beneficial?  Is it ready to be reviewed?

Here's an updated patch:

http://www.alcove.com.au/~swm/bitmap-2007-02-02.patch

In this patch, I rewrote the index build system. It was already fast for
well-clustered data, but very slow for poorly clustered data. Now it
performs reasonably well for each distribution type.

I have various test cases, but the one which showed bitmap in a poor light was
a table of 600M rows. The key to the table had a cardinality of 100,000.
When the table was loaded with keys clustered, the build time was 1000
seconds with bitmap (2200 with btree). With poorly clustered data (e.g.,
the index key was (1, 2, 3, ..., 6000, 1, 2, 3, ...)), the build time for
bitmap was 14000 seconds!

So, I rewrote this to compress data using HRL encoding (the same scheme we
use in the bitmap AM itself). Now, clustered data builds just as fast as
before and unclustered data takes 2000 seconds.
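
To illustrate why run-length compression helps both distributions, here is
a minimal sketch. It assumes HRL refers to a hybrid run-length scheme in
which uniform words are stored as counted "fill" runs and mixed words are
stored verbatim. The word size, the tuple layout and the function names
(hrl_encode, hrl_decode, key_bitmap) are hypothetical and are not taken
from the patch.

```python
WORD_BITS = 8  # illustrative word size only; not the patch's actual value


def hrl_encode(bits):
    """Encode a list of 0/1 bits as fill runs and literal words.

    Output entries are either ("fill", bit, nwords) for consecutive words
    that are all-0 or all-1, or ("literal", word) for a mixed word.
    """
    # Pad with zeros so the bitmap splits into whole words.
    padded = list(bits) + [0] * (-len(bits) % WORD_BITS)
    out = []
    for i in range(0, len(padded), WORD_BITS):
        word = padded[i:i + WORD_BITS]
        if all(b == word[0] for b in word):
            # Uniform word: extend the previous fill run if it matches.
            if out and out[-1][0] == "fill" and out[-1][1] == word[0]:
                out[-1] = ("fill", word[0], out[-1][2] + 1)
            else:
                out.append(("fill", word[0], 1))
        else:
            out.append(("literal", tuple(word)))
    return out


def hrl_decode(encoded, nbits):
    """Expand the encoded form back into the first nbits bits."""
    bits = []
    for entry in encoded:
        if entry[0] == "fill":
            bits.extend([entry[1]] * (entry[2] * WORD_BITS))
        else:
            bits.extend(entry[1])
    return bits[:nbits]


def key_bitmap(keys, key):
    """Bitmap for one key value: bit i is set when row i holds that key."""
    return [1 if k == key else 0 for k in keys]


# Scaled-down versions of the two load orders: 1024 rows, 64 distinct keys.
clustered = [i // 16 for i in range(1024)]    # 0,0,...,0,1,1,...
interleaved = [i % 64 for i in range(1024)]   # 0,1,2,...,63,0,1,2,...

clustered_runs = hrl_encode(key_bitmap(clustered, 0))
interleaved_runs = hrl_encode(key_bitmap(interleaved, 0))
```

In this toy setup the uncompressed bitmap for one key is 128 words. The
clustered bitmap collapses to two fill runs, and the interleaved one still
shrinks because the long stretches of zeros between matching rows become
fill runs.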

The select performance at a cardinality of 100,000 is similar to btree,
and bitmap is faster at lower cardinalities.

Jie also contributed a rewrite of the WAL code to this patch. Not only is
the code faster now, but it handles the notion of incomplete actions --
like btree and friends do. The executor code still needs some work from me
-- Jie and I have dirtied things up while experimenting -- but we would
really like some review of the code so that this can get squared away
well before the approach of 8.3 feature freeze.

One of the major deficiencies remaining is the lack of VACUUM support.
Heikki put his hand up for this and I'm holding him to it! ;-)

I will update the code tomorrow. The focus will be cleaning up the
executor modifications. Please look elsewhere for now.

Thanks,

Gavin

