Re: On-disk bitmap index implementation - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: On-disk bitmap index implementation
Date
Msg-id 45744B5C.8030308@enterprisedb.com
Whole thread Raw
In response to On-disk bitmap index implementation  (Gavin Sherry <swm@linuxworld.com.au>)
Responses Re: On-disk bitmap index implementation  ("Jie Zhang" <jzhang@greenplum.com>)
Re: On-disk bitmap index implementation  ("Heikki Linnakangas" <heikki@enterprisedb.com>)
Re: On-disk bitmap index implementation  (Gavin Sherry <swm@linuxworld.com.au>)
List pgsql-patches
Gavin Sherry wrote:
> Hi all,
>
> Attached is a patch implementing bitmap indexes. It includes major
> enhancements on the patch submitted during feature freeze for 8.2 here[1].
>
> In particular: much better integration with the existing bitmap scan code
> with the internals of the bitmap streaming pushed down into the AM and
> hidden from the executor code; completely new index creation algorithm
> which reduced creation time by 20-75% depending on the data; modifications
> to the encoding mechanism to suit the integration with bitmap index scans;
> work on memory management; lots of code rewriting; range query support.
> The code is also much cleaner now.

Thanks! I'll take a look at it.

We need to give the indexam API some further thought. As you know, I've
been working on the Grouped Index Tuples stuff, which also requires
changes to the API to get full benefit. There's a bunch of functionality
I'd like to see:

* Support for streamed bitmaps, like you have implemented.

* Support for candidate matches. This is needed for GIT, as well as
range-encoded bitmap indexes if/when we add them.

* Support for returning tuples in partial order. This is again needed
for GIT, because grouped tuples don't keep track of the ordering within
the group, so they need to be sorted if ordering necessary. And again
it's also useful to return tuples in order from range-encoded bitmaps.

* Support for kill_prior_tuple on bitmap scans.

* A bulk insert API. When inserting a lot of tuples with similar keys,
we could a considerable amount of CPU with a bulk insert API. A bulk
insert to a B-tree for example would only need to descend the tree once,
find the insert location, lock the target page just once and insert all
the tuples that belong to that page. That would potentially also reduce
WAL traffic.

> There are still some things Jie and I have not gotten to yet:
>
> o Improving VACUUM support -- currently, VACUUM FULL means REINDEX for
>   bitmaps. Heikki Linnakangas offered to work on this. Heikki, are you
>   still interested?

Yeah, I can look into that.

> o Test WAL replay more thoroughly.

I've had that problem too with a lot of things I've hacked. I've used a
shell script that does the operation under test, runs a select, kills
and restarts postmaster, and reruns the select. If the select after
crash returns the same result as before, presumably WAL code works. But
you need to watch out for full page writes that might mask bugs in the
redo code.

Anyone have a more sophisticated method?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


pgsql-patches by date:

Previous
From: Chris Browne
Date:
Subject: ECPG docs
Next
From: "Heikki Linnakangas"
Date:
Subject: Re: ECPG docs