Re: On-disk bitmap index implementation - Mailing list pgsql-patches

From Gavin Sherry
Subject Re: On-disk bitmap index implementation
Date
Msg-id Pine.LNX.4.58.0612051100520.23921@linuxworld.com.au
In response to Re: On-disk bitmap index implementation  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-patches
On Mon, 4 Dec 2006, Simon Riggs wrote:

> On Tue, 2006-12-05 at 00:18 +1100, Gavin Sherry wrote:
>
> > o Determine if we need to provide anything for rm_startup, rm_cleanup,
> >   rm_safe_restartpoint RmgrData function pointers.
>
> safe_restartpoint gives true/false based upon whether there are
> multi-record WAL states that have only been partially received. For
> example, a btree index split needs multiple WAL records as it recurses
> up the index tree. If you've got one record but not the others yet you
> have an incomplete state and so cannot safely write a restartpoint.
>
> I'll document that if you/anyone might suggest where the best place is?

How about transam/README?
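
To make the rule Simon describes concrete, here is a minimal, self-contained sketch of the bookkeeping behind such a callback. It is a toy under stated assumptions, not the btree code and not the bitmap patch: the names remember_incomplete, forget_incomplete and toy_safe_restartpoint, the fixed-size table, and the use of a block number as the key are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit on concurrently open multi-record actions. */
#define MAX_INCOMPLETE 32

/* One multi-record action whose later WAL records have not been seen yet. */
typedef struct IncompleteAction
{
    bool         in_use;
    unsigned int block;     /* hypothetical key identifying the action */
} IncompleteAction;

static IncompleteAction incomplete[MAX_INCOMPLETE];

/* Replay saw the first record of a multi-record action. */
static void
remember_incomplete(unsigned int block)
{
    for (size_t i = 0; i < MAX_INCOMPLETE; i++)
    {
        if (!incomplete[i].in_use)
        {
            incomplete[i].in_use = true;
            incomplete[i].block = block;
            return;
        }
    }
    assert(!"incomplete-action table is full");
}

/* Replay saw the record that completes the action. */
static void
forget_incomplete(unsigned int block)
{
    for (size_t i = 0; i < MAX_INCOMPLETE; i++)
    {
        if (incomplete[i].in_use && incomplete[i].block == block)
        {
            incomplete[i].in_use = false;
            return;
        }
    }
}

/* A restartpoint is safe only while no action is partially replayed. */
static bool
toy_safe_restartpoint(void)
{
    for (size_t i = 0; i < MAX_INCOMPLETE; i++)
    {
        if (incomplete[i].in_use)
            return false;
    }
    return true;
}
```

In this sketch, calling remember_incomplete(42) makes toy_safe_restartpoint() return false until forget_incomplete(42) is called. Whether the bitmap index needs anything like this depends on whether any of its operations span more than one WAL record.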

>
> > o Look into adding an AM option such that the user can determine word size
> >   at index creation time. For higher-cardinality data (above 1000 distinct
> >   values), 16 bit word sizes can really help with performance. Although
> >   the word size is not just assumed to be a certain size across the code,
> >   macros are used extensively to interact with the word size. Making it
> >   different for each index might be a little messy.
>
> ...and is it a typical case to have a bitmap with fewer than 1000
> distinct values? Surely we want that as the sole assumption?
>
> Nearly unique bitmaps can suffer a little I think, if it makes the most
> common case faster. But I'd like to see the perf results first, I guess.

I'll put together some performance data on different word sizes.
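
As a rough illustration of why word size interacts with sparse bitmaps, here is a toy run-length counter that takes the word size as a runtime parameter rather than a compile-time macro. It is only a sketch: the encoding (one literal word per non-zero group, one fill word per run of all-zero groups, with the whole fill word used as the run count) is an assumption for illustration and is not the patch's actual format, and the names max_fill_run and compressed_words are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Largest number of all-zero groups a single fill word can stand for.
 * Toy assumption: every bit of the fill word holds the run count.
 */
static uint64_t
max_fill_run(unsigned word_bits)
{
    return ((uint64_t) 1 << word_bits) - 1;
}

/* Test bit i of a bitmap stored least-significant-bit first in bytes. */
static int
bit_is_set(const uint8_t *bits, uint64_t i)
{
    return (bits[i / 8] >> (i % 8)) & 1;
}

/*
 * Count the words needed to encode the first nbits bits of the bitmap
 * when it is cut into groups of word_bits bits.
 */
static size_t
compressed_words(const uint8_t *bits, uint64_t nbits, unsigned word_bits)
{
    assert(word_bits >= 1 && word_bits <= 32);

    const uint64_t max_run = max_fill_run(word_bits);
    size_t      words = 0;
    uint64_t    zero_run = 0;   /* all-zero groups not yet emitted */

    for (uint64_t start = 0; start < nbits; start += word_bits)
    {
        uint64_t    end = start + word_bits;
        int         any_set = 0;

        if (end > nbits)
            end = nbits;

        for (uint64_t i = start; i < end; i++)
        {
            if (bit_is_set(bits, i))
            {
                any_set = 1;
                break;
            }
        }

        if (!any_set)
        {
            zero_run++;
            continue;
        }

        /* Emit the pending zero run as fill words, then one literal word. */
        words += (size_t) ((zero_run + max_run - 1) / max_run);
        zero_run = 0;
        words++;
    }

    /* Emit any trailing zero run. */
    words += (size_t) ((zero_run + max_run - 1) / max_run);
    return words;
}
```

Note that the result is a word count, not a byte count: a 16-bit word costs twice as much storage as an 8-bit one, so fewer words does not by itself mean a smaller index. That trade-off is what the performance numbers need to settle.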

Thanks,

Gavin
