Re: Bitmap index status - Mailing list pgsql-hackers

From Gavin Sherry
Subject Re: Bitmap index status
Date
Msg-id Pine.LNX.4.58.0609262141460.29077@linuxworld.com.au
Whole thread Raw
In response to Re: Bitmap index status  (Heikki Linnakangas <heikki@enterprisedb.com>)
List pgsql-hackers
On Tue, 26 Sep 2006, Heikki Linnakangas wrote:

> Looks a bit better now, though I think you need to think more about the
> encapsulation of the structs. More detailed comments below.
>
> Jie Zhang wrote:
> > Essentially, we want to have a stream bitmap object that has an iterator,
> > which will be able to iterate through the bitmaps. The BitmapIndexscan,
> > BitmapAnd, BitmapOr will be executed once and return a streamp bitmap or a
> > hash bitmap. The BitmapHeapscan then calls tbm_iterate() to iterate
> > through
> > the bitmaps.
> >
> > The StreamBitmap structure will look like below.
> >
> > struct StreamBitmap {
> > NodeTag type; /* to make it a valid Node */
> > PagetableEntry entry; /* a page of tids in this stream bitmap */
> >
> > /* the iterator function */
> > void (*next)(StreamBitmap*);
> > Node* state; /* store how this stream bitmap generated,
> > and all necessary information to
> > obtain the next stream bitmap. */
> > };
>
> I'd suggest making state just a (void *). It's private to the producer
> of the bitmap, and I don't see a reason to expose it. I assume that the
> next-function fills in the PageTableEntry with the next set of tids.
>
> > Two new state objects will look like below. At the same time, we introduce
> > three new node types: T_StreamBitmapAND, T_StreamBitmapOR,
> > And T_StreamBitmapIndex, to define different states.
> >
> > /*
> > * Stores the necessary information for iterating through the stream
> > bitmaps
> > * generated by nodeBitmapAnd or nodeBitmapOr.
> > */
> > struct StreamBitmapOp {
> > NodeTag type; /* handles T_StreamBitmapAND and T_StreamBitmapOR */
> > List* bitmaps;
> > };
>
> AFAICS, this struct is private to tidbitmap.c, where the new
> stream-enabled tbm_intersect etc. functions are defined. Also, I don't
> see a reason why it needs to by a valid Node.

Well, making it a valid nodes makes it easy to identify (IsA) and gives us
access to copy/equal frameworks. I do think that it is best to bury this
in the bitmap code however.

> > * Stores some necessary information for iterating through the stream
> > * bitmaps generated by nodeBitmapIndexscan.
> > */
> > struct StreamBitmapIndex {
> > NodeTag type; /* handle T_StreamBitmapIndex */
> > IndexScanDesc scan;
> > BlockNumber nextBlockNo;/* next block no to be read */
> > };
>
> Where would this struct be defined? I think different index access
> methods might want to have different kind of states. The struct above
> assumes that the position of an index scan is always represented by the
> nextBlockNo. That seems to be the right thing for the bitmap indexam, so
> this struct is fine for StreamBitmaps returned by bmgetbitmap, but not
> necessary for others.

right.

>
> > Then we will have the iterator functions like the following:
> >
> > ...
> >
> > void StreamBitmapIndexNext(StreamBitmap* node) {
> > StreamBitmapIndex* sbi = (StreamBitmapIndex*) node->state;
> > amgetbitmap(sbi->scan, NULL, sbi->nextBlockNo);
> > }
>
> This means that the amgetbitmap function would still be called many
> times in each scan.  What would amgetbitmap return? A new StreamBitmap
> on each call?
>
> I'd like to see just one call to amgetbitmap. It would a) fill in the
> hash bitmap passed to it, b) return a new hash bitmap, or c) return a
> new StreamBitmap, with a indexam specific next-function that returns the
> tids one page at a time. I think we'll also need some kind of a
> destructor in StreamBitmap that's called by the consumer of the bitmap
> after it's done with it.

Right, I agree. I am working on this now.

Thanks,

gavin


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Release Notes: Major Changes in 8.2
Next
From: Heikki Linnakangas
Date:
Subject: heap_markpos and heap_restrpos