Re: [PATCHES] WIP: bitmap indexes - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [PATCHES] WIP: bitmap indexes |
Date | |
Msg-id | 200609052230.k85MU6P03691@momjian.us Whole thread Raw |
In response to | Re: [PATCHES] WIP: bitmap indexes ("Jie Zhang" <jzhang@greenplum.com>) |
List | pgsql-hackers |
This has been saved for the 8.3 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --------------------------------------------------------------------------- Jie Zhang wrote: > > > On 8/15/06 6:18 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: > > > Gavin Sherry <swm@linuxworld.com.au> writes: > >> On Mon, 14 Aug 2006, Tom Lane wrote: > >>> Correct me if I'm wrong, but isn't the patch's present hacking on the > >>> executor intended to make it happen like this? > > > >> Not really. It reads ahead on the bitmap index and passes back the bitmap > >> words. The other executor routines are hacked up to process the data in > >> this format. > > > > Well, as I said, I don't think there's justification for exposing a > > bitmap index's internal data formats to the rest of the system like > > that: it's not very future-proof and I don't see that it's buying any > > significant performance gain. At some point you have to convert to TIDs > > anyway, at least in the sense of knowing what page and line number you > > are at, so passing the data as an array of TIDs really isn't going to > > hurt much. So my advice is to rip out all those changes and go back to > > the existing tidbitmap.c readout API. There's nothing wrong with > > the TBMIterateResult data structure. > > > > The bitmap words in the bitmap index are very simple and can be very > generic. You can think about them as one bit per tuple along with some > padding bits between heap pages. The problem I have is that I do not know a > good way to construct an in-memory version of this for other index > structures, like b-tree. To be able to handle both cases nicely, you are > right -- TBMIterateResult is better. Or, PagetableEntry may be better since > it will make AND/OR easier. > > > What I do find interesting to think about is whether, strictly within > > tidbitmap.c, there could be an alternate kind of bitmap object that > > doesn't have to materialize the whole bitmap for an indexscan in memory > > because it knows it can fetch the data on-demand, ie, build the next > > page TBMIterateResult data structure on-the-fly from the index when it's > > requested. Call it a "stream bitmap" in contrast to the present > > "materialized bitmaps". The trick here is to be able to AND and OR a > > stream bitmap with another stream bitmap or a materialized bitmap. > > I don't see any reason in principle why that couldn't be done: the > > output of the AND/OR would be a stream of TBMIterateResults just like > > the inputs. That is, it's another stream bitmap, but with a different > > generating function and some internal state that includes its source > > bitmaps. You'd have to sort a materialized bitmap into order before > > starting to AND/OR it with a stream bitmap, but that code is there > > already. > > I like this idea. I think that we can define a new TBMStatus to be > TBM_STREAM in TIDBitmap. *getmulti functions will remain the same, except > that we add a new returning bool argument, stating if this is a stream > bitmap. If this is a stream bitmap, nodeBitmapIndexScan simply fills spages, > and passes it upstream. When nodeBitmapAnd or nodeBitmapOr ANDs/ORs several > bitmaps, the result bitmap is a stream bitmap if there is at least one > bitmap is a stream bitmap. Then we add another loop in nodeBitmapHeapscan to > be able to pull more data from its subnode. > > Thanks, > Jie > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
pgsql-hackers by date: