Re: On-disk bitmap index patch - Mailing list pgsql-hackers
From | mark@mark.mielke.cc |
---|---|
Subject | Re: On-disk bitmap index patch |
Date | |
Msg-id | 20060725015805.GA30087@mark.mielke.cc Whole thread Raw |
In response to | Re: On-disk bitmap index patch (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: On-disk bitmap index patch
|
List | pgsql-hackers |
On Mon, Jul 24, 2006 at 09:04:28PM -0400, Bruce Momjian wrote: > Jie Zhang wrote: > > > IIRC they quoted the cardinality of 10000 as something that is still > > > faster than btree for several usecases. > > > > > > And also for AND-s of several indexes, where indexes are BIG, your btree > > > indexes may be almost as big as tables but the resulting set of pages is > > > small. > > Yeah, Hannu points it out very well -- the bitmap index works very well > > when columns have low cardinalities and AND operations will produce small > > number of results. > What operations on columns of low cardinality produce a small number of > results? That seems contradictory. Not necessarily. Cardinality of 2 means 1/2. 3 means 1/3. 4 means 1/4. Etc. Reading 1/4, for a larger table, has a good chance of being faster than reading 4/4 of the table. :-) No opinion on whether such tables exist in the real world - but the concept itself seems sound. > > Also, the bitmap index is very small in low cardinality cases, where the > > btree tends to take up at least 10 times more space. > Also, are adding/changing rows is more expensive with bitmaps than > btrees? Without looking at the code, but having read the Oracle docs, I would guess yes. Every vacuum/delete would need to clear all of the bits for the row. Every insert/update would need to allocate a bit in at least the bitmap tree for the row inserted/updated. Seems like more pages may need to be written. Although perhaps some clever optimization would limit this. :-) It seems interesting though. We won't really know until we see the benchmarks. I'm seeing it as a form of working directly with the intermediate form of the bitmap index scanner. If the existing index scanner, building the bitmaps on the fly can out-perform the regular index scanner, I'm seeing potentially in a pre-built bitmap. Obviously, it isn't the answer to everything. The use I see for it, are a few of my 1:N object attribute tables. The table has an object identifier, and a column indicating the attribute type that the value is for. If I have 20 or more attribute type values, however, most objects include rows for most attribute types, my regular index ends up repeating the attribute type for every row. If I want to scan the table for all rows that have a particular attribute type with a particular value, it's a seqscan right now. With the bitmap scanner, knowing which rows to skip to immediately is readily available. Will it be worth it or not? I won't know until I try it. :-) Cheers, mark -- mark@mielke.cc / markm@ncf.ca / markm@nortel.com __________________________ . . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder |\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ | | | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bindthem... http://mark.mielke.cc/
pgsql-hackers by date: