Thread: using new bitmap scans to index bit columns?
I like the new bitmap scans and I'm wondering if there is any way I can utilize them for my bitmask column, defined as bit(1024). I use this column as a sort of fingerprint to quickly scan my tables. But it is a scan, not an index. I have not figured out a way to index the bitmask column. Is there some way it can be used as an index now that there are bitmap scans in 8.1? Currently I do this: Select * from mytable where contains(bitmask, fingerprint(user_data)) and matches(datacolumn, user_data); user_data is a string, like a regexp but with different semantics for chemical data. bitmask is precomputed/stored as bit(1024) = fingerprint(datacolumn) contains(a,b) returns bool as 'select b=(a&b);' This works well because matches() is an expensive functions. But it would work better if bitmask could be indexed, no? TJ O'Donnell
On Wed, 9 Nov 2005, TJ O'Donnell wrote: > I like the new bitmap scans and I'm wondering if there is any way > I can utilize them for my bitmask column, defined as bit(1024). > I use this column as a sort of fingerprint to quickly scan my tables. > But it is a scan, not an index. I have not figured out a way to > index the bitmask column. Is there some way it can be used as > an index now that there are bitmap scans in 8.1? > > Currently I do this: > Select * from mytable where contains(bitmask, fingerprint(user_data)) > and matches(datacolumn, user_data); > > user_data is a string, like a regexp but with different semantics for > chemical data. > bitmask is precomputed/stored as bit(1024) = fingerprint(datacolumn) > contains(a,b) returns bool as 'select b=(a&b);' > > This works well because matches() is an expensive functions. > But it would work better if bitmask could be indexed, no? You can use GiST to do that. > > TJ O'Donnell > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
On Wed, Nov 09, 2005 at 10:18:35AM -0800, TJ O'Donnell wrote: > I like the new bitmap scans and I'm wondering if there is any way > I can utilize them for my bitmask column, defined as bit(1024). > I use this column as a sort of fingerprint to quickly scan my tables. > But it is a scan, not an index. I have not figured out a way to > index the bitmask column. Is there some way it can be used as > an index now that there are bitmap scans in 8.1? Note: the fact that they are called bitmaps indexes doesn't imply anything about the types used. The fact that you have a field already as a bitmap doesn't actually help. The operation still needs to be an indexable. However, bitmaps indexes does mean that a single query can use multiple indexes. So if you can split your contains into different parts of the string, the optimozier can combine them. Whether this is more efficient, who knows... Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.