Re: Bitmap indexes? - Mailing list pgsql-hackers
From | Greg Copeland |
---|---|
Subject | Re: Bitmap indexes? |
Date | |
Msg-id | 1016578854.14670.450.camel@mouse.copelandconsulting.net Whole thread Raw |
In response to | Re: Bitmap indexes? (Matthew Kirkwood <matthew@hairy.beasts.org>) |
List | pgsql-hackers |
On Tue, 2002-03-19 at 15:30, Matthew Kirkwood wrote: > On Tue, 19 Mar 2002, Oleg Bartunov wrote: > > Sorry to reply over you, Oleg. > > > On 13 Mar 2002, Greg Copeland wrote: > > > > > One of the reasons why I originally stated following the hackers list is > > > because I wanted to implement bitmap indexes. I found in the archives, > > > the follow link, http://www.it.iitb.ernet.in/~rvijay/dbms/proj/, which > > > was extracted from this, > > > http://groups.google.com/groups?hl=en&threadm=01C0EF67.5105D2E0.mascarm%40mascari.com&rnum=1&prev=/groups%3Fq%3Dbitmap%2Bindex%2Bgroup:comp.databases.postgresql.hackers%26hl%3Den%26selm%3D01C0EF67.5105D2E0.mascarm%2540mascari.com%26rnum%3D1, archivethread. > > For every case I have used a bitmap index on Oracle, a > partial index[0] made more sense (especialy since it > could usefully be compound). That's very true, however, often bitmap indexes are used where partial indexes may not work well. It maybe you were trying to apply the cure for the wrong disease. ;) > > Our troublesome case (on Oracle) is a table of "events" > where maybe fifty to a couple of hundred are "published" > (ie. web-visible) at any time. The events are categorised > by sport (about a dozen) and by "event type" (about five). > We never really query events except by PK or by sport/type/ > published. The reason why bitmap indexes are primarily used for DSS and data wherehousing applications is because they are best used on extremely large to very large tables which have low cardinality (e.g, 10,000,000 rows having 200 distinct values). On top of that, bitmap indexes also tend to be much smaller than their "standard" cousins. On large and very tables tables, this can sometimes save gigs in index space alone (serious space benefit). Plus, their small index size tends to result in much less I/O (serious speed benefit). This, of course, can result in several orders of magnitude speed improvements when index scans are required. As an added bonus, using AND, OR, XOR and NOT predicates are exceptionally fast and if implemented properly, can even take advantage of some 64-bit hardware for further speed improvements. This, of course, further speeds look ups. The primary down side is that inserts and updates to bitmap indexes are very costly (comparatively) which is, yet again, why they excel in read-only environments (DSS & data wherehousing). It should also be noted that RDMS's, such as Oracle, often use multiple types of bitmap indexes. This further impedes insert/update performance, however, the additional bitmap index types usually allow for range predicates while still making use of the bitmap index. If I'm not mistaken, several other types of bitmaps are available as well as many ways to encode and compress (rle, quad compression, etc) bitmap indexes which further save on an already compact indexing scheme. Given the proper problem domain, index bitmaps can be a big win. > > We make a bitmap index on "published", and trust Oracle to > use it correctly, and hope that our other indexes are also > useful. > > On Postgres[1] we would make a partial compound index: > > create index ... on events(sport_id,event_type_id) > where published='Y'; Generally speaking, bitmap indexes will not serve you very will on tables having a low row counts, high cardinality or where they are attached to tables which are primarily used in an OLTP capacity. Situations where you have a low row count and low cardinality or high row count and high cardinality tend to be better addressed by partial indexes; which seem to make much more sense. In your example, it sounds like you did "the right thing"(tm). ;) Greg
pgsql-hackers by date: