Re: Bitmap indexes? - Mailing list pgsql-hackers

From Greg Copeland
Subject Re: Bitmap indexes?
Date
Msg-id 1016578854.14670.450.camel@mouse.copelandconsulting.net
Whole thread Raw
In response to Re: Bitmap indexes?  (Matthew Kirkwood <matthew@hairy.beasts.org>)
List pgsql-hackers
On Tue, 2002-03-19 at 15:30, Matthew Kirkwood wrote:
> On Tue, 19 Mar 2002, Oleg Bartunov wrote:
>
> Sorry to reply over you, Oleg.
>
> > On 13 Mar 2002, Greg Copeland wrote:
> >
> > > One of the reasons why I originally stated following the hackers list is
> > > because I wanted to implement bitmap indexes.  I found in the archives,
> > > the follow link, http://www.it.iitb.ernet.in/~rvijay/dbms/proj/, which
> > > was extracted from this,
> > >
http://groups.google.com/groups?hl=en&threadm=01C0EF67.5105D2E0.mascarm%40mascari.com&rnum=1&prev=/groups%3Fq%3Dbitmap%2Bindex%2Bgroup:comp.databases.postgresql.hackers%26hl%3Den%26selm%3D01C0EF67.5105D2E0.mascarm%2540mascari.com%26rnum%3D1,
archivethread. 
>
> For every case I have used a bitmap index on Oracle, a
> partial index[0] made more sense (especialy since it
> could usefully be compound).

That's very true, however, often bitmap indexes are used where partial
indexes may not work well.  It maybe you were trying to apply the cure
for the wrong disease.  ;)

>
> Our troublesome case (on Oracle) is a table of "events"
> where maybe fifty to a couple of hundred are "published"
> (ie. web-visible) at any time.  The events are categorised
> by sport (about a dozen) and by "event type" (about five).
> We never really query events except by PK or by sport/type/
> published.

The reason why bitmap indexes are primarily used for DSS and data
wherehousing applications is because they are best used on extremely
large to very large tables which have low cardinality (e.g, 10,000,000
rows having 200 distinct values).  On top of that, bitmap indexes also
tend to be much smaller than their "standard" cousins.  On large and
very tables tables, this can sometimes save gigs in index space alone
(serious space benefit).  Plus, their small index size tends to result
in much less I/O (serious speed benefit).  This, of course, can result
in several orders of magnitude speed improvements when index scans are
required.  As an added bonus, using AND, OR, XOR and NOT predicates are
exceptionally fast and if implemented properly, can even take advantage
of some 64-bit hardware for further speed improvements.  This, of
course, further speeds look ups.  The primary down side is that inserts
and updates to bitmap indexes are very costly (comparatively) which is,
yet again, why they excel in read-only environments (DSS & data
wherehousing).

It should also be noted that RDMS's, such as Oracle, often use multiple
types of bitmap indexes.  This further impedes insert/update
performance, however, the additional bitmap index types usually allow
for range predicates while still making use of the bitmap index.  If I'm
not mistaken, several other types of bitmaps are available as well as
many ways to encode and compress (rle, quad compression, etc) bitmap
indexes which further save on an already compact indexing scheme.

Given the proper problem domain, index bitmaps can be a big win.

>
> We make a bitmap index on "published", and trust Oracle to
> use it correctly, and hope that our other indexes are also
> useful.
>
> On Postgres[1] we would make a partial compound index:
>
> create index ... on events(sport_id,event_type_id)
> where published='Y';


Generally speaking, bitmap indexes will not serve you very will on
tables having a low row counts, high cardinality or where they are
attached to tables which are primarily used in an OLTP capacity.
Situations where you have a low row count and low cardinality or high
row count and high cardinality tend to be better addressed by partial
indexes; which seem to make much more sense.  In your example, it sounds
like you did "the right thing"(tm).  ;)


Greg


pgsql-hackers by date:

Previous
From: "Rod Taylor"
Date:
Subject: Re: Domains and type coercion
Next
From: Tom Lane
Date:
Subject: Re: libpq: fe_getauthname()