Re: State of the on-disk bitmap index - Mailing list pgsql-hackers

From Daniel Bausch
Subject Re: State of the on-disk bitmap index
Date
Msg-id 50471B32.6000206@dvs.tu-darmstadt.de
Whole thread Raw
In response to Re: State of the on-disk bitmap index  ("Albe Laurenz" <laurenz.albe@wien.gv.at>)
Responses Re: State of the on-disk bitmap index  (Gianni Ciolli <gianni.ciolli@2ndquadrant.it>)
List pgsql-hackers
Hi Albe and the list,

>> I am going to implement a simple kind of "encoded bitmap indexes" (EBI).
>>
>> I thought, it could be a good idea to base my work on the long proposed
>> on-disk bitmap index implementation.  Regarding to the wiki, you,
>> Jonah and Simon, were the last devs that touched this thing.  Unfortunately
>> I could not find the patch representing your state of that work.  I
>> could only capture the development history up to Gianni Ciolli & Gabriele
>> Bartolini from the old pgsql-patches archives.  Other people involved
>> were Jie Zhang, Gavin Sherry, Heikki Linnakangas, and Leonardo F.  Are
>> you and the others still interested in getting this into PG?  A rebase
>> of the most current bitmap index implementation onto master HEAD will
>> be the first 'byproduct' that I am going to deliver back to you.
>>
>> 1. Is anyone working on this currently?
>> 2. Who has got the most current source code?
>> 3. Is there a git of that or will I need to reconstruct the history
>> from
>> the patches I collected?
>
> It seems like you did not get any answers from any of the
> people you mentioned ...
>
> The latest version of the patch I found is
> http://archives.postgresql.org/pgsql-patches/2006-12/msg00015.php
> So that's probably the best you can get.
>
> I want to encourage you to work on this.

Yes I do.  Thank you for your support.

I used the (more recent) patches posted by Gianni Ciolli in 2008 and
currently am in the process of porting those to master HEAD -- like I
promised.  I will post the ported patches when I get them to compile and
the index seems to work (somehow).

Nevertheless, I am still interested in what Simon, Jonah, and Leonardo
did after that point in time.  So if someone knows details (code) about
their solutions to, for example, the VACUUM problems, please mail back.

> You'd have to come up with a sound concept and discuss it on this
> list, and it would be helpful to have some draft patch for
> git master that can be used as a basis for discussion.
>
> Expect to meet some resistance.  Nobody will want the extra
> code and complexity unless you can show suffitient benefits.

If noone wants that, it would be sad.  However, I will at least do all
the work required to run benchmark queries against it.  Nevertheless, I
appreciate any help.

Indeed, the patch is a big one and the approach seems a bit hacky at
some places.  I also suspect that the compression approach could be
improved/replaced by something that is more efficient compression wise.
However, I never could come up with an own solution that complete in the
time available for my current project.

> Another criticism I can imagine is that PostgreSQL already
> supports a bitmap index scan of b-tree indexes, so you would
> have to show that on-disk bitmap indexes outperform that
> in realistic scenarios.  This has probably become more
> difficult with the recently introduced index-only scan
> for b-tree indexes, which is particularly helpful in
> data warehouse scenarios.

IIRC, it was already shown that bitmap indexes can speed up TPC-H
queries.  I will compare B+-tree, bitmap, and encoded bitmap indexes.

> So you'd have to run some performance tests against a draft
> implementation to get people convinced that it is worth the
> effort.  Supporting index-only scans Would probably give
> you an edge.

Yes I will, because I am going to write about that.

Kind regards,
Daniel

--
Daniel Bausch
Wissenschaftlicher Mitarbeiter
Technische Universität Darmstadt
Fachbereich Informatik
Fachgebiet Datenbanken und Verteilte Systeme

Hochschulstraße 10
64289 Darmstadt
Germany

Tel.: +49 6151 16 6706
Fax:  +49 6151 16 6229



pgsql-hackers by date:

Previous
From: Thom Brown
Date:
Subject: Re: 9.2rc1 produces incorrect results
Next
From: Pavel Stehule
Date:
Subject: fixing variadic parameters for type "ANY"