Re: Bloom Filter lookup for hash joins - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Bloom Filter lookup for hash joins
Date
Msg-id CAFj8pRBscxO58me4zWXd=ZJxzk9Taf+w+L_0-Ckm1ZcSRmGnEA@mail.gmail.com
Whole thread Raw
In response to Bloom Filter lookup for hash joins  (Atri Sharma <atri.jiit@gmail.com>)
Responses Re: Bloom Filter lookup for hash joins
List pgsql-hackers
Hello

it looks interesting - please, try to send some performance results,

Regards

Pavel

2013/6/26 Atri Sharma <atri.jiit@gmail.com>:
> Hi All,
>
> I have been researching bloom filters and discussed it on IRC with
> RhodiumToad and David Fetter, and they pointed me to the various
> places that could potentially have bloom filters, apart from the
> places that already have them currently.
>
> I have been reading the current implementation of hash joins, and in
> ExecScanHashBucket, which I understand is the actual lookup function,
> we could potentially look at a bloom filter per bucket. Instead of
> actually looking up each hash value for the outer relation, we could
> just check the corresponding bloom filter for that bucket, and if we
> get a positive, then lookup the actual values i.e. continue with our
> current behaviour (since we could be looking at a false positive).
>
> This doesn't involve a lot of new logic. We need to add a bloom filter
> in HashJoinState and set it when the hash table is being made. Also,
> one thing we need to look at is the set of hash functions being used
> for the bloom filters. This can be a point of further discussion.
>
> The major potential benefit we could gain is that bloom filters never
> give false negatives. So, we could save a lot of lookup where the
> corresponding bloom filter gives a negative.
>
> This approach can also be particularly useful for hash anti joins,
> where we look for negatives. Since bloom filters can easily be stored
> and processed, by straight bit operations, we could be looking at a
> lot of saving here.
>
> I am not sure if we already do something like this. Please direct me
> to the relevant parts in the code if we already do.
>
> Regards,
>
> Atri
>
> --
> Regards,
>
> Atri
> l'apprenant
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Next
From: Albe Laurenz
Date:
Subject: Re: Kudos for Reviewers -- straw poll