On Wed, Jun 26, 2013 at 7:12 PM, Stephen Frost <sfrost@snowman.net> wrote:
> Atri,
>
> * Atri Sharma (atri.jiit@gmail.com) wrote:
>> I just popped in here on Simon's advice to put an idea I had about
>> optimizing hash joins on this thread.
>
> I'd encourage reading the thread a bit first, in the future.. :)
>
Yeah, I actually read a bit (admittedly, not much) of the above thread. I
was following it a bit as well.

> I suggested this up-thread already, but it's not really a bloom filter
> as there's only one hash function available- I can't see us requiring
> every data type to provide multiple hash functions. Still, I do think
> breaking the single 32-bit hash key space up into fixed-sized chunks and
> then having a bitfield array which we test against (very similar to how
> the visibility map works) to see if there's any chance that a given hash
> key exists might be valuable. The problem is that, because we don't
> have multiple hash functions, it's not clear how much "empty" space we'd
> actually end up with.
Agreed.
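
Just to check that I follow, here is a rough sketch of the chunked
bitfield as I understand it. Everything in it is my own assumption for
illustration: the names (HashChunkMap, chunkmap_*), the choice of using
the high bits of the hash as the chunk number, and the power-of-two
chunk count. None of it is existing PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch; not existing PostgreSQL code. */
typedef struct HashChunkMap
{
    uint32_t shift;     /* hash >> shift yields the chunk number */
    uint8_t *bits;      /* one bit per chunk, all zero initially */
} HashChunkMap;

/* Split the 32-bit hash space into 2^log2_chunks equal-sized chunks. */
static HashChunkMap *
chunkmap_create(uint32_t log2_chunks)
{
    HashChunkMap *map;

    /* Bounds are arbitrary; they keep the bit array between 1 byte and 2 MB. */
    assert(log2_chunks >= 3 && log2_chunks <= 24);

    map = malloc(sizeof(HashChunkMap));
    if (map == NULL)
        return NULL;
    map->shift = 32 - log2_chunks;
    map->bits = calloc((size_t) 1 << (log2_chunks - 3), 1);
    if (map->bits == NULL)
    {
        free(map);
        return NULL;
    }
    return map;
}

/* Record that some inner-side tuple has this hash value. */
static void
chunkmap_set(HashChunkMap *map, uint32_t hash)
{
    uint32_t chunk = hash >> map->shift;

    map->bits[chunk >> 3] |= (uint8_t) (1u << (chunk & 7));
}

/* false means "definitely absent"; true only means "might be present". */
static bool
chunkmap_test(const HashChunkMap *map, uint32_t hash)
{
    uint32_t chunk = hash >> map->shift;

    return (map->bits[chunk >> 3] & (1u << (chunk & 7))) != 0;
}
```

With 2^10 chunks, each bit covers 2^22 hash values, so any two keys
whose hashes land in the same chunk are indistinguishable to the test.
How many bits stay clear then depends entirely on how the single hash
function spreads the inner side's keys, which is the "empty" space
question above.
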
> We could have a bitfield filter (as I described above) created for each
> bucket and then test against that before considering if we actually have
> to go look in that bucket, yes. I'm not sure if that's quite what you
> were thinking, but I can see how a bitfield per bucket might work. If
> you were suggesting something else, please clarify.
Yeah, this is what I wanted.
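
A minimal sketch of the per-bucket variant, to have something concrete
to discuss. The assumptions here are mine: a power-of-two number of
buckets, the low hash bits picking the bucket, the next six bits
picking one bit in a 64-bit word per bucket, and all of the names
(BucketFilters, bucketfilters_*), which are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch; not existing PostgreSQL code. */
typedef struct BucketFilters
{
    uint32_t  log2_nbuckets;
    uint32_t  nbuckets;     /* always 1 << log2_nbuckets */
    uint64_t *filter;       /* one 64-bit bitfield per bucket */
} BucketFilters;

static BucketFilters *
bucketfilters_create(uint32_t log2_nbuckets)
{
    BucketFilters *bf;

    /* Arbitrary bounds for the sketch. */
    assert(log2_nbuckets >= 1 && log2_nbuckets <= 20);

    bf = malloc(sizeof(BucketFilters));
    if (bf == NULL)
        return NULL;
    bf->log2_nbuckets = log2_nbuckets;
    bf->nbuckets = 1u << log2_nbuckets;
    bf->filter = calloc(bf->nbuckets, sizeof(uint64_t));
    if (bf->filter == NULL)
    {
        free(bf);
        return NULL;
    }
    return bf;
}

/* Low bits of the hash select the bucket. */
static inline uint32_t
bucket_no(const BucketFilters *bf, uint32_t hash)
{
    return hash & (bf->nbuckets - 1);
}

/* The six bits above the bucket bits select one bit in the bucket's word. */
static inline uint64_t
filter_bit(const BucketFilters *bf, uint32_t hash)
{
    return UINT64_C(1) << ((hash >> bf->log2_nbuckets) & 63);
}

/* Called when an inner tuple is inserted into the hash table. */
static void
bucketfilters_add(BucketFilters *bf, uint32_t hash)
{
    bf->filter[bucket_no(bf, hash)] |= filter_bit(bf, hash);
}

/* Called for an outer tuple; false means the bucket scan can be skipped. */
static bool
bucketfilters_might_contain(const BucketFilters *bf, uint32_t hash)
{
    return (bf->filter[bucket_no(bf, hash)] & filter_bit(bf, hash)) != 0;
}
```

With 16 buckets, adding hash 0x123 sets bit 18 in bucket 3. Probing
0x133 (bucket 3, bit 19) skips the bucket, while 0x523 (bucket 3, bit
18 again) is a false positive and still has to walk the bucket. Whether
the extra 8 bytes per bucket pay off would need measuring.
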

My point is that I would like to help with the implementation, if possible. :)

Regards,
Atri
--
Regards,
Atri
l'apprenant