Re: tweaking NTUP_PER_BUCKET - Mailing list pgsql-hackers

From: Stephen Frost
Subject: Re: tweaking NTUP_PER_BUCKET
Date:
Msg-id: 20140706044747.GU16422@tamriel.snowman.net
In response to: Re: tweaking NTUP_PER_BUCKET (Greg Stark <stark@mit.edu>)
Responses: Re: tweaking NTUP_PER_BUCKET (Tomas Vondra <tv@fuzzy.cz>)
List: pgsql-hackers
* Greg Stark (stark@mit.edu) wrote:
> On Thu, Jul 3, 2014 at 11:40 AM, Atri Sharma <atri.jiit@gmail.com> wrote:
> > IIRC, last time when we tried doing bloom filters, I was short of some real
> > world useful hash functions that we could use for building the bloom filter.
>
> Last time, we wanted to use bloom filters in hash joins to filter
> out tuples that won't match any of the future hash batches, to reduce
> the number of tuples that need to be spilled to disk.  However, the
> problem was that it was unclear, for a given amount of memory usage,
> how to pick the right bloom filter size and how to model how much it
> would save versus how much it would cost in reduced hash table size.

Right.  There's really only one hash function available (we don't
currently support multiple hash functions), unless we want to try
re-hashing the 32-bit hash value that we get back (I'm not against
trying that, but it isn't what I'd start with), and it would
hopefully be sufficient for this.
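
To sketch what that re-hashing might look like (purely illustrative,
not proposed code): the standard double-hashing trick can stretch the
one 32-bit hashvalue we already compute into k bloom filter probes,
with the second hash derived by a cheap mixing step.  The function
names and constants below are my own, not anything in the tree:

#include <stdbool.h>
#include <stdint.h>

/* Cheap secondary hash: a 32-bit mixing finalizer. */
static uint32_t
rehash32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6b;
    h ^= h >> 13;
    h *= 0xc2b2ae35;
    h ^= h >> 16;
    return h;
}

/* Set the k bits for one hashvalue in an nbits-bit filter. */
static void
bloom_add(uint8_t *filter, uint32_t nbits, uint32_t hashvalue, int k)
{
    uint32_t    h1 = hashvalue;
    uint32_t    h2 = rehash32(hashvalue) | 1;   /* force odd stride */
    int         i;

    for (i = 0; i < k; i++)
    {
        uint32_t    bit = (h1 + (uint32_t) i * h2) % nbits;

        filter[bit / 8] |= (uint8_t) (1 << (bit % 8));
    }
}

/* Test membership: false positives possible, false negatives not. */
static bool
bloom_maybe_contains(const uint8_t *filter, uint32_t nbits,
                     uint32_t hashvalue, int k)
{
    uint32_t    h1 = hashvalue;
    uint32_t    h2 = rehash32(hashvalue) | 1;
    int         i;

    for (i = 0; i < k; i++)
    {
        uint32_t    bit = (h1 + (uint32_t) i * h2) % nbits;

        if (!(filter[bit / 8] & (1 << (bit % 8))))
            return false;       /* definitely not in the filter */
    }
    return true;                /* possibly in the filter */
}

The idea, as I understand it, would be to bloom_add() every inner
tuple headed for a future batch, and then drop outer tuples that fail
bloom_maybe_contains() instead of spilling them.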

> I think it just required some good empirical tests and hash-join-heavy
> workloads to come up with some reasonable guesses.  We don't need a
> perfect model, just some reasonable bloom filter size that we're
> pretty sure will usually help more than it hurts.

This would help out a lot of things, really.  Perhaps the test cases
Tomas is developing would help here also.
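
As for picking the size, the textbook bloom filter formulas at least
give a starting point for those guesses.  A rough sketch (this is
just the standard model, nothing measured):

#include <math.h>
#include <stdint.h>

/*
 * For n distinct keys and a target false-positive rate p:
 *     bits    m = -n * ln(p) / (ln 2)^2
 *     probes  k = (m / n) * ln 2
 * E.g. n = 1e6 keys at p = 0.01 works out to ~9.6 bits per key
 * (~1.2 MB) and k = 7 probes.
 */
static void
bloom_size_for(double n, double p, uint64_t *nbits, int *nprobes)
{
    double      m = -n * log(p) / (M_LN2 * M_LN2);

    *nbits = (uint64_t) ceil(m);
    *nprobes = (int) round((m / n) * M_LN2);
}

The hard question is still the one above, of course: whether those
bits would have bought us more as hash table.  But at least the
filter side of the trade-off is easy to compute.
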
Thanks,
    Stephen
