Re: tweaking NTUP_PER_BUCKET - Mailing list pgsql-hackers

From	Greg Stark
Subject	Re: tweaking NTUP_PER_BUCKET
Date	July 3, 2014 18:52:31
Msg-id	CAM-w4HPt83UWMmZQG3N+DchW97N5_FEYO-QrJ4zw38_-taTr1A@mail.gmail.com
In response to	Re: tweaking NTUP_PER_BUCKET (Atri Sharma <atri.jiit@gmail.com>)
Responses	Re: tweaking NTUP_PER_BUCKET
List	pgsql-hackers

On Thu, Jul 3, 2014 at 11:40 AM, Atri Sharma <atri.jiit@gmail.com> wrote:
> IIRC, last time when we tried doing bloom filters, I was short of some real
> world useful hash functions that we could use for building the bloom filter.

Last time, we wanted to use bloom filters in hash joins to filter out
tuples that won't match any of the future hash batches, to reduce the
number of tuples that need to be spilled to disk. However, the problem
was that it was unclear, for a given amount of memory, how to pick the
right bloom filter size and how to model how much it would save versus
how much it would cost in reduced hash table size.

I think it just required some good empirical tests and hash-join-heavy
workloads to come up with some reasonable guesses. We don't need a
perfect model, just some reasonable bloom filter size that we're pretty
sure will usually help more than it hurts.
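
To make the sizing question concrete, here is a rough back-of-envelope
sketch using the textbook bloom filter false-positive approximation. It
is not anything from the PostgreSQL source. The function names, the
memory budget, and the candidate fractions are all made up for
illustration, and it only covers the "how much would it save" side (the
share of non-matching tuples that would still slip through), not the
cost of a smaller hash table.

```python
import math


def optimal_k(m_bits, n_items):
    """Number of hash functions minimizing false positives: (m/n) * ln 2."""
    return max(1, round(m_bits / n_items * math.log(2)))


def false_positive_rate(m_bits, n_items, k=None):
    """Textbook approximation: (1 - e^(-k*n/m))^k."""
    if k is None:
        k = optimal_k(m_bits, n_items)
    return (1.0 - math.exp(-k * n_items / m_bits)) ** k


def sweep(budget_bytes, n_inner_tuples, fractions=(0.01, 0.05, 0.10, 0.25)):
    """For each hypothetical share of the memory budget handed to the
    filter, report (fraction, filter bits, estimated false-positive rate)."""
    rows = []
    for f in fractions:
        m_bits = int(budget_bytes * f * 8)
        rows.append((f, m_bits, false_positive_rate(m_bits, n_inner_tuples)))
    return rows


if __name__ == "__main__":
    # Hypothetical numbers: 1 MB of memory, 100k tuples on the inner side.
    for f, m_bits, fpr in sweep(1024 * 1024, 100_000):
        print(f"{f:>5.0%} of budget  {m_bits:>9} bits  est. FPR {fpr:.4f}")
```

A sweep like this only gives a starting range of sizes worth trying. The
actual break-even point would still have to come from measurements on
real workloads.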

-- 
greg
