Home > mailing lists

Re: tweaking NTUP_PER_BUCKET - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: tweaking NTUP_PER_BUCKET
Date	July 8, 2014 13:35:19
Msg-id	fd0c13e59dbba297472ffd8bbe20dcd1.squirrel@sq.gransy.com Whole thread Raw
In response to	Re: tweaking NTUP_PER_BUCKET (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: tweaking NTUP_PER_BUCKET Re: tweaking NTUP_PER_BUCKET
List	pgsql-hackers

Tree view

On 8 Červenec 2014, 14:49, Robert Haas wrote:
> On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>> I propose dynamic increase of the nbuckets (up to NTUP_PER_BUCKET=1)
>> once the table is built and there's free space in work_mem. The patch
>> mentioned above makes implementing this possible / rather simple.
>
> Another idea would be to start with NTUP_PER_BUCKET=1 and then, if we
> run out of memory, increase NTUP_PER_BUCKET.  I'd like to think that
> the common case is that work_mem will be large enough that everything
> fits; and if you do it that way, then you save yourself the trouble of
> rehashing later, which as you point out might lose if there are only a
> few probes.  If it turns out that you run short of memory, you can
> merge pairs of buckets up to three times, effectively doubling
> NTUP_PER_BUCKET each time.

Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer
relations it may be way cheaper to use higher NTUP_PER_BUCKET values
instead of increasing the number of batches (resulting in repeated scans
of the outer table). I think it's important (but difficult) to keep these
things somehow balanced.

With large work_mem values the amount of memory for buckets may be quite
significant. E.g. 800MB work_mem may easily give ~120MB of memory taken by
buckets with NTUP_PER_BUCKET=1. With NTUP_PER_BUCKET=4 it's just ~30MB.

> Yet another idea is to stick with your scheme, but do the dynamic
> bucket splits lazily.  Set a flag on each bucket indicating whether or
> not it needs a lazy split.  When someone actually probes the hash
> table, if the flag is set for a particular bucket, move any tuples
> that don't belong as we scan the bucket.  If we reach the end of the
> bucket chain, clear the flag.

I don't think a lazy scheme makes much sense here. There are two clearly
separated phases - loading the table and search.

Also, it seems to me it can't work as described. Say we build a table and
then find out we need a table 4x the size. If I understand your proposal
correctly, you'd just resize buckets array, copy the existing buckets
(first 1/4 buckets) and set 'needs_split' flags. Correct?

Well, the problem is that while scanning a bucket you already need to know
which tuples belong there. But with the lazy approach, the tuples might
still be in the old (not yet split) buckets. So you'd have to search the
current bucket and all buckets that were not split yet. Or did I miss
something?

Tomas

pgsql-hackers by date:

From: Robert Haas
Date: 08 July 2014, 12:49:56
Subject: Re: tweaking NTUP_PER_BUCKET

From: Michael Paquier
Date: 08 July 2014, 13:58:56
Subject: Re: Doing better at HINTing an appropriate column within errorMissingColumn()

Re: tweaking NTUP_PER_BUCKET - Mailing list pgsql-hackers

Previous

Next