Home > mailing lists

Re: bad estimation together with large work_mem generates terrible slow hash joins - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: bad estimation together with large work_mem generates terrible slow hash joins
Date	June 26, 2014 21:44:07
Msg-id	53AC69EF.2000607@fuzzy.cz Whole thread Raw
In response to	Re: bad estimation together with largework_mem generates terrible slow hash joins (Tomas Vondra <tv@fuzzy.cz>)
Responses	Re: bad estimation together with large work_mem generates terrible slow hash joins (Tomas Vondra <tv@fuzzy.cz>)
List	pgsql-hackers

Tree view

Attached is v2 of the patch, with some cleanups / minor improvements:

* improved comments, whitespace fixed / TODOs etc.

* tracking inital # of buckets (similar to initial # of batches)

* adding info about buckets to EXPLAIN ANALYZE, similar to batches - I
didn't want to make it overly complex, so the info about initial
bucket/batch count is added if at least one them is modified

* modified threshold triggering the growth, to get NTUP_PER_BUCKET on
average (see the NTUP_GROW_THRESHOLD comment nodeHash.c)

* there's a single FIXME, related to counting tuples in the

One thing that's important to note is the difference between # of
batches and # of buckets. While one # of batches is "global" the # of
buckets is 'within a batch'. So theoretically each batch can use
different number of buckets.

However the value is reused between batches, so it only grows. That
means this is possible:

  initial: 1024 buckets (before 1st batch)
  batch 1: 1024 buckets
  batch 2: 1024 buckets
  batch 3: 4096 buckets
  batch 4: 8192 buckets

while this is not:

  initial: 1024 buckets (before 1st batch)
  batch 1: 1024 buckets
  batch 2: 4096 buckets
  batch 3: 1024 buckets
  batch 4: 8192 buckets

However in practice I expect the first batch will to do all the work,
and the following batches will just reuse the same number of buckets.
This of course assumes the batches have similar tuple sizes etc.

So the first batch will do all the reshuffling the tables, and the
following batches will reuse the 'right' number of buckets from the start.

regards
Tomas

Attachment

hashjoin-nbuckets-growth-v2.patch

pgsql-hackers by date:

From: Robert Haas
Date: 26 June 2014, 21:37:34
Subject: Re: better atomics - v0.5

From: Tom Lane
Date: 26 June 2014, 21:47:19
Subject: Re: better atomics - v0.5

Re: bad estimation together with large work_mem generates terrible slow hash joins - Mailing list pgsql-hackers

Attachment

Previous

Next