On Thu, Jul 9, 2020 at 8:17 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> Last week as I was working on adaptive hash join [1] and trying to get
> parallel adaptive hash join batch 0 to spill correctly, I noticed what
> seemed like a problem with the code to repartition batch 0.
>
> If we run out of space while inserting tuples into the hashtable during
> the build phase of parallel hash join and proceed to increase the number
> of batches, we need to repartition all of the tuples from the old
> generation (when nbatch was x) and move them to their new homes in the
> new generation (when nbatch is 2x). Before we do this repartitioning we
> disable growth in the number of batches.
>
> Then we repartition the tuples from the hashtable, inserting them either
> back into the hashtable or into a batch file. While inserting them into
> the hashtable, we call ExecParallelHashTupleAlloc(), and, if there is no
> space for the current tuple in the current chunk and growth in the
> number of batches is disabled, we go ahead and allocate a new chunk of
> memory -- regardless of whether or not we will exceed the space limit.
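
If I'm reading that right, the behaviour you're describing boils down
to something like the following toy model (all names invented for
illustration -- this is not the real ExecParallelHashTupleAlloc()
code): once batch growth has been disabled, the space check is simply
skipped, so new chunks keep being allocated past the limit.

#include <stdbool.h>
#include <stdio.h>

/* Toy model only -- illustrative names, not nodeHash.c. */
typedef enum
{
    GROWTH_OK,
    GROWTH_DISABLED
} growth_t;

typedef struct
{
    size_t      size;           /* memory charged to batch 0 so far */
    size_t      space_allowed;  /* per-participant space limit */
    growth_t    growth;         /* may nbatch still be increased? */
} toy_hashtable;

/* Returns true if a new chunk may be allocated for batch 0. */
static bool
toy_chunk_alloc(toy_hashtable *ht, size_t chunk_size)
{
    /* The limit is only enforced while growth is still enabled... */
    if (ht->growth != GROWTH_DISABLED &&
        ht->size + chunk_size > ht->space_allowed)
        return false;           /* caller would go and double nbatch */

    /* ...so with growth disabled we charge the chunk regardless. */
    ht->size += chunk_size;
    return true;
}

int
main(void)
{
    toy_hashtable ht = {900, 1000, GROWTH_DISABLED};

    /* Exceeds the 1000-byte limit, but succeeds anyway. */
    printf("allocated: %d, size now %zu\n",
           (int) toy_chunk_alloc(&ht, 200), ht.size);
    return 0;
}
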
Hmm. It shouldn't really be possible for
ExecParallelHashRepartitionFirst() to run out of memory anyway,
considering that the input of that operation previously fit: we
started repartitioning because one more chunk would have pushed us
over the edge, but the tuples inserted so far did fit, and we reinsert
them in the same order for each input chunk, possibly filtering some
of them out into batch files.

Perhaps you reached this condition because batches[0].shared->size
ends up accounting for the memory used by the bucket array in
PHJ_GROW_BUCKETS_ELECTING, but didn't originally account for it in
generation 0, so what previously appeared to fit no longer does :-(.
I'll look into that.
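
To spell out the asymmetry I have in mind (made-up numbers, and purely
hypothetical until I've checked the real accounting):

#include <stdio.h>

int
main(void)
{
    size_t space_allowed = 1000;
    size_t tuple_bytes = 950;          /* fit during the initial build */
    size_t bucket_array_bytes = 100;   /* charged only after bucket growth */

    /* Generation 0: the bucket array was never added to ->size. */
    size_t size_gen0 = tuple_bytes;

    /* After PHJ_GROW_BUCKETS_ELECTING the same data is charged more. */
    size_t size_after = tuple_bytes + bucket_array_bytes;

    printf("generation 0 fits:        %d\n",
           (int) (size_gen0 <= space_allowed));
    printf("after bucket growth fits: %d\n",
           (int) (size_after <= space_allowed));
    return 0;
}

If that's what is happening, the input to repartitioning only "grew"
on paper, not in reality.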