Explanation for bug #13908: hash joins are badly broken - Mailing list pgsql-hackers

From Tom Lane
Subject Explanation for bug #13908: hash joins are badly broken
Date
Msg-id 7034.1454722453@sss.pgh.pa.us
Whole thread Raw
Responses Re: Explanation for bug #13908: hash joins are badly broken
Re: Explanation for bug #13908: hash joins are badly broken
List pgsql-hackers
I have sussed what's happening in bug #13908.  Basically, commit
45f6240a8fa9d355 ("Pack tuples in a hash join batch densely, to save
memory") broke things for the case where a hash join is using a skew
table.  The reason is that that commit only changed the storage of tuples
going into the main hash table; tuples going into the skew table are
still allocated with a palloc apiece, without being put into the "chunk"
storage.  Now, if we're loading the hash table and we find that we've
exceeded the storage allowed for skew tuples, ExecHashRemoveNextSkewBucket
wants to push some skew tuples back into the main hash table; and it
believes that linking such tuples into the appropriate main hashbucket
chain is sufficient for that.  Which it was before the aforesaid commit,
and still is in simple cases.  However, if we later try to do
ExecHashIncreaseNumBatches, that function contains new code that assumes
that it can find all tuples in the main hashtable by scanning the "chunk"
storage directly.  Thus, the pushed-back tuples are not scanned and are
neither re-entered into the hash table nor dumped into a batch file.
So they won't get joined.

It looks like ExecHashIncreaseNumBuckets, if it were to run after some
executions of ExecHashRemoveNextSkewBucket, would break things in the same
way.  That's not what's happening in this particular test case, though.

I'm of the opinion that this is a stop-ship bug for 9.5.1.  Barring
somebody producing a fix over the weekend, I will deal with it by
reverting the aforementioned commit.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: proposal: make NOTIFY list de-duplication optional
Next
From: Noah Misch
Date:
Subject: Re: LLVM Address Sanitizer (ASAN) and valgrind support