Thread: potential deadlock in parallel hashjoin grow-buckets-barrier and blocking nodes?
From: Luc Vlaming
Date:
Hi,

Whilst trying to debug a deadlock in a TPC-DS query, I noticed something in the hashjoin implementation that could potentially cause deadlocks (if my analysis is right).

While the inner hash table is being built (the PHJ_BUILD_HASHING_INNER phase), the grow barriers stay attached the whole time. Usually this is not a problem. However, suppose one of the nodes blocks somewhere further down in the plan while trying to fill the inner hash table, and the others are meanwhile trying to, e.g., extend the number of buckets using ExecParallelHashIncreaseNumBuckets. In that case they would all wait until the blocked process comes back to the hashjoin node and also joins the effort.

Wouldn't this give potential deadlock situations? Or why would a worker that is hashing the inner be required to come back and join the effort in growing the hash buckets?

With very skewed workloads (one node providing all data) I was at least able to get, e.g., 3 out of 4 workers waiting in ExecParallelHashIncreaseNumBuckets, whilst one was in ExecProcNode(outernode). I tried to detach and reattach the barrier, but this proved to be a bad idea :)

Regards,
Luc
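
To make the scenario above concrete, here is a toy model of it in Python. It is only a sketch and not PostgreSQL code: `threading.Barrier` stands in for the grow barrier, and the function name `simulate_grow_barrier`, the outcome strings, and the timeout are all made up for illustration. The timeout exists only so that the demonstration terminates; in the situation described above the waiting workers would simply keep waiting.

```python
import threading


def simulate_grow_barrier(n_workers=4, n_blocked=1, timeout=0.5):
    """Toy model: every worker is a party of one barrier, but some never arrive.

    Returns a dict mapping worker id to what happened to that worker.
    All names here are hypothetical and chosen for illustration only.
    """
    barrier = threading.Barrier(n_workers)  # stand-in for the grow barrier
    release = threading.Event()             # unblocks the stuck workers at the end
    outcomes = {}
    lock = threading.Lock()

    def worker(worker_id, blocked):
        if blocked:
            # Stand-in for a worker stuck further down in the plan:
            # it is counted by the barrier but never reaches it.
            release.wait()
            outcome = "blocked below join"
        else:
            try:
                # Stand-in for waiting for all attached workers to help grow buckets.
                barrier.wait(timeout)
                outcome = "passed barrier"
            except threading.BrokenBarrierError:
                # Raised once the timeout expires; without a timeout
                # this worker would wait indefinitely.
                outcome = "stuck at barrier"
        with lock:
            outcomes[worker_id] = outcome

    blocked_threads, waiting_threads = [], []
    for i in range(n_workers):
        is_blocked = i < n_blocked
        t = threading.Thread(target=worker, args=(i, is_blocked))
        (blocked_threads if is_blocked else waiting_threads).append(t)
        t.start()

    for t in waiting_threads:
        t.join()
    release.set()  # let the simulated blocked workers finish so the demo exits
    for t in blocked_threads:
        t.join()
    return outcomes


if __name__ == "__main__":
    for worker_id, outcome in sorted(simulate_grow_barrier().items()):
        print(worker_id, outcome)
```

With the defaults, one worker never reaches the barrier and the other three cannot get past it, which mirrors the 3-out-of-4 observation above. Calling `simulate_grow_barrier(n_blocked=0)` lets all four through.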