potential deadlock in parallel hashjoin grow-buckets-barrier and blocking nodes? - Mailing list pgsql-hackers

From	Luc Vlaming
Subject	potential deadlock in parallel hashjoin grow-buckets-barrier and blocking nodes?
Date	April 13, 2021 13:34:07
Msg-id	3ddf4eab-460d-3cb7-9577-8a4e8f30954d@swarm64.com
List	pgsql-hackers

Hi,

Whilst trying to debug a deadlock in a TPC-DS query I noticed
something in the hashjoin implementation that could potentially
cause deadlocks (if my analysis is right).

During the whole of the PHJ_BUILD_HASHING_INNER phase, i.e. whilst the
inner hash table is being built, every participant stays attached to
the grow barriers.
Usually this is not a problem. However, suppose one of the processes
blocks somewhere further down in the plan whilst fetching tuples to
fill the inner hash table, and the others meanwhile try to e.g. extend
the number of buckets using ExecParallelHashIncreaseNumBuckets. They
would then all wait until the blocked process comes back to the
hashjoin node and joins the effort as well.
Wouldn't this give potential deadlock situations? Or why would a worker
that is hashing the inner be required to come back and join the effort
in growing the hash buckets?
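
To make the scenario concrete, here is a toy model of a phased barrier.
It is only a sketch under my own assumptions: ToyBarrier, the toy_*
functions and simulate_skewed_build are hypothetical names made up for
illustration and are not PostgreSQL's Barrier API. The only rule it
models is that a phase advances once every attached participant has
arrived.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy phased barrier: a simplified model, not PostgreSQL's Barrier code. */
typedef struct ToyBarrier
{
    int participants;   /* processes currently attached */
    int arrived;        /* attached processes waiting in the current phase */
    int phase;          /* advances only once everyone attached has arrived */
} ToyBarrier;

void
toy_barrier_init(ToyBarrier *b)
{
    b->participants = 0;
    b->arrived = 0;
    b->phase = 0;
}

void
toy_barrier_attach(ToyBarrier *b)
{
    b->participants++;
}

/* Returns true if this arrival released the barrier. */
bool
toy_barrier_arrive(ToyBarrier *b)
{
    assert(b->arrived < b->participants);
    b->arrived++;
    if (b->arrived == b->participants)
    {
        b->arrived = 0;
        b->phase++;
        return true;
    }
    return false;       /* a real participant would go to sleep here */
}

/*
 * Hypothetical scenario: nworkers attach, but nblocked of them are stuck
 * in a child plan node and never arrive. Returns the phase reached.
 */
int
simulate_skewed_build(int nworkers, int nblocked)
{
    ToyBarrier b;
    int i;

    assert(nblocked >= 0 && nblocked <= nworkers);
    toy_barrier_init(&b);
    for (i = 0; i < nworkers; i++)
        toy_barrier_attach(&b);
    for (i = 0; i < nworkers - nblocked; i++)
        toy_barrier_arrive(&b);
    return b.phase;
}
```

In this model simulate_skewed_build(4, 1) stays at phase 0, because the
three arrived workers keep waiting for the fourth, whereas
simulate_skewed_build(4, 0) reaches phase 1. Whether the real situation
is a true deadlock depends on whether the blocked process can ever get
back to the hashjoin node, which the model does not capture.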

With very skewed workloads (one node providing all data) I was at least
able to get e.g. 3 out of 4 workers waiting in
ExecParallelHashIncreaseNumBuckets, whilst one was in
ExecProcNode(outerNode). I tried to detach and reattach the barrier but
this proved to be a bad idea :)

Regards,
Luc
