Re: lockup in parallel hash join on dikkop (freebsd 14.0-current) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
Msg-id 20230129175336.dvplrhkoba3pjxpk@awork3.anarazel.de
In response to Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
List pgsql-hackers
Hi,

On 2023-01-29 18:39:05 +0100, Tomas Vondra wrote:
> Will do, but I'll wait for another lockup to see how frequent it
> actually is. I'm now at ~90 runs total, and it didn't happen again yet.
> So hitting it after 15 runs might have been a bit of a luck.

Was there a difference in how much load there was on the machine between
"reproduced in 15 runs" and "not reproduced in 90"?  If a lack of barriers is
indeed related to the issue, an increase in context switches could
substantially change the behaviour (in both directions).  More intra-process
context switches can amount to "probabilistic barriers", because each context
switch acts as a barrier. At the same time they can make it more likely that
the relatively narrow window in WaitEventSetWait() is hit, or lead to larger
delays in processing signals.

Greetings,

Andres Freund


