From 63ce4e5578f1703254952cd3aee3a0a22c6da990 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada
Date: Tue, 28 Apr 2026 12:21:21 -0700
Subject: [PATCH v2_15] Fix race between ProcSignalInit() and EmitProcSignalBarrier().

Previously, ProcSignalInit() read the global barrier generation before
publishing its PID into the pss_pid slot. This created a race
condition: a process could initialize its local generation with an
older global value, while a concurrent EmitProcSignalBarrier() might
skip that process because its pss_pid was still zero. This resulted in
WaitForProcSignalBarrier() hanging indefinitely.

This commit fixes the issue by publishing pss_pid before reading
psh_barrierGeneration, with a memory barrier in between so that the
store is globally visible first. A concurrent EmitProcSignalBarrier()
then either observes the published PID and signals this slot, or
completes its generation increment before we load it.

While this race has become more visible due to recent features using
signal barriers in more places (such as online wal_level changes), the
issue has theoretically been present since signal barriers were
introduced to release smgr caches (e.g., in DROP DATABASE). So
backpatch to 15.

This issue was also reported by buildfarm animal flaviventris.

Reported-by: Melanie Plageman
Reviewed-by: Alexander Lakhin
Reviewed-by: Matthias van de Meent
Discussion: https://postgr.es/m/CAEze2WgAJmWReDN7Chtba8Er2YBvKCoa0KVN25-1evnTrHsLyA@mail.gmail.com
Backpatch-through: 15
---
 src/backend/storage/ipc/procsignal.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 21a9fc0fdd2..cd4fe11b1a6 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -175,6 +175,16 @@ ProcSignalInit(int pss_idx)
 	/* Clear out any leftover signal reasons */
 	MemSet(slot->pss_signalFlags, 0, NUM_PROCSIGNALS * sizeof(sig_atomic_t));
 
+	/*
+	 * Publish the PID before reading the global barrier generation to ensure
+	 * that EmitProcSignalBarrier() doesn't skip us while we are grabbing an
+	 * older generation. We need a memory barrier here to make sure that the
+	 * update of pss_pid is globally visible before the load of the global
+	 * barrier generation executes.
+	 */
+	slot->pss_pid = MyProcPid;
+	pg_memory_barrier();
+
 	/*
 	 * Initialize barrier state. Since we're a brand-new process, there
 	 * shouldn't be any leftover backend-private state that needs to be
@@ -192,9 +202,6 @@ ProcSignalInit(int pss_idx)
 	pg_atomic_write_u64(&slot->pss_barrierGeneration, barrier_generation);
 	pg_memory_barrier();
 
-	/* Mark slot with my PID */
-	slot->pss_pid = MyProcPid;
-
 	/* Remember slot location for CheckProcSignal */
 	MyProcSignalSlot = slot;
 
-- 
2.54.0