
From: Andres Freund
Subject: Re: failures in t/031_recovery_conflict.pl on CI
Date:
Msg-id: 20220409231002.mddrr7mtca7p55v5@alap3.anarazel.de
In response to: Re: failures in t/031_recovery_conflict.pl on CI (Andres Freund <andres@anarazel.de>)
Responses: Re: failures in t/031_recovery_conflict.pl on CI (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
Hi,

On 2022-04-09 15:00:54 -0700, Andres Freund wrote:
> What are we expecting to wake the startup process up, once it does
> SendRecoveryConflictWithBufferPin()?
>
> It's likely not the problem here, because we never seem to have even reached
> that path, but afaics once we've called disable_all_timeouts() at the bottom
> of ResolveRecoveryConflictWithBufferPin() and then re-entered
> ResolveRecoveryConflictWithBufferPin(), and go into the "We're already behind,
> so clear a path as quickly as possible."  path, there's no guarantee for any
> timeout to be pending anymore?
>
> If there's either no backend that we're still conflicting with (an entirely
> possible race condition), or if there's e.g. a snapshot or database conflict,
> there's afaics nobody setting the startup processes' latch.

It's not that (although I still suspect it's a real problem). It's a self-deadlock:
StandbyTimeoutHandler(), which ResolveRecoveryConflictWithBufferPin() *explicitly
enables*, calls SendRecoveryConflictWithBufferPin(), which does CancelDBBackends().
ResolveRecoveryConflictWithBufferPin() also calls that same path itself if the
deadlock timeout is reached, so the timeout handler can interrupt CancelDBBackends()
mid-call and then block on ProcArrayLock, which the interrupted code already holds.
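
To see the shape of the problem outside the backend, here's a minimal standalone
sketch (not PostgreSQL code; cancel_backends(), timeout_handler() and the semaphore
are hypothetical stand-ins for CancelDBBackends(), StandbyTimeoutHandler() and
ProcArrayLock). The SIGALRM handler re-enters the lock-taking function it
interrupted and then waits forever on a lock its own process holds, which is the
same shape as the backtrace below:

/* standalone sketch of the hazard, not PostgreSQL code */
#include <semaphore.h>
#include <signal.h>
#include <unistd.h>

static sem_t proc_array_lock;       /* plays the role of ProcArrayLock */

static void
cancel_backends(void)               /* stand-in for CancelDBBackends() */
{
    sem_wait(&proc_array_lock);     /* like LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE) */
    usleep(10000);                  /* corresponds to the pg_usleep(10000) mentioned below */
    sem_post(&proc_array_lock);
}

static void
timeout_handler(int signo)          /* stand-in for StandbyTimeoutHandler() */
{
    (void) signo;
    /* like SendRecoveryConflictWithBufferPin(): re-enters the lock taker */
    cancel_backends();              /* blocks forever if it interrupted cancel_backends() */
}

int
main(void)
{
    sem_init(&proc_array_lock, 0, 1);
    signal(SIGALRM, timeout_handler);
    alarm(1);                       /* plays the role of the standby timeout */

    /* like ResolveRecoveryConflictWithBufferPin() calling the same path itself */
    for (;;)
        cancel_backends();
}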

To make it easier to hit, I put a pg_usleep(10000) in CancelDBBackends(), and boom:

(gdb) bt
#0  __futex_abstimed_wait_common64 (futex_word=futex_word@entry=0x7fd4cb001138, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=<optimized out>, cancel=cancel@entry=true) at ../sysdeps/nptl/futex-internal.c:87
#1  0x00007fd4ce5a215b in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7fd4cb001138, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=<optimized out>) at ../sysdeps/nptl/futex-internal.c:123
#2  0x00007fd4ce59e44f in do_futex_wait (sem=sem@entry=0x7fd4cb001138, abstime=0x0, clockid=0) at sem_waitcommon.c:112
#3  0x00007fd4ce59e4e8 in __new_sem_wait_slow64 (sem=0x7fd4cb001138, abstime=0x0, clockid=0) at sem_waitcommon.c:184
#4  0x00007fd4cf20d823 in PGSemaphoreLock (sema=0x7fd4cb001138) at pg_sema.c:327
#5  0x00007fd4cf2dd675 in LWLockAcquire (lock=0x7fd4cb001600, mode=LW_EXCLUSIVE) at /home/andres/src/postgresql/src/backend/storage/lmgr/lwlock.c:1324
#6  0x00007fd4cf2c36e7 in CancelDBBackends (databaseid=0, sigmode=PROCSIG_RECOVERY_CONFLICT_BUFFERPIN, conflictPending=false) at /home/andres/src/postgresql/src/backend/storage/ipc/procarray.c:3638
#7  0x00007fd4cf2cc579 in SendRecoveryConflictWithBufferPin (reason=PROCSIG_RECOVERY_CONFLICT_BUFFERPIN) at /home/andres/src/postgresql/src/backend/storage/ipc/standby.c:846
#8  0x00007fd4cf2cc69d in StandbyTimeoutHandler () at /home/andres/src/postgresql/src/backend/storage/ipc/standby.c:911
#9  0x00007fd4cf4e68d7 in handle_sig_alarm (postgres_signal_arg=14) at /home/andres/src/postgresql/src/backend/utils/misc/timeout.c:421
#10 <signal handler called>
#11 0x00007fd4cddfffc4 in __GI___select (nfds=0, readfds=0x0, writefds=0x0, exceptfds=0x0, timeout=0x7ffc6e5561c0) at ../sysdeps/unix/sysv/linux/select.c:71
#12 0x00007fd4cf52ea2a in pg_usleep (microsec=10000) at /home/andres/src/postgresql/src/port/pgsleep.c:56
#13 0x00007fd4cf2c36f1 in CancelDBBackends (databaseid=0, sigmode=PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK, conflictPending=false) at /home/andres/src/postgresql/src/backend/storage/ipc/procarray.c:3640
#14 0x00007fd4cf2cc579 in SendRecoveryConflictWithBufferPin (reason=PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK) at /home/andres/src/postgresql/src/backend/storage/ipc/standby.c:846
#15 0x00007fd4cf2cc50f in ResolveRecoveryConflictWithBufferPin () at /home/andres/src/postgresql/src/backend/storage/ipc/standby.c:820
#16 0x00007fd4cf2a996f in LockBufferForCleanup (buffer=43) at /home/andres/src/postgresql/src/backend/storage/buffer/bufmgr.c:4336
#17 0x00007fd4ceec911c in XLogReadBufferForRedoExtended (record=0x7fd4d106a618, block_id=0 '\000', mode=RBM_NORMAL, get_cleanup_lock=true, buf=0x7ffc6e556394) at /home/andres/src/postgresql/src/backend/access/transam/xlogutils.c:427
#18 0x00007fd4cee1aa41 in heap_xlog_prune (record=0x7fd4d106a618) at /home/andres/src/postgresql/src/backend/access/heap/heapam.c:8634

It's reproducible on Linux.


I'm lacking words I dare to put in an email to describe how bad an idea it is to
call CancelDBBackends() from within a timeout handler, particularly when the
function enabling the timeout also calls CancelDBBackends() itself before
disabling the timers.
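
For contrast, the conventional way to structure this kind of timeout handler would
look roughly like the sketch below. This is purely illustrative, not a proposed
patch: standby_timeout_pending and SketchStandbyTimeoutHandler() are made-up names,
while SetLatch()/MyLatch, SendRecoveryConflictWithBufferPin() and
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN are existing backend facilities. The handler
only records that the timeout fired and wakes the startup process; the lock-taking
work runs later, outside signal context:

/* illustrative sketch only, not a patch */
static volatile sig_atomic_t standby_timeout_pending = false;

static void
SketchStandbyTimeoutHandler(void)
{
    /* no LWLocks here: just remember the event and wake the process */
    standby_timeout_pending = true;
    SetLatch(MyLatch);
}

/* ... later, in the startup process, outside any signal handler ... */
if (standby_timeout_pending)
{
    standby_timeout_pending = false;
    SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
}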

I ...

Greetings,

Andres Freund


