Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? |
Date | |
Msg-id | 20220701231833.vh76xkf3udani3np@alap3.anarazel.de |
In response to | Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? (Andres Freund <andres@anarazel.de>) |
Responses | Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? |
List | pgsql-hackers |
Hi,

On 2022-07-01 13:14:23 -0700, Andres Freund wrote:
> I saw a couple failures of 031 on CI for the meson patch - which obviously
> doesn't change anything around this. However it adds a lot more distributions,
> and the added ones run in docker containers on a shared host, presumably
> adding a lot of noise. I saw this more frequently when I accidentally had the
> test runs at the number of CPUs in the host, rather than the allotted CPUs in
> the container.
>
> That made me look more into these issues. I played with adding a pg_usleep()
> to RecoveryConflictInterrupt() to simulate slow signal delivery.
>
> Found a couple things:
>
> - the pg_usleep(5000) in ResolveRecoveryConflictWithVirtualXIDs() can
>   completely swamp the target(s) on a busy system. This surely is exacerbated
>   by the usleep I added to RecoveryConflictInterrupt(), but a 5ms signalling
>   pace does seem like a bad idea.

This one is also implicated in
https://postgr.es/m/20220701154105.jjfutmngoedgiad3%40alvherre.pgsql
and related issues.

Besides being very short, it also seems like a bad idea to wait when we might
not need to? Seems we should only wait if we subsequently couldn't get the
lock? Misleadingly, WaitExceedsMaxStandbyDelay() also contains a usleep, which
at least I wouldn't expect given its name.

A minimal fix would be to increase the wait time, similar to how it is done
with standbyWait_us?

Medium term it seems we ought to set the startup process's latch when handling
a conflict, and use a latch wait. But avoiding races probably isn't quite
trivial.

> - we process the same recovery conflict (not a single signal, but a single
>   "real conflict") multiple times in the target of a conflict, presumably
>   while handling the error. That includes handling the same interrupt once as
>   an ERROR and once as FATAL.
>
>   E.g.
>
>   2022-07-01 12:19:14.428 PDT [2000572] LOG: recovery still waiting after 10.032 ms: recovery conflict on buffer pin
>   2022-07-01 12:19:14.428 PDT [2000572] CONTEXT: WAL redo at 0/344E098 for Heap2/PRUNE: latestRemovedXid 0 nredirected 0 ndead 100; blkref #0: rel 1663/16385/1>
>   2022-07-01 12:19:54.597 PDT [2000578] 031_recovery_conflict.pl ERROR: canceling statement due to conflict with recovery at character 15
>   2022-07-01 12:19:54.597 PDT [2000578] 031_recovery_conflict.pl DETAIL: User transaction caused buffer deadlock with recovery.
>   2022-07-01 12:19:54.597 PDT [2000578] 031_recovery_conflict.pl STATEMENT: SELECT * FROM test_recovery_conflict_table2;
>   2022-07-01 12:19:54.778 PDT [2000572] LOG: recovery finished waiting after 40359.937 ms: recovery conflict on buffer pin
>   2022-07-01 12:19:54.778 PDT [2000572] CONTEXT: WAL redo at 0/344E098 for Heap2/PRUNE: latestRemovedXid 0 nredirected 0 ndead 100; blkref #0: rel 1663/16385/1>
>   2022-07-01 12:19:54.788 PDT [2000578] 031_recovery_conflict.pl FATAL: terminating connection due to conflict with recovery
>   2022-07-01 12:19:54.788 PDT [2000578] 031_recovery_conflict.pl DETAIL: User transaction caused buffer deadlock with recovery.
>   2022-07-01 12:19:54.788 PDT [2000578] 031_recovery_conflict.pl HINT: In a moment you should be able to reconnect to the database and repeat your command.
>   2022-07-01 12:19:54.837 PDT [2001389] 031_recovery_conflict.pl LOG: statement: SELECT 1;
>
>   note that the startup process considers the conflict resolved by the time
>   the backend handles the interrupt.
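Returning to the pg_usleep(5000) point quoted earlier: as a rough illustration
of the "minimal fix" direction, the fixed 5ms pace could be replaced with a
standbyWait_us-style doubling backoff. This is only a sketch, not a patch from
this thread, and the conflict_* names are invented for illustration:

    #include "postgres.h"

    /*
     * Sketch only: back off the signalling pace instead of sleeping a fixed
     * pg_usleep(5000L) between attempts, mirroring how
     * WaitExceedsMaxStandbyDelay() grows standbyWait_us.
     */
    #define CONFLICT_INITIAL_WAIT_US    5000L       /* current fixed pace */
    #define CONFLICT_MAX_WAIT_US        1000000L    /* arbitrary 1s cap */

    static long conflict_wait_us = CONFLICT_INITIAL_WAIT_US;

    static void
    conflict_wait_backoff(void)
    {
        /* sleep for the current interval ... */
        pg_usleep(conflict_wait_us);

        /* ... then double it for the next retry, up to the cap */
        conflict_wait_us = Min(conflict_wait_us * 2, CONFLICT_MAX_WAIT_US);
    }

The medium-term latch idea mentioned above would instead drop the polling
sleep entirely: the backend resolving the conflict would SetLatch() the
startup process, which would block in WaitLatch() rather than sleeping.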
I guess the reason we first get an ERROR and then a FATAL is that the second
iteration hits the if (RecoveryConflictPending && DoingCommandRead) bit,
because we end up there after handling the first error? And that's a FATAL.

I suspect that Thomas' fix will address at least part of this, as the check
whether we're still waiting for a lock will be made just before the error is
thrown.

> I also see cases where a FATAL is repeated:
>
>   2022-07-01 12:43:18.190 PDT [2054721] LOG: recovery still waiting after 15.410 ms: recovery conflict on snapshot
>   2022-07-01 12:43:18.190 PDT [2054721] DETAIL: Conflicting process: 2054753.
>   2022-07-01 12:43:18.190 PDT [2054721] CONTEXT: WAL redo at 0/344AB90 for Heap2/PRUNE: latestRemovedXid 732 nredirected 18 ndead 0; blkref #0: rel 1663/16385/>
>   2054753: reporting recovery conflict 9
>   2022-07-01 12:43:18.482 PDT [2054753] 031_recovery_conflict.pl FATAL: terminating connection due to conflict with recovery
>   2022-07-01 12:43:18.482 PDT [2054753] 031_recovery_conflict.pl DETAIL: User query might have needed to see row versions that must be removed.
>   2022-07-01 12:43:18.482 PDT [2054753] 031_recovery_conflict.pl HINT: In a moment you should be able to reconnect to the database and repeat your command.
>   ...
>   2054753: reporting recovery conflict 9
>   2022-07-01 12:43:19.068 PDT [2054753] 031_recovery_conflict.pl FATAL: terminating connection due to conflict with recovery
>   2022-07-01 12:43:19.068 PDT [2054753] 031_recovery_conflict.pl DETAIL: User query might have needed to see row versions that must be removed.
>   2022-07-01 12:43:19.068 PDT [2054753] 031_recovery_conflict.pl HINT: In a moment you should be able to reconnect to the database and repeat your command.
>
> the FATAL one seems like it might at least partially be due to
> RecoveryConflictPending not being reset in at least some of the FATAL
> recovery conflict paths.
>
> It seems pretty obvious that the proc_exit_inprogress check in
> RecoveryConflictInterrupt() is misplaced, and needs to be where the errors
> are thrown. But that won't help, because it turns out, we don't necessarily
> set that yet.
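Before the quoted stack trace below, a loose sketch of the "reset the flag and
check proc_exit_inprogress where the error is thrown" idea, i.e. in
ProcessInterrupts(). This is not the actual postgres.c code; the placement,
error codes and messages are simplified for illustration:

    /* Sketch only - simplified from the real ProcessInterrupts() logic. */
    if (RecoveryConflictPending)
    {
        /* reset first, so a repeated interrupt can't re-report the conflict */
        RecoveryConflictPending = false;

        /* if we're already exiting, don't raise yet another error */
        if (proc_exit_inprogress)
            return;

        if (DoingCommandRead)
            ereport(FATAL,
                    (errcode(ERRCODE_ADMIN_SHUTDOWN),
                     errmsg("terminating connection due to conflict with recovery")));
        else
            ereport(ERROR,
                    (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                     errmsg("canceling statement due to conflict with recovery")));
    }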
> Look at this stack from an assertion in ProcessInterrupts() ensuring that
> the same FATAL isn't raised twice:
>
> #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
> #1 0x00007fd47897b546 in __GI_abort () at abort.c:79
> #2 0x00005594c150b27a in ExceptionalCondition (conditionName=0x5594c16fe746 "!in_fatal", errorType=0x5594c16fd8f6 "FailedAssertion",
>    fileName=0x5594c16fdac0 "/home/andres/src/postgresql/src/backend/tcop/postgres.c", lineNumber=3259)
>    at /home/andres/src/postgresql/src/backend/utils/error/assert.c:69
> #3 0x00005594c134f6d2 in ProcessInterrupts () at /home/andres/src/postgresql/src/backend/tcop/postgres.c:3259
> #4 0x00005594c150c671 in errfinish (filename=0x5594c16b8f2e "pqcomm.c", lineno=1393, funcname=0x5594c16b95e0 <__func__.8> "internal_flush")
>    at /home/andres/src/postgresql/src/backend/utils/error/elog.c:683
> #5 0x00005594c115e059 in internal_flush () at /home/andres/src/postgresql/src/backend/libpq/pqcomm.c:1393
> #6 0x00005594c115df49 in socket_flush () at /home/andres/src/postgresql/src/backend/libpq/pqcomm.c:1340
> #7 0x00005594c15121af in send_message_to_frontend (edata=0x5594c18a5740 <errordata>) at /home/andres/src/postgresql/src/backend/utils/error/elog.c:3283
> #8 0x00005594c150f00e in EmitErrorReport () at /home/andres/src/postgresql/src/backend/utils/error/elog.c:1541
> #9 0x00005594c150c42e in errfinish (filename=0x5594c16fdaed "postgres.c", lineno=3266, funcname=0x5594c16ff5b0 <__func__.9> "ProcessInterrupts")
>    at /home/andres/src/postgresql/src/backend/utils/error/elog.c:592
> #10 0x00005594c134f770 in ProcessInterrupts () at /home/andres/src/postgresql/src/backend/tcop/postgres.c:3266
> #11 0x00005594c134b995 in ProcessClientReadInterrupt (blocked=true) at /home/andres/src/postgresql/src/backend/tcop/postgres.c:497
> #12 0x00005594c1153417 in secure_read (port=0x5594c2e7d620, ptr=0x5594c189ba60 <PqRecvBuffer>, len=8192)
>
> reporting a FATAL error in the process of reporting a FATAL error. Yeah.
>
> I assume this could lead to sending out the same message quite a few
> times.

This seems like it needs to be fixed in elog.c. ISTM that at the very least we
ought to HOLD_INTERRUPTS() before the EmitErrorReport() for FATAL.

Greetings,

Andres Freund
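For reference, a minimal sketch of that last suggestion, assuming it lands in
errfinish() right before the client report; the exact placement in elog.c is
not settled by this mail:

    /*
     * Sketch only: keep interrupts held across the FATAL report, so a
     * recursive ProcessInterrupts() from inside socket_flush() cannot raise
     * the same FATAL again while it is being sent.
     */
    if (elevel >= FATAL)
        HOLD_INTERRUPTS();      /* never released; the process is exiting */

    EmitErrorReport();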