Thread: Assertion failure in WaitForWALToBecomeAvailable state machine
Hi, The problem is that whenever we are going for streaming we always set XLogCtl->InstallXLogFileSegmentActive to true, but while switching from streaming to archive we do not always reset it so it hits assertion in some cases. Basically we reset it inside XLogShutdownWalRcv() but while switching from the streaming mode we only call it conditionally if WalRcvStreaming(). But it is very much possible that even before we call WalRcvStreaming() the walreceiver might have set alrcv->walRcvState to WALRCV_STOPPED. So now WalRcvStreaming() will return false. So I agree now we do not want to really shut down the walreceiver but who will reset the flag? I just ran some tests on primary and attached the walreceiver to gdb and waited for it to exit with timeout and then the recovery process hit the assertion. 2022-02-11 14:33:56.976 IST [60978] FATAL: terminating walreceiver due to timeout cp: cannot stat ‘/home/dilipkumar/work/PG/install/bin/wal_archive/00000002.history’: No such file or directory 2022-02-11 14:33:57.002 IST [60973] LOG: restored log file "000000010000000000000003" from archive TRAP: FailedAssertion("!XLogCtl->InstallXLogFileSegmentActive", File: "xlog.c", Line: 3823, PID: 60973) I have just applied a quick fix and that solved the issue, basically if the last failed source was streaming and the WalRcvStreaming() is false then just reset this flag. @@ -12717,6 +12717,12 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, */ if (WalRcvStreaming()) XLogShutdownWalRcv(); + else + { + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); + XLogCtl->InstallXLogFileSegmentActive = false; + LWLockRelease(ControlFileLock); + } -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Fri, Feb 11, 2022 at 3:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > Hi, > > The problem is that whenever we are going for streaming we always set > XLogCtl->InstallXLogFileSegmentActive to true, but while switching > from streaming to archive we do not always reset it so it hits > assertion in some cases. Basically we reset it inside > XLogShutdownWalRcv() but while switching from the streaming mode we > only call it conditionally if WalRcvStreaming(). But it is very much > possible that even before we call WalRcvStreaming() the walreceiver > might have set alrcv->walRcvState to WALRCV_STOPPED. So now > WalRcvStreaming() will return false. So I agree now we do not want to > really shut down the walreceiver but who will reset the flag? > > I just ran some tests on primary and attached the walreceiver to gdb > and waited for it to exit with timeout and then the recovery process > hit the assertion. > > 2022-02-11 14:33:56.976 IST [60978] FATAL: terminating walreceiver > due to timeout > cp: cannot stat > ‘/home/dilipkumar/work/PG/install/bin/wal_archive/00000002.history’: > No such file or directory > 2022-02-11 14:33:57.002 IST [60973] LOG: restored log file > "000000010000000000000003" from archive > TRAP: FailedAssertion("!XLogCtl->InstallXLogFileSegmentActive", File: > "xlog.c", Line: 3823, PID: 60973) > > I have just applied a quick fix and that solved the issue, basically > if the last failed source was streaming and the WalRcvStreaming() is > false then just reset this flag. IIUC, the issue can happen while the walreceiver failed to get WAL from primary for whatever reasons and its status is not WALRCV_STOPPING or WALRCV_STOPPED, and the startup process moved ahead in WaitForWALToBecomeAvailable for reading from archive which ends up in this assertion failure. ITSM, a rare scenario and it depends on what walreceiver does between failure to get WAL from primary and updating status to WALRCV_STOPPING or WALRCV_STOPPED. If the above race condition is a serious problem, if one thinks at least it is a problem at all, that needs to be fixed. I don't think just making InstallXLogFileSegmentActive false is enough. By looking at the comment [1], it doesn't make sense to move ahead for restoring from the archive location without the WAL receiver fully stopped. IMO, the real fix is to just remove WalRcvStreaming() and call XLogShutdownWalRcv() unconditionally. Anyways, we have the Assert(!WalRcvStreaming()); down below. I don't think it will create any problem. [1] /* * Before we leave XLOG_FROM_STREAM state, make sure that * walreceiver is not active, so that it won't overwrite * WAL that we restore from archive. */ if (WalRcvStreaming()) XLogShutdownWalRcv(); Regards, Bharath Rupireddy.
On Fri, Feb 11, 2022 at 6:22 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Fri, Feb 11, 2022 at 3:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > IIUC, the issue can happen while the walreceiver failed to get WAL > from primary for whatever reasons and its status is not > WALRCV_STOPPING or WALRCV_STOPPED, and the startup process moved ahead > in WaitForWALToBecomeAvailable for reading from archive which ends up > in this assertion failure. ITSM, a rare scenario and it depends on > what walreceiver does between failure to get WAL from primary and > updating status to WALRCV_STOPPING or WALRCV_STOPPED. > > If the above race condition is a serious problem, if one thinks at > least it is a problem at all, that needs to be fixed. I don't think we can design a software which has open race conditions even though they are rarely occurring :) I don't think > just making InstallXLogFileSegmentActive false is enough. By looking > at the comment [1], it doesn't make sense to move ahead for restoring > from the archive location without the WAL receiver fully stopped. > IMO, the real fix is to just remove WalRcvStreaming() and call > XLogShutdownWalRcv() unconditionally. Anyways, we have the > Assert(!WalRcvStreaming()); down below. I don't think it will create > any problem. If WalRcvStreaming() is returning false that means walreceiver is already stopped so we don't need to shutdown it externally. I think like we are setting this flag outside start streaming we can reset it also outside XLogShutdownWalRcv. Or I am fine even if we call XLogShutdownWalRcv() because if walreceiver is stopped it will just reset the flag we want it to reset and it will do nothing else. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Fri, Feb 11, 2022 at 6:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Feb 11, 2022 at 6:22 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Fri, Feb 11, 2022 at 3:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > IIUC, the issue can happen while the walreceiver failed to get WAL > > from primary for whatever reasons and its status is not > > WALRCV_STOPPING or WALRCV_STOPPED, and the startup process moved ahead > > in WaitForWALToBecomeAvailable for reading from archive which ends up > > in this assertion failure. ITSM, a rare scenario and it depends on > > what walreceiver does between failure to get WAL from primary and > > updating status to WALRCV_STOPPING or WALRCV_STOPPED. > > > > If the above race condition is a serious problem, if one thinks at > > least it is a problem at all, that needs to be fixed. > > I don't think we can design a software which has open race conditions > even though they are rarely occurring :) Yes. > I don't think > > just making InstallXLogFileSegmentActive false is enough. By looking > > at the comment [1], it doesn't make sense to move ahead for restoring > > from the archive location without the WAL receiver fully stopped. > > IMO, the real fix is to just remove WalRcvStreaming() and call > > XLogShutdownWalRcv() unconditionally. Anyways, we have the > > Assert(!WalRcvStreaming()); down below. I don't think it will create > > any problem. > > If WalRcvStreaming() is returning false that means walreceiver is > already stopped so we don't need to shutdown it externally. I think > like we are setting this flag outside start streaming we can reset it > also outside XLogShutdownWalRcv. Or I am fine even if we call > XLogShutdownWalRcv() because if walreceiver is stopped it will just > reset the flag we want it to reset and it will do nothing else. As I said, I'm okay with just calling XLogShutdownWalRcv() unconditionally as it just returns in case walreceiver has already stopped and updates the InstallXLogFileSegmentActive flag to false. Let's also hear what other hackers have to say about this. Regards, Bharath Rupireddy.
At Fri, 11 Feb 2022 22:25:49 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > > I don't think > > > just making InstallXLogFileSegmentActive false is enough. By looking > > > at the comment [1], it doesn't make sense to move ahead for restoring > > > from the archive location without the WAL receiver fully stopped. > > > IMO, the real fix is to just remove WalRcvStreaming() and call > > > XLogShutdownWalRcv() unconditionally. Anyways, we have the > > > Assert(!WalRcvStreaming()); down below. I don't think it will create > > > any problem. > > > > If WalRcvStreaming() is returning false that means walreceiver is > > already stopped so we don't need to shutdown it externally. I think > > like we are setting this flag outside start streaming we can reset it > > also outside XLogShutdownWalRcv. Or I am fine even if we call > > XLogShutdownWalRcv() because if walreceiver is stopped it will just > > reset the flag we want it to reset and it will do nothing else. > > As I said, I'm okay with just calling XLogShutdownWalRcv() > unconditionally as it just returns in case walreceiver has already > stopped and updates the InstallXLogFileSegmentActive flag to false. > > Let's also hear what other hackers have to say about this. Firstly, good catch:) And the direction seems right. It seems like an overlook of cc2c7d65fc. We cannot install new wal segments only while we're in archive recovery. Conversely, we must turn off it when entering archive recovery (not exiting streaming recovery). So, *I* feel like to do that at the beginning of XLOG_FROM_ARCHIVE/PG_WAL rather than the end of XLOG_FROM_STREAM. (And I would like to remove XLogShutDownWalRcv() and turn off the flag in StartupXLOG explicitly, but it would be overdone.) --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -12800,6 +12800,16 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, */ Assert(!WalRcvStreaming()); + /* + * WAL segment installation conflicts with archive + * recovery. Make sure it is turned off. XLogShutdownWalRcv() + * does that but it is not done when the process has voluntary + * exited for example for replication timeout. + */ + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); + XLogCtl->InstallXLogFileSegmentActive = false; + LWLockRelease(ControlFileLock); + /* Close any old file we might have open. */ if (readFile >= 0) regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Mon, Feb 14, 2022 at 1:44 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Fri, 11 Feb 2022 22:25:49 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > > > I don't think > > > > just making InstallXLogFileSegmentActive false is enough. By looking > > > > at the comment [1], it doesn't make sense to move ahead for restoring > > > > from the archive location without the WAL receiver fully stopped. > > > > IMO, the real fix is to just remove WalRcvStreaming() and call > > > > XLogShutdownWalRcv() unconditionally. Anyways, we have the > > > > Assert(!WalRcvStreaming()); down below. I don't think it will create > > > > any problem. > > > > > > If WalRcvStreaming() is returning false that means walreceiver is > > > already stopped so we don't need to shutdown it externally. I think > > > like we are setting this flag outside start streaming we can reset it > > > also outside XLogShutdownWalRcv. Or I am fine even if we call > > > XLogShutdownWalRcv() because if walreceiver is stopped it will just > > > reset the flag we want it to reset and it will do nothing else. > > > > As I said, I'm okay with just calling XLogShutdownWalRcv() > > unconditionally as it just returns in case walreceiver has already > > stopped and updates the InstallXLogFileSegmentActive flag to false. > > > > Let's also hear what other hackers have to say about this. > > Firstly, good catch:) And the direction seems right. > > It seems like an overlook of cc2c7d65fc. We cannot install new wal > segments only while we're in archive recovery. Conversely, we must > turn off it when entering archive recovery (not exiting streaming > recovery). So, *I* feel like to do that at the beginning of > XLOG_FROM_ARCHIVE/PG_WAL rather than the end of XLOG_FROM_STREAM. > > (And I would like to remove XLogShutDownWalRcv() and turn off the flag > in StartupXLOG explicitly, but it would be overdone.) > > --- a/src/backend/access/transam/xlog.c > +++ b/src/backend/access/transam/xlog.c > @@ -12800,6 +12800,16 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, > */ > Assert(!WalRcvStreaming()); > > + /* > + * WAL segment installation conflicts with archive > + * recovery. Make sure it is turned off. XLogShutdownWalRcv() > + * does that but it is not done when the process has voluntary > + * exited for example for replication timeout. > + */ > + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); > + XLogCtl->InstallXLogFileSegmentActive = false; > + LWLockRelease(ControlFileLock); > + > /* Close any old file we might have open. */ > if (readFile >= 0) Today I encountered the assertion failure [2] twice while working on another patch [1]. The pattern seems to be that the walreceiver got killed or crashed and set it's state to WALRCV_STOPPING or WALRCV_STOPPED by the team the WAL state machine moves to archive and hence the following XLogShutdownWalRcv() code will not get hit: /* * Before we leave XLOG_FROM_STREAM state, make sure that * walreceiver is not active, so that it won't overwrite * WAL that we restore from archive. */ if (WalRcvStreaming()) ShutdownWalRcv(); I agree with Kyotaro-san to reset InstallXLogFileSegmentActive before entering into the archive mode. Hence I tweaked the code introduced by the following commit a bit, the result v1 patch is attached herewith. Please review it. commit cc2c7d65fc27e877c9f407587b0b92d46cd6dd16 Author: Noah Misch <noah@leadboat.com> Date: Mon Jun 28 18:34:56 2021 -0700 Skip WAL recycling and preallocation during archive recovery. [1] https://www.postgresql.org/message-id/CALj2ACUYz1z6QPduGn5gguCkfd-ko44j4hKcOMtp6fzv9xEWgw%40mail.gmail.com [2] 2022-08-11 05:10:29.051 UTC [1487086] FATAL: terminating walreceiver due to timeout cp: cannot stat '/home/ubuntu/archived_wal/00000002.history': No such file or directory 2022-08-11 05:10:29.054 UTC [1487075] LOG: switched WAL source from stream to archive after failure 2022-08-11 05:10:29.067 UTC [1487075] LOG: restored log file "000000010000000000000034" from archive TRAP: FailedAssertion("!IsInstallXLogFileSegmentActive()", File: "xlogrecovery.c", Line: 4149, PID: 1487075) postgres: startup waiting for 000000010000000000000034(ExceptionalCondition+0xd6)[0x559d20e6deb1] postgres: startup waiting for 000000010000000000000034(+0x1e4e14)[0x559d20813e14] postgres: startup waiting for 000000010000000000000034(+0x1e50c0)[0x559d208140c0] postgres: startup waiting for 000000010000000000000034(+0x1e4354)[0x559d20813354] postgres: startup waiting for 000000010000000000000034(+0x1e3946)[0x559d20812946] postgres: startup waiting for 000000010000000000000034(+0x1dbc58)[0x559d2080ac58] postgres: startup waiting for 000000010000000000000034(+0x1db3ce)[0x559d2080a3ce] postgres: startup waiting for 000000010000000000000034(XLogReadAhead+0x3d)[0x559d2080aa23] postgres: startup waiting for 000000010000000000000034(+0x1d9713)[0x559d20808713] postgres: startup waiting for 000000010000000000000034(+0x1d909c)[0x559d2080809c] postgres: startup waiting for 000000010000000000000034(XLogPrefetcherReadRecord+0xf9)[0x559d208092a2] postgres: startup waiting for 000000010000000000000034(+0x1e3427)[0x559d20812427] postgres: startup waiting for 000000010000000000000034(PerformWalRecovery+0x465)[0x559d2080fd0f] postgres: startup waiting for 000000010000000000000034(StartupXLOG+0x987)[0x559d207fc732] postgres: startup waiting for 000000010000000000000034(StartupProcessMain+0xf8)[0x559d20bc3990] postgres: startup waiting for 000000010000000000000034(AuxiliaryProcessMain+0x1ae)[0x559d20bb62cf] postgres: startup waiting for 000000010000000000000034(+0x593020)[0x559d20bc2020] postgres: startup waiting for 000000010000000000000034(+0x591ebb)[0x559d20bc0ebb] postgres: startup waiting for 000000010000000000000034(+0x590907)[0x559d20bbf907] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fe518b26520] /lib/x86_64-linux-gnu/libc.so.6(__select+0xbd)[0x7fe518bff74d] postgres: startup waiting for 000000010000000000000034(+0x58e1bd)[0x559d20bbd1bd] postgres: startup waiting for 000000010000000000000034(PostmasterMain+0x14cc)[0x559d20bbcaa2] postgres: startup waiting for 000000010000000000000034(+0x44a752)[0x559d20a79752] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fe518b0dd90] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fe518b0de40] postgres: startup waiting for 000000010000000000000034(_start+0x25)[0x559d206fe5b5] 2022-08-11 05:10:29.161 UTC [1486413] LOG: startup process (PID 1487075) was terminated by signal 6: Aborted 2022-08-11 05:10:29.161 UTC [1486413] LOG: terminating any other active server processes 2022-08-11 05:10:29.169 UTC [1486413] LOG: shutting down due to startup process failure 2022-08-11 05:10:29.181 UTC [1486413] LOG: database system is shut down 2022-08-11 05:34:54.234 UTC [1509079] LOG: switched WAL source from stream to archive after failure 2022-08-11 05:34:54.234 UTC [1509079] LOG: current WAL source is archive and last WAL source is stream, standby mode 1 2022-08-11 05:34:54.248 UTC [1509079] LOG: restored log file "000000010000000000000003" from archive TRAP: FailedAssertion("!IsInstallXLogFileSegmentActive()", File: "xlogrecovery.c", Line: 4155, PID: 1509079) postgres: startup waiting for 000000010000000000000003(ExceptionalCondition+0xd6)[0x55e9cb1cbf2a] postgres: startup waiting for 000000010000000000000003(+0x1e4e8d)[0x55e9cab71e8d] postgres: startup waiting for 000000010000000000000003(+0x1e5139)[0x55e9cab72139] postgres: startup waiting for 000000010000000000000003(+0x1e43cd)[0x55e9cab713cd] postgres: startup waiting for 000000010000000000000003(+0x1e3946)[0x55e9cab70946] postgres: startup waiting for 000000010000000000000003(+0x1dbc58)[0x55e9cab68c58] postgres: startup waiting for 000000010000000000000003(+0x1db0b7)[0x55e9cab680b7] postgres: startup waiting for 000000010000000000000003(XLogReadAhead+0x3d)[0x55e9cab68a23] postgres: startup waiting for 000000010000000000000003(+0x1d9713)[0x55e9cab66713] postgres: startup waiting for 000000010000000000000003(+0x1d909c)[0x55e9cab6609c] postgres: startup waiting for 000000010000000000000003(XLogPrefetcherReadRecord+0xf9)[0x55e9cab672a2] postgres: startup waiting for 000000010000000000000003(+0x1e3427)[0x55e9cab70427] postgres: startup waiting for 000000010000000000000003(+0x1e4911)[0x55e9cab71911] postgres: startup waiting for 000000010000000000000003(InitWalRecovery+0xaf4)[0x55e9cab6bc4e] postgres: startup waiting for 000000010000000000000003(StartupXLOG+0x4af)[0x55e9cab5a25a] postgres: startup waiting for 000000010000000000000003(StartupProcessMain+0xf8)[0x55e9caf21a09] postgres: startup waiting for 000000010000000000000003(AuxiliaryProcessMain+0x1ae)[0x55e9caf14348] postgres: startup waiting for 000000010000000000000003(+0x593099)[0x55e9caf20099] postgres: startup waiting for 000000010000000000000003(PostmasterMain+0x1476)[0x55e9caf1aac5] postgres: startup waiting for 000000010000000000000003(+0x44a7cb)[0x55e9cadd77cb] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f754c287d90] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f754c287e40] postgres: startup waiting for 000000010000000000000003(_start+0x25)[0x55e9caa5c5b5] 2022-08-11 05:34:54.336 UTC [1509076] LOG: startup process (PID 1509079) was terminated by signal 6: Aborted 2022-08-11 05:34:54.336 UTC [1509076] LOG: aborting startup due to startup process failure 2022-08-11 05:34:54.337 UTC [1509076] LOG: database system is shut down -- Bharath Rupireddy RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
Attachment
On Thu, Aug 11, 2022 at 10:06 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Today I encountered the assertion failure [2] twice while working on > another patch [1]. The pattern seems to be that the walreceiver got > killed or crashed and set it's state to WALRCV_STOPPING or > WALRCV_STOPPED by the team the WAL state machine moves to archive and > hence the following XLogShutdownWalRcv() code will not get hit: > > /* > * Before we leave XLOG_FROM_STREAM state, make sure that > * walreceiver is not active, so that it won't overwrite > * WAL that we restore from archive. > */ > if (WalRcvStreaming()) > ShutdownWalRcv(); > > I agree with Kyotaro-san to reset InstallXLogFileSegmentActive before > entering into the archive mode. Hence I tweaked the code introduced by > the following commit a bit, the result v1 patch is attached herewith. > Please review it. I added it to the current commitfest to not lose track of it: https://commitfest.postgresql.org/39/3814/. -- Bharath Rupireddy RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
On Mon, Aug 15, 2022 at 11:30 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Aug 11, 2022 at 10:06 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Today I encountered the assertion failure [2] twice while working on > > another patch [1]. The pattern seems to be that the walreceiver got > > killed or crashed and set it's state to WALRCV_STOPPING or > > WALRCV_STOPPED by the team the WAL state machine moves to archive and > > hence the following XLogShutdownWalRcv() code will not get hit: > > > > /* > > * Before we leave XLOG_FROM_STREAM state, make sure that > > * walreceiver is not active, so that it won't overwrite > > * WAL that we restore from archive. > > */ > > if (WalRcvStreaming()) > > ShutdownWalRcv(); > > > > I agree with Kyotaro-san to reset InstallXLogFileSegmentActive before > > entering into the archive mode. Hence I tweaked the code introduced by > > the following commit a bit, the result v1 patch is attached herewith. > > Please review it. > > I added it to the current commitfest to not lose track of it: > https://commitfest.postgresql.org/39/3814/. Today, I spent some more time on this issue, I modified the v1 patch posted upthread a bit - now resetting the InstallXLogFileSegmentActive only when the WAL source switched to archive, not every time in archive mode. I'm attaching v2 patch here with, please review it further. Just for the records - there's another report of the assertion failure at [1], many thanks to Kyotaro-san for providing consistent reproducible steps. [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Sat, Sep 10, 2022 at 07:52:01AM +0530, Bharath Rupireddy wrote: > Today, I spent some more time on this issue, I modified the v1 patch > posted upthread a bit - now resetting the InstallXLogFileSegmentActive > only when the WAL source switched to archive, not every time in > archive mode. > > I'm attaching v2 patch here with, please review it further. > > Just for the records - there's another report of the assertion failure > at [1], many thanks to Kyotaro-san for providing consistent > reproducible steps. > > [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com Well, the fact that cc2c7d6 is involved here makes this thread an open item for PG15 as far as I can see, assigned to Noah (added now in CC). While reading your last patch, I have found rather confusing that we only reset InstallXLogFileSegmentActive when the current source is the archives and it does not match the old source. This code is already complicated, and I don't think that having more assumptions in its internals is a good thing when it comes to its long-term maintenance. In short, HEAD is rather conservative when it comes to set InstallXLogFileSegmentActive, switching it only when we request streaming with RequestXLogStreaming(), but too aggressive when it comes to reset it and we want something in the middle ground. FWIW, I find better the approach taken by Horiguchi-san in [1] to reset the state before we attempt to read WAL from the archives *or* pg_wal, after we know that the last source has failed, where we know that we are not streaming yet (but recovery may be requested soon). Side note: I don't see much the point of having two routines named SetInstallXLogFileSegmentActive and ResetInstallXLogFileSegmentActive that do the opposite thing. We could just have one. [1]: https://www.postgresql.org/message-id/20220214.171428.735280610520122187.horikyota.ntt@gmail.com -- Michael
Attachment
On Mon, Sep 12, 2022 at 12:30 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Sat, Sep 10, 2022 at 07:52:01AM +0530, Bharath Rupireddy wrote: > > Today, I spent some more time on this issue, I modified the v1 patch > > posted upthread a bit - now resetting the InstallXLogFileSegmentActive > > only when the WAL source switched to archive, not every time in > > archive mode. > > > > I'm attaching v2 patch here with, please review it further. > > > > Just for the records - there's another report of the assertion failure > > at [1], many thanks to Kyotaro-san for providing consistent > > reproducible steps. > > > > [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com > > Well, the fact that cc2c7d6 is involved here makes this thread an open > item for PG15 as far as I can see, assigned to Noah (added now in > CC). Thanks. > While reading your last patch, I have found rather confusing that we > only reset InstallXLogFileSegmentActive when the current source is the > archives and it does not match the old source. This code is already > complicated, and I don't think that having more assumptions in its > internals is a good thing when it comes to its long-term maintenance. > In short, HEAD is rather conservative when it comes to set > InstallXLogFileSegmentActive, switching it only when we request > streaming with RequestXLogStreaming(), but too aggressive when it > comes to reset it and we want something in the middle ground. FWIW, I > find better the approach taken by Horiguchi-san in [1] to reset the > state before we attempt to read WAL from the archives *or* pg_wal, > after we know that the last source has failed, where we know that we > are not streaming yet (but recovery may be requested soon). > > [1]: https://www.postgresql.org/message-id/20220214.171428.735280610520122187.horikyota.ntt@gmail.com I think the fix at [1] is wrong. It basically resets InstallXLogFileSegmentActive for both XLOG_FROM_ARCHIVE and XLOG_FROM_PG_WAL, remember that we don't need WAL files preallocation and recycling just for archive. Moreover, if we were to reset just for archive there, it's too aggressive, meaning for every WAL record, we take ControlFileLock in exclusive mode to reset it. IMO, doing it once when we switch the source to archive is the correct way because it avoids frequent ControlFileLock acquisitions and makes the fix more intuitive. And we have a comment saying why we reset InstallXLogFileSegmentActive, if required we can point to the commit cc2c7d6 in the comments. > Side note: I don't see much the point of having two routines named > SetInstallXLogFileSegmentActive and ResetInstallXLogFileSegmentActive > that do the opposite thing. We could just have one. It's an implementation choice. However, I changed it in the v3 patch. Please review the attached v3 patch further. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Mon, Sep 12, 2022 at 05:58:08PM +0530, Bharath Rupireddy wrote: > On Mon, Sep 12, 2022 at 12:30 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Sat, Sep 10, 2022 at 07:52:01AM +0530, Bharath Rupireddy wrote: > > > Just for the records - there's another report of the assertion failure > > > at [1], many thanks to Kyotaro-san for providing consistent > > > reproducible steps. > > > > > > [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com I plan to use that message's patch, because it guarantees WALRCV_STOPPED at the code location being changed. Today, in the unlikely event of !WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code proceeds without waiting for WALRCV_STOPPED. I think WALRCV_STOPPING can't happen there. Even if WALRCV_WAITING also can't happen, it's a simplicity win to remove the need to think about that. Dilip Kumar and Nathan Bossart also stated support for that design. If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I perhaps should back-patch the change to released versions. Does anyone know whether one or both can happen? > > FWIW, I > > find better the approach taken by Horiguchi-san in [1] to reset the > > state before we attempt to read WAL from the archives *or* pg_wal, > > after we know that the last source has failed, where we know that we > > are not streaming yet (but recovery may be requested soon). > > > > [1]: https://www.postgresql.org/message-id/20220214.171428.735280610520122187.horikyota.ntt@gmail.com > > I think the fix at [1] is wrong. It basically resets > InstallXLogFileSegmentActive for both XLOG_FROM_ARCHIVE and > XLOG_FROM_PG_WAL, remember that we don't need WAL files preallocation > and recycling just for archive. Moreover, if we were to reset just for > archive there, it's too aggressive, meaning for every WAL record, we > take ControlFileLock in exclusive mode to reset it. > > IMO, doing it once when we switch the source to archive is the correct > way because it avoids frequent ControlFileLock acquisitions and makes > the fix more intuitive. And we have a comment saying why we reset > InstallXLogFileSegmentActive, if required we can point to the commit > cc2c7d6 in the comments. The startup process maintains the invariant that archive recovery and streaming recovery don't overlap in time; one stops before the other starts. As long as it achieves that, I'm not aware of a cause to care whether the flag reset happens when streaming ends vs. when archive recovery begins. If the startup process develops a defect such that it ceases to maintain that invariant, "when archive recovery begins" does have the advantage-or-disadvantage of causing a failure in non-assert builds. The other way gets only an assertion failure. Michael Paquier or Bharath Rupireddy, did you envision benefits other than that one?
On Mon, Sep 12, 2022 at 08:22:56PM -0700, Noah Misch wrote: > I plan to use that message's patch, because it guarantees WALRCV_STOPPED at > the code location being changed. Today, in the unlikely event of > !WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code > proceeds without waiting for WALRCV_STOPPED. I think WALRCV_STOPPING can't > happen there. Even if WALRCV_WAITING also can't happen, it's a simplicity win > to remove the need to think about that. Dilip Kumar and Nathan Bossart also > stated support for that design. I did not notice this one. Sounds pretty much right. > If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I > perhaps should back-patch the change to released versions. Does anyone know > whether one or both can happen? Hmm. I agree that this could be a good thing in the long-term. > The startup process maintains the invariant that archive recovery and > streaming recovery don't overlap in time; one stops before the other starts. > As long as it achieves that, I'm not aware of a cause to care whether the flag > reset happens when streaming ends vs. when archive recovery begins. If the > startup process develops a defect such that it ceases to maintain that > invariant, "when archive recovery begins" does have the > advantage-or-disadvantage of causing a failure in non-assert builds. The > other way gets only an assertion failure. Michael Paquier or Bharath > Rupireddy, did you envision benefits other than that one? I think that you are forgetting one case here though: a standby where standby.signal is set without restore_command and with primary_conninfo. It could move on with recovery even if the WAL receiver is unstable when something external pushes more WAL segments to the standby's pg_wal/. So my point of upthread is that we should make sure that standby.signal+primary_conninfo resets the flag state when the WAL receiver stops streaming in this case as well. The proposed v1-0001-Do-not-skip-calling-XLogShutdownWalRcv.patch would achieve that, because it does not change the state of InstallXLogFileSegmentActive depending on the source being XLOG_FROM_ARCHIVE. So I'm fine with what you want to do. -- Michael
Attachment
On Tue, Sep 13, 2022 at 8:52 AM Noah Misch <noah@leadboat.com> wrote: > > > > > [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com > > I plan to use that message's patch, because it guarantees WALRCV_STOPPED at > the code location being changed. Today, in the unlikely event of > !WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code > proceeds without waiting for WALRCV_STOPPED. Hm. That was the original fix [2] proposed and it works. The concern is that XLogShutdownWalRcv() does a bunch of work via ShutdownWalRcv() - it calls ConditionVariablePrepareToSleep(), ConditionVariableCancelSleep() (has lock 2 acquisitions and requisitions) and 1 function call WalRcvRunning()) even for WALRCV_STOPPED case, all this is unnecessary IMO when we determine the walreceiver is state is already WALRCV_STOPPED. I think we can do something like [3] to allay this concern and go with the fix proposed at [1] unconditionally calling XLogShutdownWalRcv(). > If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I > perhaps should back-patch the change to released versions. Does anyone know > whether one or both can happen? IMO, we must back-patch to the version where cc2c7d65fc27e877c9f407587b0b92d46cd6dd16 got introduced irrespective of any of the above happening. Thoughts? [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com [2] - https://www.postgresql.org/message-id/flat/CALj2ACUoBWbaFo_t0gew%2Bx6n0V%2BmpvB_23HLvsVD9abgCShV5A%40mail.gmail.com#e762ee94cbbe32a0b8c72c60793626b3 [3] -diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c index 90798b9d53..3487793c7a 100644 --- a/src/backend/replication/walreceiverfuncs.c +++ b/src/backend/replication/walreceiverfuncs.c @@ -181,6 +181,7 @@ ShutdownWalRcv(void) WalRcvData *walrcv = WalRcv; pid_t walrcvpid = 0; bool stopped = false; + bool was_stopped = false; /* * Request walreceiver to stop. Walreceiver will switch to WALRCV_STOPPED @@ -191,6 +192,7 @@ ShutdownWalRcv(void) switch (walrcv->walRcvState) { case WALRCV_STOPPED: + was_stopped = true; break; case WALRCV_STARTING: walrcv->walRcvState = WALRCV_STOPPED; @@ -208,6 +210,10 @@ ShutdownWalRcv(void) } SpinLockRelease(&walrcv->mutex); + /* Quick exit if walreceiver was already stopped. */ + if (was_stopped) + return; + /* Unnecessary but consistent. */ if (stopped) ConditionVariableBroadcast(&walrcv->walRcvStoppedCV); -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
At Tue, 13 Sep 2022 11:56:16 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > On Tue, Sep 13, 2022 at 8:52 AM Noah Misch <noah@leadboat.com> wrote: > > > > > > > [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com > > > > I plan to use that message's patch, because it guarantees WALRCV_STOPPED at > > the code location being changed. Today, in the unlikely event of > > !WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code > > proceeds without waiting for WALRCV_STOPPED. > > Hm. That was the original fix [2] proposed and it works. The concern (Mmm. sorry for omitting that.) > is that XLogShutdownWalRcv() does a bunch of work via ShutdownWalRcv() > - it calls ConditionVariablePrepareToSleep(), Anyway the code path is executed in almost all cases because the same assertion fires otherwise. So I don't see a problem if we do the bunch of synchronization things also in that rare case. I'm not sure we want to do [3]. > > If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I > > perhaps should back-patch the change to released versions. Does anyone know > > whether one or both can happen? > > IMO, we must back-patch to the version where > cc2c7d65fc27e877c9f407587b0b92d46cd6dd16 got introduced irrespective > of any of the above happening. That is, PG15? I agree to that. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Sep 13, 2022 at 3:56 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > > is that XLogShutdownWalRcv() does a bunch of work via ShutdownWalRcv() > > - it calls ConditionVariablePrepareToSleep(), > > Anyway the code path is executed in almost all cases because the same > assertion fires otherwise. So I don't see a problem if we do the bunch > of synchronization things also in that rare case. I'm not sure we > want to do [3]. IMO, we don't need to let ShutdownWalRcv() to do extra work of ConditionVariablePrepareToSleep() - WalRcvRunning() - ConditionVariableCancelSleep() for WALRCV_STOPPED cases when we know the walreceiver status before - even though it doesn't have any problems per se and at that place in the code the WALRCV_STOPPED cases are more frequent than any other walreceiver cases. Having said that, let's also hear from other hackers. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Tue, Sep 13, 2022 at 11:56:16AM +0530, Bharath Rupireddy wrote: > On Tue, Sep 13, 2022 at 8:52 AM Noah Misch <noah@leadboat.com> wrote: > > > > > [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com > > > > I plan to use that message's patch, because it guarantees WALRCV_STOPPED at > > the code location being changed. Today, in the unlikely event of > > !WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code > > proceeds without waiting for WALRCV_STOPPED. Pushed that way. > Hm. That was the original fix [2] proposed and it works. The concern > is that XLogShutdownWalRcv() does a bunch of work via ShutdownWalRcv() > - it calls ConditionVariablePrepareToSleep(), > ConditionVariableCancelSleep() (has lock 2 acquisitions and > requisitions) and 1 function call WalRcvRunning()) even for > WALRCV_STOPPED case, all this is unnecessary IMO when we determine the > walreceiver is state is already WALRCV_STOPPED. That's fine. If we're reaching this code at high frequency, that implies we're also forking walreceiver processes at high frequency. This code would be a trivial part of the overall cost. > > If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I > > perhaps should back-patch the change to released versions. Does anyone know > > whether one or both can happen? If anyone discovers such cases later, we can extend the back-patch then. > IMO, we must back-patch to the version where > cc2c7d65fc27e877c9f407587b0b92d46cd6dd16 got introduced irrespective > of any of the above happening. Correct. The sentences were about *released* versions, not v15.
On Thu, Sep 15, 2022 at 06:58:43AM -0700, Noah Misch wrote: > Pushed that way. Thanks for taking care of it, Noah. -- Michael
Attachment
At Fri, 16 Sep 2022 09:05:46 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Thu, Sep 15, 2022 at 06:58:43AM -0700, Noah Misch wrote: > > Pushed that way. > > Thanks for taking care of it, Noah. +1. Thanks. regards. -- Kyotaro Horiguchi NTT Open Source Software Center