Thread: Assertion failure in WaitForWALToBecomeAvailable state machine

Assertion failure in WaitForWALToBecomeAvailable state machine

From

Dilip Kumar

Date:

11 February 2022, 10:02:45

Hi,

The problem is that whenever we are going for streaming we always set
XLogCtl->InstallXLogFileSegmentActive to true, but while switching
from streaming to archive we do not always reset it so it hits
assertion in some cases. Basically we reset it inside
XLogShutdownWalRcv() but while switching from the streaming mode we
only call it conditionally if WalRcvStreaming().  But it is very much
possible that even before we call WalRcvStreaming() the walreceiver
might have set alrcv->walRcvState to WALRCV_STOPPED.  So now
WalRcvStreaming() will return false.  So I agree now we do not want to
really shut down the walreceiver but who will reset the flag?

I just ran some tests on primary and attached the walreceiver to gdb
and waited for it to exit with timeout and then the recovery process
hit the assertion.

2022-02-11 14:33:56.976 IST [60978] FATAL:  terminating walreceiver
due to timeout
cp: cannot stat
‘/home/dilipkumar/work/PG/install/bin/wal_archive/00000002.history’:
No such file or directory
2022-02-11 14:33:57.002 IST [60973] LOG:  restored log file
"000000010000000000000003" from archive
TRAP: FailedAssertion("!XLogCtl->InstallXLogFileSegmentActive", File:
"xlog.c", Line: 3823, PID: 60973)

I have just applied a quick fix and that solved the issue, basically
if the last failed source was streaming and the WalRcvStreaming() is
false then just reset this flag.

@@ -12717,6 +12717,12 @@ WaitForWALToBecomeAvailable(XLogRecPtr
RecPtr, bool randAccess,
                                         */
                                        if (WalRcvStreaming())
                                                XLogShutdownWalRcv();
+                                       else
+                                       {
+
LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
XLogCtl->InstallXLogFileSegmentActive = false;
+                                               LWLockRelease(ControlFileLock);
+                                       }



--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Bharath Rupireddy

Date:

11 February 2022, 12:51:53

On Fri, Feb 11, 2022 at 3:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Hi,
>
> The problem is that whenever we are going for streaming we always set
> XLogCtl->InstallXLogFileSegmentActive to true, but while switching
> from streaming to archive we do not always reset it so it hits
> assertion in some cases. Basically we reset it inside
> XLogShutdownWalRcv() but while switching from the streaming mode we
> only call it conditionally if WalRcvStreaming().  But it is very much
> possible that even before we call WalRcvStreaming() the walreceiver
> might have set alrcv->walRcvState to WALRCV_STOPPED.  So now
> WalRcvStreaming() will return false.  So I agree now we do not want to
> really shut down the walreceiver but who will reset the flag?
>
> I just ran some tests on primary and attached the walreceiver to gdb
> and waited for it to exit with timeout and then the recovery process
> hit the assertion.
>
> 2022-02-11 14:33:56.976 IST [60978] FATAL:  terminating walreceiver
> due to timeout
> cp: cannot stat
> ‘/home/dilipkumar/work/PG/install/bin/wal_archive/00000002.history’:
> No such file or directory
> 2022-02-11 14:33:57.002 IST [60973] LOG:  restored log file
> "000000010000000000000003" from archive
> TRAP: FailedAssertion("!XLogCtl->InstallXLogFileSegmentActive", File:
> "xlog.c", Line: 3823, PID: 60973)
>
> I have just applied a quick fix and that solved the issue, basically
> if the last failed source was streaming and the WalRcvStreaming() is
> false then just reset this flag.

IIUC, the issue can happen while the walreceiver failed to get WAL
from primary for whatever reasons and its status is not
WALRCV_STOPPING or WALRCV_STOPPED, and the startup process moved ahead
in WaitForWALToBecomeAvailable for reading from archive which ends up
in this assertion failure. ITSM, a rare scenario and it depends on
what walreceiver does between failure to get WAL from primary and
updating status to WALRCV_STOPPING or WALRCV_STOPPED.

If the above race condition is a serious problem, if one thinks at
least it is a problem at all, that needs to be fixed. I don't think
just making InstallXLogFileSegmentActive false is enough. By looking
at the comment [1], it doesn't make sense to move ahead for restoring
from the archive location without the WAL receiver fully stopped.
IMO, the real fix is to just remove WalRcvStreaming() and call
XLogShutdownWalRcv() unconditionally. Anyways, we have the
Assert(!WalRcvStreaming()); down below. I don't think it will create
any problem.

[1]
                    /*
                     * Before we leave XLOG_FROM_STREAM state, make sure that
                     * walreceiver is not active, so that it won't overwrite
                     * WAL that we restore from archive.
                     */
                    if (WalRcvStreaming())
                        XLogShutdownWalRcv();

Regards,
Bharath Rupireddy.

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Dilip Kumar

Date:

11 February 2022, 13:00:43

On Fri, Feb 11, 2022 at 6:22 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Fri, Feb 11, 2022 at 3:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >

> IIUC, the issue can happen while the walreceiver failed to get WAL
> from primary for whatever reasons and its status is not
> WALRCV_STOPPING or WALRCV_STOPPED, and the startup process moved ahead
> in WaitForWALToBecomeAvailable for reading from archive which ends up
> in this assertion failure. ITSM, a rare scenario and it depends on
> what walreceiver does between failure to get WAL from primary and
> updating status to WALRCV_STOPPING or WALRCV_STOPPED.
>
> If the above race condition is a serious problem, if one thinks at
> least it is a problem at all, that needs to be fixed.

I don't think we can design a software which has open race conditions
even though they are rarely occurring :)

I don't think
> just making InstallXLogFileSegmentActive false is enough. By looking
> at the comment [1], it doesn't make sense to move ahead for restoring
> from the archive location without the WAL receiver fully stopped.
> IMO, the real fix is to just remove WalRcvStreaming() and call
> XLogShutdownWalRcv() unconditionally. Anyways, we have the
> Assert(!WalRcvStreaming()); down below. I don't think it will create
> any problem.

If WalRcvStreaming() is returning false that means walreceiver is
already stopped so we don't need to shutdown it externally.  I think
like we are setting this flag outside start streaming we can reset it
also outside XLogShutdownWalRcv.  Or I am fine even if we call
XLogShutdownWalRcv() because if walreceiver is stopped it will just
reset the flag we want it to reset and it will do nothing else.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Bharath Rupireddy

Date:

11 February 2022, 16:55:49

On Fri, Feb 11, 2022 at 6:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Feb 11, 2022 at 6:22 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Fri, Feb 11, 2022 at 3:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
>
> > IIUC, the issue can happen while the walreceiver failed to get WAL
> > from primary for whatever reasons and its status is not
> > WALRCV_STOPPING or WALRCV_STOPPED, and the startup process moved ahead
> > in WaitForWALToBecomeAvailable for reading from archive which ends up
> > in this assertion failure. ITSM, a rare scenario and it depends on
> > what walreceiver does between failure to get WAL from primary and
> > updating status to WALRCV_STOPPING or WALRCV_STOPPED.
> >
> > If the above race condition is a serious problem, if one thinks at
> > least it is a problem at all, that needs to be fixed.
>
> I don't think we can design a software which has open race conditions
> even though they are rarely occurring :)

Yes.

> I don't think
> > just making InstallXLogFileSegmentActive false is enough. By looking
> > at the comment [1], it doesn't make sense to move ahead for restoring
> > from the archive location without the WAL receiver fully stopped.
> > IMO, the real fix is to just remove WalRcvStreaming() and call
> > XLogShutdownWalRcv() unconditionally. Anyways, we have the
> > Assert(!WalRcvStreaming()); down below. I don't think it will create
> > any problem.
>
> If WalRcvStreaming() is returning false that means walreceiver is
> already stopped so we don't need to shutdown it externally.  I think
> like we are setting this flag outside start streaming we can reset it
> also outside XLogShutdownWalRcv.  Or I am fine even if we call
> XLogShutdownWalRcv() because if walreceiver is stopped it will just
> reset the flag we want it to reset and it will do nothing else.

As I said, I'm okay with just calling XLogShutdownWalRcv()
unconditionally as it just returns in case walreceiver has already
stopped and updates the InstallXLogFileSegmentActive flag to false.

Let's also hear what other hackers have to say about this.

Regards,
Bharath Rupireddy.

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Kyotaro Horiguchi

Date:

14 February 2022, 08:14:28

At Fri, 11 Feb 2022 22:25:49 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> > I don't think
> > > just making InstallXLogFileSegmentActive false is enough. By looking
> > > at the comment [1], it doesn't make sense to move ahead for restoring
> > > from the archive location without the WAL receiver fully stopped.
> > > IMO, the real fix is to just remove WalRcvStreaming() and call
> > > XLogShutdownWalRcv() unconditionally. Anyways, we have the
> > > Assert(!WalRcvStreaming()); down below. I don't think it will create
> > > any problem.
> >
> > If WalRcvStreaming() is returning false that means walreceiver is
> > already stopped so we don't need to shutdown it externally.  I think
> > like we are setting this flag outside start streaming we can reset it
> > also outside XLogShutdownWalRcv.  Or I am fine even if we call
> > XLogShutdownWalRcv() because if walreceiver is stopped it will just
> > reset the flag we want it to reset and it will do nothing else.
> 
> As I said, I'm okay with just calling XLogShutdownWalRcv()
> unconditionally as it just returns in case walreceiver has already
> stopped and updates the InstallXLogFileSegmentActive flag to false.
> 
> Let's also hear what other hackers have to say about this.

Firstly, good catch:)  And the direction seems right.

It seems like an overlook of cc2c7d65fc. We cannot install new wal
segments only while we're in archive recovery.  Conversely, we must
turn off it when entering archive recovery (not exiting streaming
recovery).  So, *I* feel like to do that at the beginning of
XLOG_FROM_ARCHIVE/PG_WAL rather than the end of XLOG_FROM_STREAM.

(And I would like to remove XLogShutDownWalRcv() and turn off the flag
 in StartupXLOG explicitly, but it would be overdone.)

--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -12800,6 +12800,16 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
                  */
                 Assert(!WalRcvStreaming());
 
+                /*
+                 * WAL segment installation conflicts with archive
+                 * recovery. Make sure it is turned off.  XLogShutdownWalRcv()
+                 * does that but it is not done when the process has voluntary
+                 * exited for example for replication timeout.
+                 */
+                LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+                XLogCtl->InstallXLogFileSegmentActive = false;
+                LWLockRelease(ControlFileLock);
+
                 /* Close any old file we might have open. */
                 if (readFile >= 0)


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Bharath Rupireddy

Date:

11 August 2022, 16:36:24

On Mon, Feb 14, 2022 at 1:44 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Fri, 11 Feb 2022 22:25:49 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
> > > I don't think
> > > > just making InstallXLogFileSegmentActive false is enough. By looking
> > > > at the comment [1], it doesn't make sense to move ahead for restoring
> > > > from the archive location without the WAL receiver fully stopped.
> > > > IMO, the real fix is to just remove WalRcvStreaming() and call
> > > > XLogShutdownWalRcv() unconditionally. Anyways, we have the
> > > > Assert(!WalRcvStreaming()); down below. I don't think it will create
> > > > any problem.
> > >
> > > If WalRcvStreaming() is returning false that means walreceiver is
> > > already stopped so we don't need to shutdown it externally.  I think
> > > like we are setting this flag outside start streaming we can reset it
> > > also outside XLogShutdownWalRcv.  Or I am fine even if we call
> > > XLogShutdownWalRcv() because if walreceiver is stopped it will just
> > > reset the flag we want it to reset and it will do nothing else.
> >
> > As I said, I'm okay with just calling XLogShutdownWalRcv()
> > unconditionally as it just returns in case walreceiver has already
> > stopped and updates the InstallXLogFileSegmentActive flag to false.
> >
> > Let's also hear what other hackers have to say about this.
>
> Firstly, good catch:)  And the direction seems right.
>
> It seems like an overlook of cc2c7d65fc. We cannot install new wal
> segments only while we're in archive recovery.  Conversely, we must
> turn off it when entering archive recovery (not exiting streaming
> recovery).  So, *I* feel like to do that at the beginning of
> XLOG_FROM_ARCHIVE/PG_WAL rather than the end of XLOG_FROM_STREAM.
>
> (And I would like to remove XLogShutDownWalRcv() and turn off the flag
>  in StartupXLOG explicitly, but it would be overdone.)
>
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -12800,6 +12800,16 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
>                                  */
>                                 Assert(!WalRcvStreaming());
>
> +                               /*
> +                                * WAL segment installation conflicts with archive
> +                                * recovery. Make sure it is turned off.  XLogShutdownWalRcv()
> +                                * does that but it is not done when the process has voluntary
> +                                * exited for example for replication timeout.
> +                                */
> +                               LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> +                               XLogCtl->InstallXLogFileSegmentActive = false;
> +                               LWLockRelease(ControlFileLock);
> +
>                                 /* Close any old file we might have open. */
>                                 if (readFile >= 0)

Today I encountered the assertion failure [2] twice while working on
another patch [1]. The pattern seems to be that the walreceiver got
killed or crashed and set it's state to WALRCV_STOPPING or
WALRCV_STOPPED by the team the WAL state machine moves to archive and
hence the following XLogShutdownWalRcv() code will not get hit:

                    /*
                     * Before we leave XLOG_FROM_STREAM state, make sure that
                     * walreceiver is not active, so that it won't overwrite
                     * WAL that we restore from archive.
                     */
                    if (WalRcvStreaming())
                        ShutdownWalRcv();

I agree with Kyotaro-san to reset InstallXLogFileSegmentActive before
entering into the archive mode. Hence I tweaked the code introduced by
the following commit a bit, the result v1 patch is attached herewith.
Please review it.

commit cc2c7d65fc27e877c9f407587b0b92d46cd6dd16
Author: Noah Misch <noah@leadboat.com>
Date:   Mon Jun 28 18:34:56 2021 -0700

    Skip WAL recycling and preallocation during archive recovery.

[1] https://www.postgresql.org/message-id/CALj2ACUYz1z6QPduGn5gguCkfd-ko44j4hKcOMtp6fzv9xEWgw%40mail.gmail.com
[2]
2022-08-11 05:10:29.051 UTC [1487086] FATAL:  terminating walreceiver
due to timeout
cp: cannot stat '/home/ubuntu/archived_wal/00000002.history': No such
file or directory
2022-08-11 05:10:29.054 UTC [1487075] LOG:  switched WAL source from
stream to archive after failure
2022-08-11 05:10:29.067 UTC [1487075] LOG:  restored log file
"000000010000000000000034" from archive
TRAP: FailedAssertion("!IsInstallXLogFileSegmentActive()", File:
"xlogrecovery.c", Line: 4149, PID: 1487075)
postgres: startup waiting for
000000010000000000000034(ExceptionalCondition+0xd6)[0x559d20e6deb1]
postgres: startup waiting for
000000010000000000000034(+0x1e4e14)[0x559d20813e14]
postgres: startup waiting for
000000010000000000000034(+0x1e50c0)[0x559d208140c0]
postgres: startup waiting for
000000010000000000000034(+0x1e4354)[0x559d20813354]
postgres: startup waiting for
000000010000000000000034(+0x1e3946)[0x559d20812946]
postgres: startup waiting for
000000010000000000000034(+0x1dbc58)[0x559d2080ac58]
postgres: startup waiting for
000000010000000000000034(+0x1db3ce)[0x559d2080a3ce]
postgres: startup waiting for
000000010000000000000034(XLogReadAhead+0x3d)[0x559d2080aa23]
postgres: startup waiting for
000000010000000000000034(+0x1d9713)[0x559d20808713]
postgres: startup waiting for
000000010000000000000034(+0x1d909c)[0x559d2080809c]
postgres: startup waiting for
000000010000000000000034(XLogPrefetcherReadRecord+0xf9)[0x559d208092a2]
postgres: startup waiting for
000000010000000000000034(+0x1e3427)[0x559d20812427]
postgres: startup waiting for
000000010000000000000034(PerformWalRecovery+0x465)[0x559d2080fd0f]
postgres: startup waiting for
000000010000000000000034(StartupXLOG+0x987)[0x559d207fc732]
postgres: startup waiting for
000000010000000000000034(StartupProcessMain+0xf8)[0x559d20bc3990]
postgres: startup waiting for
000000010000000000000034(AuxiliaryProcessMain+0x1ae)[0x559d20bb62cf]
postgres: startup waiting for
000000010000000000000034(+0x593020)[0x559d20bc2020]
postgres: startup waiting for
000000010000000000000034(+0x591ebb)[0x559d20bc0ebb]
postgres: startup waiting for
000000010000000000000034(+0x590907)[0x559d20bbf907]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fe518b26520]
/lib/x86_64-linux-gnu/libc.so.6(__select+0xbd)[0x7fe518bff74d]
postgres: startup waiting for
000000010000000000000034(+0x58e1bd)[0x559d20bbd1bd]
postgres: startup waiting for
000000010000000000000034(PostmasterMain+0x14cc)[0x559d20bbcaa2]
postgres: startup waiting for
000000010000000000000034(+0x44a752)[0x559d20a79752]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fe518b0dd90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fe518b0de40]
postgres: startup waiting for
000000010000000000000034(_start+0x25)[0x559d206fe5b5]
2022-08-11 05:10:29.161 UTC [1486413] LOG:  startup process (PID
1487075) was terminated by signal 6: Aborted
2022-08-11 05:10:29.161 UTC [1486413] LOG:  terminating any other
active server processes
2022-08-11 05:10:29.169 UTC [1486413] LOG:  shutting down due to
startup process failure
2022-08-11 05:10:29.181 UTC [1486413] LOG:  database system is shut down

2022-08-11 05:34:54.234 UTC [1509079] LOG:  switched WAL source from
stream to archive after failure
2022-08-11 05:34:54.234 UTC [1509079] LOG:  current WAL source is
archive and last WAL source is stream, standby mode 1
2022-08-11 05:34:54.248 UTC [1509079] LOG:  restored log file
"000000010000000000000003" from archive
TRAP: FailedAssertion("!IsInstallXLogFileSegmentActive()", File:
"xlogrecovery.c", Line: 4155, PID: 1509079)
postgres: startup waiting for
000000010000000000000003(ExceptionalCondition+0xd6)[0x55e9cb1cbf2a]
postgres: startup waiting for
000000010000000000000003(+0x1e4e8d)[0x55e9cab71e8d]
postgres: startup waiting for
000000010000000000000003(+0x1e5139)[0x55e9cab72139]
postgres: startup waiting for
000000010000000000000003(+0x1e43cd)[0x55e9cab713cd]
postgres: startup waiting for
000000010000000000000003(+0x1e3946)[0x55e9cab70946]
postgres: startup waiting for
000000010000000000000003(+0x1dbc58)[0x55e9cab68c58]
postgres: startup waiting for
000000010000000000000003(+0x1db0b7)[0x55e9cab680b7]
postgres: startup waiting for
000000010000000000000003(XLogReadAhead+0x3d)[0x55e9cab68a23]
postgres: startup waiting for
000000010000000000000003(+0x1d9713)[0x55e9cab66713]
postgres: startup waiting for
000000010000000000000003(+0x1d909c)[0x55e9cab6609c]
postgres: startup waiting for
000000010000000000000003(XLogPrefetcherReadRecord+0xf9)[0x55e9cab672a2]
postgres: startup waiting for
000000010000000000000003(+0x1e3427)[0x55e9cab70427]
postgres: startup waiting for
000000010000000000000003(+0x1e4911)[0x55e9cab71911]
postgres: startup waiting for
000000010000000000000003(InitWalRecovery+0xaf4)[0x55e9cab6bc4e]
postgres: startup waiting for
000000010000000000000003(StartupXLOG+0x4af)[0x55e9cab5a25a]
postgres: startup waiting for
000000010000000000000003(StartupProcessMain+0xf8)[0x55e9caf21a09]
postgres: startup waiting for
000000010000000000000003(AuxiliaryProcessMain+0x1ae)[0x55e9caf14348]
postgres: startup waiting for
000000010000000000000003(+0x593099)[0x55e9caf20099]
postgres: startup waiting for
000000010000000000000003(PostmasterMain+0x1476)[0x55e9caf1aac5]
postgres: startup waiting for
000000010000000000000003(+0x44a7cb)[0x55e9cadd77cb]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f754c287d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f754c287e40]
postgres: startup waiting for
000000010000000000000003(_start+0x25)[0x55e9caa5c5b5]
2022-08-11 05:34:54.336 UTC [1509076] LOG:  startup process (PID
1509079) was terminated by signal 6: Aborted
2022-08-11 05:34:54.336 UTC [1509076] LOG:  aborting startup due to
startup process failure
2022-08-11 05:34:54.337 UTC [1509076] LOG:  database system is shut down

-- 
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/

Attachment

v1-0001-Avoid-race-condition-in-resetting-XLogCtl-Install.patch

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Bharath Rupireddy

Date:

15 August 2022, 06:00:49

On Thu, Aug 11, 2022 at 10:06 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Today I encountered the assertion failure [2] twice while working on
> another patch [1]. The pattern seems to be that the walreceiver got
> killed or crashed and set it's state to WALRCV_STOPPING or
> WALRCV_STOPPED by the team the WAL state machine moves to archive and
> hence the following XLogShutdownWalRcv() code will not get hit:
>
>                     /*
>                      * Before we leave XLOG_FROM_STREAM state, make sure that
>                      * walreceiver is not active, so that it won't overwrite
>                      * WAL that we restore from archive.
>                      */
>                     if (WalRcvStreaming())
>                         ShutdownWalRcv();
>
> I agree with Kyotaro-san to reset InstallXLogFileSegmentActive before
> entering into the archive mode. Hence I tweaked the code introduced by
> the following commit a bit, the result v1 patch is attached herewith.
> Please review it.

I added it to the current commitfest to not lose track of it:
https://commitfest.postgresql.org/39/3814/.

-- 
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Bharath Rupireddy

Date:

10 September 2022, 02:22:01

On Mon, Aug 15, 2022 at 11:30 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Aug 11, 2022 at 10:06 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > Today I encountered the assertion failure [2] twice while working on
> > another patch [1]. The pattern seems to be that the walreceiver got
> > killed or crashed and set it's state to WALRCV_STOPPING or
> > WALRCV_STOPPED by the team the WAL state machine moves to archive and
> > hence the following XLogShutdownWalRcv() code will not get hit:
> >
> >                     /*
> >                      * Before we leave XLOG_FROM_STREAM state, make sure that
> >                      * walreceiver is not active, so that it won't overwrite
> >                      * WAL that we restore from archive.
> >                      */
> >                     if (WalRcvStreaming())
> >                         ShutdownWalRcv();
> >
> > I agree with Kyotaro-san to reset InstallXLogFileSegmentActive before
> > entering into the archive mode. Hence I tweaked the code introduced by
> > the following commit a bit, the result v1 patch is attached herewith.
> > Please review it.
>
> I added it to the current commitfest to not lose track of it:
> https://commitfest.postgresql.org/39/3814/.

Today, I spent some more time on this issue, I modified the v1 patch
posted upthread a bit - now resetting the InstallXLogFileSegmentActive
only when the WAL source switched to archive, not every time in
archive mode.

I'm attaching v2 patch here with, please review it further.

Just for the records - there's another report of the assertion failure
at [1], many thanks to Kyotaro-san for providing consistent
reproducible steps.

[1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

v2-0001-Avoid-race-condition-in-resetting-XLogCtl-Install.patch

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Michael Paquier

Date:

12 September 2022, 07:00:29

On Sat, Sep 10, 2022 at 07:52:01AM +0530, Bharath Rupireddy wrote:
> Today, I spent some more time on this issue, I modified the v1 patch
> posted upthread a bit - now resetting the InstallXLogFileSegmentActive
> only when the WAL source switched to archive, not every time in
> archive mode.
>
> I'm attaching v2 patch here with, please review it further.
>
> Just for the records - there's another report of the assertion failure
> at [1], many thanks to Kyotaro-san for providing consistent
> reproducible steps.
>
> [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com

Well, the fact that cc2c7d6 is involved here makes this thread an open
item for PG15 as far as I can see, assigned to Noah (added now in
CC).

While reading your last patch, I have found rather confusing that we
only reset InstallXLogFileSegmentActive when the current source is the
archives and it does not match the old source.  This code is already
complicated, and I don't think that having more assumptions in its
internals is a good thing when it comes to its long-term maintenance.
In short, HEAD is rather conservative when it comes to set
InstallXLogFileSegmentActive, switching it only when we request
streaming with RequestXLogStreaming(), but too aggressive when it
comes to reset it and we want something in the middle ground.  FWIW, I
find better the approach taken by Horiguchi-san in [1] to reset the
state before we attempt to read WAL from the archives *or* pg_wal,
after we know that the last source has failed, where we know that we
are not streaming yet (but recovery may be requested soon).

Side note: I don't see much the point of having two routines named
SetInstallXLogFileSegmentActive and ResetInstallXLogFileSegmentActive
that do the opposite thing.  We could just have one.

[1]: https://www.postgresql.org/message-id/20220214.171428.735280610520122187.horikyota.ntt@gmail.com
--
Michael

Attachment

signature.asc

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Bharath Rupireddy

Date:

12 September 2022, 12:28:08

On Mon, Sep 12, 2022 at 12:30 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Sep 10, 2022 at 07:52:01AM +0530, Bharath Rupireddy wrote:
> > Today, I spent some more time on this issue, I modified the v1 patch
> > posted upthread a bit - now resetting the InstallXLogFileSegmentActive
> > only when the WAL source switched to archive, not every time in
> > archive mode.
> >
> > I'm attaching v2 patch here with, please review it further.
> >
> > Just for the records - there's another report of the assertion failure
> > at [1], many thanks to Kyotaro-san for providing consistent
> > reproducible steps.
> >
> > [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com
>
> Well, the fact that cc2c7d6 is involved here makes this thread an open
> item for PG15 as far as I can see, assigned to Noah (added now in
> CC).

Thanks.

> While reading your last patch, I have found rather confusing that we
> only reset InstallXLogFileSegmentActive when the current source is the
> archives and it does not match the old source.  This code is already
> complicated, and I don't think that having more assumptions in its
> internals is a good thing when it comes to its long-term maintenance.
> In short, HEAD is rather conservative when it comes to set
> InstallXLogFileSegmentActive, switching it only when we request
> streaming with RequestXLogStreaming(), but too aggressive when it
> comes to reset it and we want something in the middle ground.  FWIW, I
> find better the approach taken by Horiguchi-san in [1] to reset the
> state before we attempt to read WAL from the archives *or* pg_wal,
> after we know that the last source has failed, where we know that we
> are not streaming yet (but recovery may be requested soon).
>
> [1]: https://www.postgresql.org/message-id/20220214.171428.735280610520122187.horikyota.ntt@gmail.com

I think the fix at [1] is wrong. It basically resets
InstallXLogFileSegmentActive for both XLOG_FROM_ARCHIVE and
XLOG_FROM_PG_WAL, remember that we don't need WAL files preallocation
and recycling just for archive. Moreover, if we were to reset just for
archive there, it's too aggressive, meaning for every WAL record, we
take ControlFileLock in exclusive mode to reset it.

IMO, doing it once when we switch the source to archive is the correct
way because it avoids frequent ControlFileLock acquisitions and makes
the fix more intuitive. And we have a comment saying why we reset
InstallXLogFileSegmentActive, if required we can point to the commit
cc2c7d6 in the comments.

> Side note: I don't see much the point of having two routines named
> SetInstallXLogFileSegmentActive and ResetInstallXLogFileSegmentActive
> that do the opposite thing.  We could just have one.

It's an implementation choice. However, I changed it in the v3 patch.

Please review the attached v3 patch further.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

v3-0001-Avoid-race-condition-in-resetting-XLogCtl-Install.patch

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Noah Misch

Date:

13 September 2022, 03:22:56

On Mon, Sep 12, 2022 at 05:58:08PM +0530, Bharath Rupireddy wrote:
> On Mon, Sep 12, 2022 at 12:30 PM Michael Paquier <michael@paquier.xyz> wrote:
> > On Sat, Sep 10, 2022 at 07:52:01AM +0530, Bharath Rupireddy wrote:
> > > Just for the records - there's another report of the assertion failure
> > > at [1], many thanks to Kyotaro-san for providing consistent
> > > reproducible steps.
> > >
> > > [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com

I plan to use that message's patch, because it guarantees WALRCV_STOPPED at
the code location being changed.  Today, in the unlikely event of
!WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code
proceeds without waiting for WALRCV_STOPPED.  I think WALRCV_STOPPING can't
happen there.  Even if WALRCV_WAITING also can't happen, it's a simplicity win
to remove the need to think about that.  Dilip Kumar and Nathan Bossart also
stated support for that design.

If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I
perhaps should back-patch the change to released versions.  Does anyone know
whether one or both can happen?

> > FWIW, I
> > find better the approach taken by Horiguchi-san in [1] to reset the
> > state before we attempt to read WAL from the archives *or* pg_wal,
> > after we know that the last source has failed, where we know that we
> > are not streaming yet (but recovery may be requested soon).
> >
> > [1]: https://www.postgresql.org/message-id/20220214.171428.735280610520122187.horikyota.ntt@gmail.com
> 
> I think the fix at [1] is wrong. It basically resets
> InstallXLogFileSegmentActive for both XLOG_FROM_ARCHIVE and
> XLOG_FROM_PG_WAL, remember that we don't need WAL files preallocation
> and recycling just for archive. Moreover, if we were to reset just for
> archive there, it's too aggressive, meaning for every WAL record, we
> take ControlFileLock in exclusive mode to reset it.
> 
> IMO, doing it once when we switch the source to archive is the correct
> way because it avoids frequent ControlFileLock acquisitions and makes
> the fix more intuitive. And we have a comment saying why we reset
> InstallXLogFileSegmentActive, if required we can point to the commit
> cc2c7d6 in the comments.

The startup process maintains the invariant that archive recovery and
streaming recovery don't overlap in time; one stops before the other starts.
As long as it achieves that, I'm not aware of a cause to care whether the flag
reset happens when streaming ends vs. when archive recovery begins.  If the
startup process develops a defect such that it ceases to maintain that
invariant, "when archive recovery begins" does have the
advantage-or-disadvantage of causing a failure in non-assert builds.  The
other way gets only an assertion failure.  Michael Paquier or Bharath
Rupireddy, did you envision benefits other than that one?

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Michael Paquier

Date:

13 September 2022, 04:10:36

On Mon, Sep 12, 2022 at 08:22:56PM -0700, Noah Misch wrote:
> I plan to use that message's patch, because it guarantees WALRCV_STOPPED at
> the code location being changed.  Today, in the unlikely event of
> !WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code
> proceeds without waiting for WALRCV_STOPPED.  I think WALRCV_STOPPING can't
> happen there.  Even if WALRCV_WAITING also can't happen, it's a simplicity win
> to remove the need to think about that.  Dilip Kumar and Nathan Bossart also
> stated support for that design.

I did not notice this one.  Sounds pretty much right.

> If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I
> perhaps should back-patch the change to released versions.  Does anyone know
> whether one or both can happen?

Hmm.  I agree that this could be a good thing in the long-term.

> The startup process maintains the invariant that archive recovery and
> streaming recovery don't overlap in time; one stops before the other starts.
> As long as it achieves that, I'm not aware of a cause to care whether the flag
> reset happens when streaming ends vs. when archive recovery begins.  If the
> startup process develops a defect such that it ceases to maintain that
> invariant, "when archive recovery begins" does have the
> advantage-or-disadvantage of causing a failure in non-assert builds.  The
> other way gets only an assertion failure.  Michael Paquier or Bharath
> Rupireddy, did you envision benefits other than that one?

I think that you are forgetting one case here though: a standby where
standby.signal is set without restore_command and with
primary_conninfo.  It could move on with recovery even if the WAL
receiver is unstable when something external pushes more WAL segments
to the standby's pg_wal/.  So my point of upthread is that we should
make sure that standby.signal+primary_conninfo resets the flag state
when the WAL receiver stops streaming in this case as well.

The proposed v1-0001-Do-not-skip-calling-XLogShutdownWalRcv.patch
would achieve that, because it does not change the state of
InstallXLogFileSegmentActive depending on the source being
XLOG_FROM_ARCHIVE.  So I'm fine with what you want to do.
--
Michael

Attachment

signature.asc

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Bharath Rupireddy

Date:

13 September 2022, 06:26:16

On Tue, Sep 13, 2022 at 8:52 AM Noah Misch <noah@leadboat.com> wrote:
>
> > > > [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com
>
> I plan to use that message's patch, because it guarantees WALRCV_STOPPED at
> the code location being changed.  Today, in the unlikely event of
> !WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code
> proceeds without waiting for WALRCV_STOPPED.

Hm. That was the original fix [2] proposed and it works. The concern
is that XLogShutdownWalRcv() does a bunch of work via ShutdownWalRcv()
- it calls ConditionVariablePrepareToSleep(),
ConditionVariableCancelSleep() (has lock 2 acquisitions and
requisitions) and 1 function call WalRcvRunning()) even for
WALRCV_STOPPED case, all this is unnecessary IMO when we determine the
walreceiver is state is already WALRCV_STOPPED. I think we can do
something like [3] to allay this concern and go with the fix proposed
at [1] unconditionally calling XLogShutdownWalRcv().

> If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I
> perhaps should back-patch the change to released versions.  Does anyone know
> whether one or both can happen?

IMO, we must back-patch to the version where
cc2c7d65fc27e877c9f407587b0b92d46cd6dd16  got introduced irrespective
of any of the above happening.

Thoughts?

[1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com
[2] -
https://www.postgresql.org/message-id/flat/CALj2ACUoBWbaFo_t0gew%2Bx6n0V%2BmpvB_23HLvsVD9abgCShV5A%40mail.gmail.com#e762ee94cbbe32a0b8c72c60793626b3
[3] -diff --git a/src/backend/replication/walreceiverfuncs.c
b/src/backend/replication/walreceiverfuncs.c
index 90798b9d53..3487793c7a 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -181,6 +181,7 @@ ShutdownWalRcv(void)
        WalRcvData *walrcv = WalRcv;
        pid_t           walrcvpid = 0;
        bool            stopped = false;
+       bool            was_stopped = false;

        /*
         * Request walreceiver to stop. Walreceiver will switch to
WALRCV_STOPPED
@@ -191,6 +192,7 @@ ShutdownWalRcv(void)
        switch (walrcv->walRcvState)
        {
                case WALRCV_STOPPED:
+                       was_stopped = true;
                        break;
                case WALRCV_STARTING:
                        walrcv->walRcvState = WALRCV_STOPPED;
@@ -208,6 +210,10 @@ ShutdownWalRcv(void)
        }
        SpinLockRelease(&walrcv->mutex);

+       /* Quick exit if walreceiver was already stopped. */
+       if (was_stopped)
+               return;
+
        /* Unnecessary but consistent. */
        if (stopped)
                ConditionVariableBroadcast(&walrcv->walRcvStoppedCV);

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Kyotaro Horiguchi

Date:

13 September 2022, 10:26:29

At Tue, 13 Sep 2022 11:56:16 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> On Tue, Sep 13, 2022 at 8:52 AM Noah Misch <noah@leadboat.com> wrote:
> >
> > > > > [1] -
https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com
> >
> > I plan to use that message's patch, because it guarantees WALRCV_STOPPED at
> > the code location being changed.  Today, in the unlikely event of
> > !WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code
> > proceeds without waiting for WALRCV_STOPPED.
> 
> Hm. That was the original fix [2] proposed and it works. The concern

(Mmm. sorry for omitting that.)

> is that XLogShutdownWalRcv() does a bunch of work via ShutdownWalRcv()
> - it calls ConditionVariablePrepareToSleep(),

Anyway the code path is executed in almost all cases because the same
assertion fires otherwise. So I don't see a problem if we do the bunch
of synchronization things also in that rare case.  I'm not sure we
want to do [3].

> > If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I
> > perhaps should back-patch the change to released versions.  Does anyone know
> > whether one or both can happen?
> 
> IMO, we must back-patch to the version where
> cc2c7d65fc27e877c9f407587b0b92d46cd6dd16  got introduced irrespective
> of any of the above happening.

That is, PG15? I agree to that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Bharath Rupireddy

Date:

15 September 2022, 10:25:54

On Tue, Sep 13, 2022 at 3:56 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> > is that XLogShutdownWalRcv() does a bunch of work via ShutdownWalRcv()
> > - it calls ConditionVariablePrepareToSleep(),
>
> Anyway the code path is executed in almost all cases because the same
> assertion fires otherwise. So I don't see a problem if we do the bunch
> of synchronization things also in that rare case.  I'm not sure we
> want to do [3].

IMO, we don't need to let ShutdownWalRcv() to do extra work of
ConditionVariablePrepareToSleep() - WalRcvRunning() -
ConditionVariableCancelSleep() for WALRCV_STOPPED cases when we know
the walreceiver status before - even though it doesn't have any
problems per se and at that place in the code the WALRCV_STOPPED cases
are more frequent than any other walreceiver cases.

Having said that, let's also hear from other hackers.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Noah Misch

Date:

15 September 2022, 13:58:43

On Tue, Sep 13, 2022 at 11:56:16AM +0530, Bharath Rupireddy wrote:
> On Tue, Sep 13, 2022 at 8:52 AM Noah Misch <noah@leadboat.com> wrote:
> > > > > [1] -
https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com
> >
> > I plan to use that message's patch, because it guarantees WALRCV_STOPPED at
> > the code location being changed.  Today, in the unlikely event of
> > !WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code
> > proceeds without waiting for WALRCV_STOPPED.

Pushed that way.

> Hm. That was the original fix [2] proposed and it works. The concern
> is that XLogShutdownWalRcv() does a bunch of work via ShutdownWalRcv()
> - it calls ConditionVariablePrepareToSleep(),
> ConditionVariableCancelSleep() (has lock 2 acquisitions and
> requisitions) and 1 function call WalRcvRunning()) even for
> WALRCV_STOPPED case, all this is unnecessary IMO when we determine the
> walreceiver is state is already WALRCV_STOPPED.

That's fine.  If we're reaching this code at high frequency, that implies
we're also forking walreceiver processes at high frequency.  This code would
be a trivial part of the overall cost.

> > If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I
> > perhaps should back-patch the change to released versions.  Does anyone know
> > whether one or both can happen?

If anyone discovers such cases later, we can extend the back-patch then.

> IMO, we must back-patch to the version where
> cc2c7d65fc27e877c9f407587b0b92d46cd6dd16  got introduced irrespective
> of any of the above happening.

Correct.  The sentences were about *released* versions, not v15.

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Michael Paquier

Date:

16 September 2022, 00:05:46

On Thu, Sep 15, 2022 at 06:58:43AM -0700, Noah Misch wrote:
> Pushed that way.

Thanks for taking care of it, Noah.
--
Michael

Attachment

signature.asc

Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From

Kyotaro Horiguchi

Date:

16 September 2022, 01:45:54

At Fri, 16 Sep 2022 09:05:46 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Thu, Sep 15, 2022 at 06:58:43AM -0700, Noah Misch wrote:
> > Pushed that way.
> 
> Thanks for taking care of it, Noah.

+1. Thanks.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center