At Fri, 11 Feb 2022 22:25:49 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
> > > I don't think
> > > just making InstallXLogFileSegmentActive false is enough. By looking
> > > at the comment [1], it doesn't make sense to move ahead for restoring
> > > from the archive location without the WAL receiver fully stopped.
> > > IMO, the real fix is to just remove WalRcvStreaming() and call
> > > XLogShutdownWalRcv() unconditionally. Anyways, we have the
> > > Assert(!WalRcvStreaming()); down below. I don't think it will create
> > > any problem.
> >
> > If WalRcvStreaming() is returning false that means walreceiver is
> > already stopped so we don't need to shutdown it externally. I think
> > like we are setting this flag outside start streaming we can reset it
> > also outside XLogShutdownWalRcv. Or I am fine even if we call
> > XLogShutdownWalRcv() because if walreceiver is stopped it will just
> > reset the flag we want it to reset and it will do nothing else.
>
> As I said, I'm okay with just calling XLogShutdownWalRcv()
> unconditionally as it just returns in case walreceiver has already
> stopped and updates the InstallXLogFileSegmentActive flag to false.
>
> Let's also hear what other hackers have to say about this.

First of all, good catch :) And the direction seems right.

This looks like an oversight in cc2c7d65fc. Archive recovery is the
only time we cannot install new WAL segments. Conversely, we must
turn the flag off when entering archive recovery (not when exiting
streaming recovery). So *I* would rather do that at the beginning of
XLOG_FROM_ARCHIVE/PG_WAL than at the end of XLOG_FROM_STREAM.

(I would also like to remove XLogShutdownWalRcv() and turn off the
flag in StartupXLOG explicitly, but that would be overkill.)
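
To illustrate why the place of the reset matters, here is a toy model
of the ordering problem. It is only a sketch and not PostgreSQL code:
every toy_* name is made up for this mail, the two booleans merely
stand in for InstallXLogFileSegmentActive and WalRcvStreaming(), and
locking is left out entirely. The assumption, taken from the
discussion above, is that the shutdown path is skipped when
walreceiver has already exited by itself.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in PostgreSQL. */
static bool toy_install_active = false; /* models InstallXLogFileSegmentActive */
static bool toy_walrcv_running = false; /* models WalRcvStreaming() */

/* Streaming starts: walreceiver runs and segment installation is allowed. */
static void toy_start_streaming(void)
{
    toy_walrcv_running = true;
    toy_install_active = true;
}

/* walreceiver exits by itself (e.g. on timeout); nobody resets the flag. */
static void toy_walrcv_voluntary_exit(void)
{
    toy_walrcv_running = false;
}

/* End of streaming: the shutdown (and flag reset) is skipped if the
 * walreceiver is already gone. */
static void toy_leave_streaming(void)
{
    if (toy_walrcv_running)
    {
        toy_walrcv_running = false;
        toy_install_active = false;
    }
}

/* Start of archive recovery: optionally reset the flag unconditionally. */
static void toy_enter_archive(bool reset_on_entry)
{
    if (reset_on_entry)
        toy_install_active = false;
}

/* Runs the problematic sequence and returns the flag value that archive
 * recovery would see afterwards. */
static bool toy_flag_seen_by_archive(bool reset_on_entry)
{
    toy_install_active = false;
    toy_walrcv_running = false;

    toy_start_streaming();
    toy_walrcv_voluntary_exit();
    toy_leave_streaming();
    toy_enter_archive(reset_on_entry);

    return toy_install_active;
}
```

In this model toy_flag_seen_by_archive(false) returns true, that is,
the flag is still set when archive recovery begins, while
toy_flag_seen_by_archive(true) returns false. Resetting on entry does
not depend on how streaming ended.
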
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -12800,6 +12800,16 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
*/
Assert(!WalRcvStreaming());
+ /*
+ * WAL segment installation conflicts with archive
+ * recovery. Make sure it is turned off. XLogShutdownWalRcv()
+ * does that, but it is not done when the process has voluntarily
+ * exited, for example due to a replication timeout.
+ */
+ LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+ XLogCtl->InstallXLogFileSegmentActive = false;
+ LWLockRelease(ControlFileLock);
+
/* Close any old file we might have open. */
if (readFile >= 0)

regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center