Walsender may fail to send wal to the end. - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Walsender may fail to send wal to the end.
Date
Msg-id 20210326.182014.298226099985413968.horikyota.ntt@gmail.com
Responses Re: Walsender may fail to send wal to the end.  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hello. I happened to notice some dubious behavior in walsender.

On a replication setup with wal_keep_size (or wal_keep_segments) = 0,
running the following command on the primary causes walsender to fail
to send WAL up to the final shutdown checkpoint record to the standby.

(create table t in advance)

psql -c 'insert into t values(0); select pg_switch_wal();'; pg_ctl stop

The primary complains like this:

2021-03-26 17:59:29.324 JST [checkpointer][140697] LOG:  shutting down
2021-03-26 17:59:29.387 JST [walsender][140816] ERROR:  requested WAL segment 000000010000000000000032 has already been removed
2021-03-26 17:59:29.387 JST [walsender][140816] STATEMENT:  START_REPLICATION 0/32000000 TIMELINE 1
2021-03-26 17:59:29.394 JST [postmaster][140695] LOG:  database system is shut down
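
For reference, the segment named in the error is the one that contains the START_REPLICATION position 0/32000000. The following is a minimal sketch of that mapping, assuming the default 16 MB WAL segment size; the helper names lsn_to_segno and wal_file_name are hypothetical and only imitate what XLByteToSeg and the WAL file naming scheme compute.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: default 16 MB WAL segment size (wal_segment_size). */
#define SEG_SIZE UINT64_C(0x1000000)

/* Hypothetical helper: segment number that contains the given LSN. */
static uint64_t
lsn_to_segno(uint64_t lsn)
{
    return lsn / SEG_SIZE;
}

/* Hypothetical helper: 24-hex-digit WAL file name for a timeline and segment. */
static void
wal_file_name(char *buf, size_t len, uint32_t tli, uint64_t segno)
{
    /* Number of segments covered by one 4 GB "xlogid" unit. */
    uint64_t segs_per_xlogid = UINT64_C(0x100000000) / SEG_SIZE;

    snprintf(buf, len, "%08X%08X%08X",
             (unsigned) tli,
             (unsigned) (segno / segs_per_xlogid),
             (unsigned) (segno % segs_per_xlogid));
}
```

Under these assumptions, calling wal_file_name() with timeline 1 and lsn_to_segno(0x32000000) yields the file name shown in the ERROR line above.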

This is because XLogSendPhysical detects that the WAL segment it is
currently reading has been removed by the shutdown checkpoint.
However, there's no risk of WAL segments being overwritten at that
point.

So I think we can omit the call to CheckXLogRemoved() while
MyWalSnd->state is WALSNDSTATE_STOPPING, because that state comes
after the shutdown checkpoint completes.

Of course, that doesn't help if walsender was running two segments
behind. There could still be a small window for the failure.  But it
would be a great help in the case of being just one segment behind.
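
The proposed condition depends on WALSNDSTATE_STOPPING being the last value of the WalSndState enum, so that a "less than" comparison covers every earlier state. Below is a standalone sketch of that check; the enum is reproduced from walsender_private.h as I understand it, and should_check_xlog_removed is a hypothetical name used only for illustration, not something in the patch.

```c
#include <stdbool.h>

/* Assumed to mirror the ordering of WalSndState in walsender_private.h. */
typedef enum WalSndState
{
    WALSNDSTATE_STARTUP = 0,
    WALSNDSTATE_BACKUP,
    WALSNDSTATE_CATCHUP,
    WALSNDSTATE_STREAMING,
    WALSNDSTATE_STOPPING
} WalSndState;

/*
 * Hypothetical helper: true when the removal check should still run,
 * i.e. in every state that precedes WALSNDSTATE_STOPPING.
 */
static bool
should_check_xlog_removed(WalSndState state)
{
    return state < WALSNDSTATE_STOPPING;
}
```

With this ordering, a walsender in WALSNDSTATE_STREAMING still performs the check, and one in WALSNDSTATE_STOPPING skips it.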

Is it worth fixing?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 23baa4498a..4b1e0cf9c5 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2755,9 +2755,18 @@ retry:
                  &errinfo))
         WALReadRaiseError(&errinfo);
 
-    /* See logical_read_xlog_page(). */
-    XLByteToSeg(startptr, segno, xlogreader->segcxt.ws_segsize);
-    CheckXLogRemoved(segno, xlogreader->seg.ws_tli);
+    /*
+     * See logical_read_xlog_page().  However, there's a case where we're
+     * reading the segment which is removed/recycled by shutdown checkpoint.
+     * We continue to read it in that case because 1) It's safe because no wal
+     * activity happens after shutdown checkpoint completes, 2) We need to do
+     * our best to send WAL up to the shutdown checkpoint record.
+     */
+    if (MyWalSnd->state < WALSNDSTATE_STOPPING)
+    {
+        XLByteToSeg(startptr, segno, xlogreader->segcxt.ws_segsize);
+        CheckXLogRemoved(segno, xlogreader->seg.ws_tli);
+    }
 
     /*
      * During recovery, the currently-open WAL file might be replaced with the
