Re: Make mesage at end-of-recovery less scary. - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Make mesage at end-of-recovery less scary.
Date
Msg-id 20220209.164414.1807834796155134287.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Make mesage at end-of-recovery less scary.  (Ashutosh Sharma <ashu.coek88@gmail.com>)
Responses Re: Make mesage at end-of-recovery less scary.  (Ashutosh Sharma <ashu.coek88@gmail.com>)
List pgsql-hackers
Hi, Ashutosh.

At Tue, 8 Feb 2022 18:35:34 +0530, Ashutosh Sharma <ashu.coek88@gmail.com> wrote in 
> Here are some of my review comments on the v11 patch:

Thank you for taking a look on this.

> -                       (errmsg_internal("reached end of WAL in
> pg_wal, entering archive recovery")));
> +                       (errmsg_internal("reached end of WAL at %X/%X
> on timeline %u in %s during crash recovery, entering archive
> recovery",
> +                                        LSN_FORMAT_ARGS(ErrRecPtr),
> +                                        replayTLI,
> +                                        xlogSourceNames[currentSource])));
> 
> Why crash recovery? Won't this message get printed even during PITR?

It is in the if-block with the following condition.

>     * If archive recovery was requested, but we were still doing
>     * crash recovery, switch to archive recovery and retry using the
>     * offline archive. We have now replayed all the valid WAL in
>     * pg_wal, so we are presumably now consistent.
...
>    if (!InArchiveRecovery && ArchiveRecoveryRequested)

This means archive-recovery is requested but not started yet. That is,
we've just finished crash recovery.  The existing comment cited
together is mentioning that.

At the end of PITR (or archive recovery), the other code works.

> /*
>  * If we haven't emit an error message, we have safely reached the
>  * end-of-WAL.
>  */
> if (emode_for_corrupt_record(LOG, ErrRecPtr) == LOG)
> {
>     char       *fmt;
> 
>     if (StandbyMode)
>         fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u in %s during standby mode");
>     else if (InArchiveRecovery)
>         fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u in %s during archive recovery");
>     else
>         fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u in %s during crash recovery");

The last among the above messages is choosed when archive-recovery is
not requested at all.

> I just did a PITR and could see these messages in the logfile.

Yeah, the log lines are describing that the server starting with crash
recovery to run PITR.

> 2022-02-08 18:00:44.367 IST [86185] LOG:  starting point-in-time
> recovery to WAL location (LSN) "0/5227790"
> 2022-02-08 18:00:44.368 IST [86185] LOG:  database system was not
> properly shut down; automatic recovery in progress

Well. I guess that the "automatic recovery" is ambiguous.  Does it
make sense if the second line were like the follows instead?

+ 2022-02-08 18:00:44.368 IST [86185] LOG:  database system was not properly shut down; crash recovery in progress

> 2022-02-08 18:00:44.369 IST [86185] LOG:  redo starts at 0/14DC8D8
> 2022-02-08 18:00:44.978 IST [86185] DEBUG1:  reached end of WAL at
> 0/3FFFFD0 on timeline 1 in pg_wal during crash recovery, entering
> archive recovery

(I don't include this change in this patch since it would be another
issue.)

> ==
> 
> +           /*
> +            * If we haven't emit an error message, we have safely reached the
> +            * end-of-WAL.
> +            */
> +           if (emode_for_corrupt_record(LOG, ErrRecPtr) == LOG)
> +           {
> +               char       *fmt;
> +
> +               if (StandbyMode)
> +                   fmt = gettext_noop("reached end of WAL at %X/%X on
> timeline %u in %s during standby mode");
> +               else if (InArchiveRecovery)
> +                   fmt = gettext_noop("reached end of WAL at %X/%X on
> timeline %u in %s during archive recovery");
> +               else
> +                   fmt = gettext_noop("reached end of WAL at %X/%X on
> timeline %u in %s during crash recovery");
> +
> +               ereport(LOG,
> +                       (errmsg(fmt, LSN_FORMAT_ARGS(ErrRecPtr), replayTLI,
> +                               xlogSourceNames[currentSource]),
> +                        (errormsg ? errdetail_internal("%s", errormsg) : 0)));
> +           }
> 
> Doesn't it make sense to add an assert statement inside this if-block
> that will check for xlogreader->EndOfWAL?

Good point.  On second thought, the condition there is flat wrong.
The message is "reached end of WAL" so the condition should be
EndOfWAL.  On the other hand we didn't make sure that the error
message for the stop is emitted anywhere.  Thus I don't particularly
want to be strict on that point.

I made the following change for this.

-            if (emode_for_corrupt_record(LOG, ErrRecPtr) == LOG)
+            if (xlogreader->EndOfWAL)



> ==
> 
> -            * We only end up here without a message when XLogPageRead()
> -            * failed - in that case we already logged something. In
> -            * StandbyMode that only happens if we have been triggered, so we
> -            * shouldn't loop anymore in that case.
> +            * If we get here for other than end-of-wal, emit the error
> +            * message right now. Otherwise the message if any is shown as a
> +            * part of the end-of-WAL message below.
>              */
> 
> For consistency, I think we can replace "end-of-wal" with
> "end-of-WAL". Please note that everywhere else in the comments you
> have used "end-of-WAL". So why not the same here?

Right.  Fixed.

> ==
> 
>                             ereport(LOG,
> -                                   (errmsg("replication terminated by
> primary server"),
> -                                    errdetail("End of WAL reached on
> timeline %u at %X/%X.",
> -                                              startpointTLI,
> -
> LSN_FORMAT_ARGS(LogstreamResult.Write))));
> +                                   (errmsg("replication terminated by
> primary server on timeline %u at %X/%X.",
> +                                           startpointTLI,
> +
> LSN_FORMAT_ARGS(LogstreamResult.Write))));
> 
> Is this change really required? I don't see any issue with the
> existing error message.

Without the change, we see two similar end-of-WAL messages from both
walreceiver and startup.  (Please don't care about the slight
difference of LSNs..)

[walreceiver] LOG:  replication terminated by primary server
[walreceiver] DETAIL:  End of WAL reached on timeline 1 at 0/B0000D8.
[startup] LOG:  reached end of WAL at 0/B000060 on timeline 1 in archive during standby mode
[startup] DETAIL:  empty record found at 0/B0000D8

But what the walreceiver detected at the time is not End-of-WAL but an
error on the streaming connection.  Since this patch makes startup
process to detect End-of-WAL, we don't need the duplicate and
in-a-sense false end-of-WAL message from walreceiver.

# By the way, I deliberately choosed to report the LSN of last
# successfully record in the "reached end of WAL" message. On second
# thought about this choice, I came to think that it is better to report
# the failure LSN.  I changed it to report the failure LSN.  In this
# case we face an ambiguity according to how we failed to read the
# record but for now we have no choice than blindly choosing one of
# them.  I choosed EndRecPtr since I think decode error happens quite
# rarely than read errors.

[walreceiver] LOG:  replication terminated by primary server at 0/B014228 on timeline 1.
[startup] LOG:  reached end of WAL at 0/B014228 on timeline 1 in archive during standby mode
[startup] DETAIL:  empty record found at 0/B014228

This is the reason for the change.


> Lastly, are we also planning to backport this patch?

This is apparent a behavioral change, not a bug fix, which I think we
regard as not appropriate for back-patching.


As the result, I made the following chages in the version 11.

1. Changed the condition for the "end-of-WAL" message from
   emode_for_corrupt_record to the EndOfWAL flag.

2. Corrected the wording of end-of-wal to end-of-WAL.

3. In the "reached end of WAL" message, report the LSN of the
  beginning of failed record instead of the beginning of the
  last-succeeded record.

4. In the changed message in walreceiver.c, I swapped LSN and timeline
  so that they are in the same order with other similar messages.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From e07c1501cd0020f2a817dd9544c4aa5063e29685 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v11] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlog.c         |  93 +++++++++++++-----
 src/backend/access/transam/xlogreader.c   |  78 +++++++++++++++
 src/backend/replication/walreceiver.c     |   7 +-
 src/bin/pg_waldump/pg_waldump.c           |  13 ++-
 src/include/access/xlogreader.h           |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 110 +++++++++++++++++++++-
 6 files changed, 271 insertions(+), 31 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 958220c495..618f33d342 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4480,6 +4480,7 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
     for (;;)
     {
         char       *errormsg;
+        XLogRecPtr    ErrRecPtr = InvalidXLogRecPtr;
 
         record = XLogReadRecord(xlogreader, &errormsg);
         if (record == NULL)
@@ -4495,6 +4496,18 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
             {
                 abortedRecPtr = xlogreader->abortedRecPtr;
                 missingContrecPtr = xlogreader->missingContrecPtr;
+                ErrRecPtr = abortedRecPtr;
+            }
+            else
+            {
+                /*
+                 * EndRecPtr is the LSN we tried to read but failed. In the
+                 * case of decoding error, it is at the end of the failed
+                 * record but we don't have a means for now to know EndRecPtr
+                 * is pointing to which of the beginning or ending of the
+                 * failed record.
+                 */
+                ErrRecPtr = xlogreader->EndRecPtr;
             }
 
             if (readFile >= 0)
@@ -4504,13 +4517,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
             }
 
             /*
-             * We only end up here without a message when XLogPageRead()
-             * failed - in that case we already logged something. In
-             * StandbyMode that only happens if we have been triggered, so we
-             * shouldn't loop anymore in that case.
+             * If we get here for other than end-of-WAL, emit the error
+             * message right now. Otherwise the message if any is shown as a
+             * part of the end-of-WAL message below.
              */
-            if (errormsg)
-                ereport(emode_for_corrupt_record(emode, xlogreader->EndRecPtr),
+            if (!xlogreader->EndOfWAL && errormsg)
+                ereport(emode_for_corrupt_record(emode, ErrRecPtr),
                         (errmsg_internal("%s", errormsg) /* already translated */ ));
         }
 
@@ -4541,11 +4553,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
             /* Great, got a record */
             return record;
         }
-        else
+
+        /* No valid record available from this source */
+        lastSourceFailed = true;
+
+        if (!fetching_ckpt)
         {
-            /* No valid record available from this source */
-            lastSourceFailed = true;
-
             /*
              * If archive recovery was requested, but we were still doing
              * crash recovery, switch to archive recovery and retry using the
@@ -4558,11 +4571,17 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
              * we'd have no idea how far we'd have to replay to reach
              * consistency.  So err on the safe side and give up.
              */
-            if (!InArchiveRecovery && ArchiveRecoveryRequested &&
-                !fetching_ckpt)
+            if (!InArchiveRecovery && ArchiveRecoveryRequested)
             {
+                /*
+                 * We don't report this as LOG, since we don't stop recovery
+                 * here
+                 */
                 ereport(DEBUG1,
-                        (errmsg_internal("reached end of WAL in pg_wal, entering archive recovery")));
+                        (errmsg_internal("reached end of WAL at %X/%X on timeline %u in %s during crash recovery,
enteringarchive recovery",
 
+                                         LSN_FORMAT_ARGS(ErrRecPtr),
+                                         replayTLI,
+                                         xlogSourceNames[currentSource])));
                 InArchiveRecovery = true;
                 if (StandbyModeRequested)
                     StandbyMode = true;
@@ -4610,12 +4629,33 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
                 continue;
             }
 
-            /* In standby mode, loop back to retry. Otherwise, give up. */
-            if (StandbyMode && !CheckForStandbyTrigger())
-                continue;
-            else
-                return NULL;
+            /*
+             * If we haven't emit an error message, we have safely reached the
+             * end-of-WAL.
+             */
+            if (xlogreader->EndOfWAL)
+            {
+                char       *fmt;
+
+                if (StandbyMode)
+                    fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u in %s during standby mode");
+                else if (InArchiveRecovery)
+                    fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u in %s during archive recovery");
+                else
+                    fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u in %s during crash recovery");
+
+                ereport(LOG,
+                        (errmsg(fmt, LSN_FORMAT_ARGS(ErrRecPtr), replayTLI,
+                                xlogSourceNames[currentSource]),
+                         (errormsg ? errdetail_internal("%s", errormsg) : 0)));
+            }
         }
+
+        /* In standby mode, loop back to retry. Otherwise, give up. */
+        if (StandbyMode && !CheckForStandbyTrigger())
+            continue;
+        else
+            return NULL;
     }
 }
 
@@ -7294,7 +7334,7 @@ StartupXLOG(void)
         {
             ereport(LOG,
                     (errmsg("database system was not properly shut down; "
-                            "automatic recovery in progress")));
+                            "crash recovery in progress")));
             if (recoveryTargetTLI > ControlFile->checkPointCopy.ThisTimeLineID)
                 ereport(LOG,
                         (errmsg("crash recovery starts in timeline %u "
@@ -7544,7 +7584,7 @@ StartupXLOG(void)
         else
         {
             /* just have to read next record after CheckPoint */
-            record = ReadRecord(xlogreader, LOG, false, replayTLI);
+            record = ReadRecord(xlogreader, WARNING, false, replayTLI);
         }
 
         if (record != NULL)
@@ -7782,7 +7822,7 @@ StartupXLOG(void)
                 }
 
                 /* Else, try to fetch the next WAL record */
-                record = ReadRecord(xlogreader, LOG, false, replayTLI);
+                record = ReadRecord(xlogreader, WARNING, false, replayTLI);
             } while (record != NULL);
 
             /*
@@ -7842,13 +7882,20 @@ StartupXLOG(void)
 
             InRedo = false;
         }
-        else
+        else if (xlogreader->EndOfWAL)
         {
             /* there are no WAL records following the checkpoint */
             ereport(LOG,
                     (errmsg("redo is not required")));
 
         }
+        else
+        {
+            /* broken record found */
+            ereport(WARNING,
+                    (errmsg("redo is skipped"),
+                     errhint("This suggests WAL file corruption. You might need to check the database.")));
+        }
 
         /*
          * This check is intentionally after the above log messages that
@@ -13097,7 +13144,7 @@ emode_for_corrupt_record(int emode, XLogRecPtr RecPtr)
 {
     static XLogRecPtr lastComplaint = 0;
 
-    if (readSource == XLOG_FROM_PG_WAL && emode == LOG)
+    if (currentSource == XLOG_FROM_PG_WAL && emode <= WARNING)
     {
         if (RecPtr == lastComplaint)
             emode = DEBUG1;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 35029cf97d..03a8b42f15 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -121,6 +121,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
         pfree(state);
         return NULL;
     }
+    state->EndOfWAL = false;
     state->errormsg_buf[0] = '\0';
 
     /*
@@ -292,6 +293,7 @@ XLogReadRecord(XLogReaderState *state, char **errormsg)
     /* reset error state */
     *errormsg = NULL;
     state->errormsg_buf[0] = '\0';
+    state->EndOfWAL = false;
 
     ResetDecoder(state);
     state->abortedRecPtr = InvalidXLogRecPtr;
@@ -588,6 +590,15 @@ err:
          */
         state->abortedRecPtr = RecPtr;
         state->missingContrecPtr = targetPagePtr;
+
+        /*
+         * If the message is not set yet, that means we failed to load the
+         * page for the record.  Otherwise do not hide the existing message.
+         */
+        if (state->errormsg_buf[0] == '\0')
+            report_invalid_record(state,
+                                  "missing contrecord at %X/%X",
+                                  LSN_FORMAT_ARGS(RecPtr));
     }
 
     /*
@@ -730,6 +741,40 @@ ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
                       XLogRecPtr PrevRecPtr, XLogRecord *record,
                       bool randAccess)
 {
+    if (record->xl_tot_len == 0)
+    {
+        /*
+         * We are almost sure reaching the end of WAL, make sure that the
+         * whole page after the record is filled with zeroes.
+         */
+        char       *p;
+        char       *pe;
+
+        /* scan from the beginning of the record to the end of block */
+        p = (char *) record;
+        pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
+
+        while (*p == 0 && p < pe)
+            p++;
+
+        if (p == pe)
+        {
+            /*
+             * The page after the record is completely zeroed. That suggests
+             * we don't have a record after this point. We don't bother
+             * checking the pages after since they are not zeroed in the case
+             * of recycled segments.
+             */
+            report_invalid_record(state, "empty record found at %X/%X",
+                                  LSN_FORMAT_ARGS(RecPtr));
+
+            /* notify end-of-wal to callers */
+            state->EndOfWAL = true;
+            return false;
+        }
+
+        /* The same condition will be caught as invalid record length */
+    }
     if (record->xl_tot_len < SizeOfXLogRecord)
     {
         report_invalid_record(state,
@@ -836,6 +881,31 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 
     XLogSegNoOffsetToRecPtr(segno, offset, state->segcxt.ws_segsize, recaddr);
 
+    StaticAssertStmt(XLOG_PAGE_MAGIC != 0, "XLOG_PAGE_MAGIC is zero");
+
+    if (hdr->xlp_magic == 0)
+    {
+        /* Regard an empty page as End-Of-WAL */
+        int            i;
+
+        for (i = 0; i < XLOG_BLCKSZ && phdr[i] == 0; i++);
+        if (i == XLOG_BLCKSZ)
+        {
+            char        fname[MAXFNAMELEN];
+
+            XLogFileName(fname, state->seg.ws_tli, segno,
+                         state->segcxt.ws_segsize);
+
+            report_invalid_record(state,
+                                  "empty page in log segment %s, offset %u",
+                                  fname,
+                                  offset);
+            state->EndOfWAL = true;
+            return false;
+        }
+
+        /* The same condition will be caught as invalid magic number */
+    }
     if (hdr->xlp_magic != XLOG_PAGE_MAGIC)
     {
         char        fname[MAXFNAMELEN];
@@ -921,6 +991,14 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
                               LSN_FORMAT_ARGS(hdr->xlp_pageaddr),
                               fname,
                               offset);
+
+        /*
+         * If the page address is less than expected we assume it is an unused
+         * page in a recycled segment.
+         */
+        if (hdr->xlp_pageaddr < recaddr)
+            state->EndOfWAL = true;
+
         return false;
     }
 
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index b39fce8c23..1a7a692bc0 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -471,10 +471,9 @@ WalReceiverMain(void)
                         else if (len < 0)
                         {
                             ereport(LOG,
-                                    (errmsg("replication terminated by primary server"),
-                                     errdetail("End of WAL reached on timeline %u at %X/%X.",
-                                               startpointTLI,
-                                               LSN_FORMAT_ARGS(LogstreamResult.Write))));
+                                    (errmsg("replication terminated by primary server at %X/%X on timeline %u.",
+                                            LSN_FORMAT_ARGS(LogstreamResult.Write),
+                                            startpointTLI)));
                             endofwal = true;
                             break;
                         }
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a6251e1a96..3745e76488 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -1176,9 +1176,16 @@ main(int argc, char **argv)
         exit(0);
 
     if (errormsg)
-        fatal_error("error in WAL record at %X/%X: %s",
-                    LSN_FORMAT_ARGS(xlogreader_state->ReadRecPtr),
-                    errormsg);
+    {
+        if (xlogreader_state->EndOfWAL)
+            pg_log_info("end of WAL at %X/%X: %s",
+                        LSN_FORMAT_ARGS(xlogreader_state->EndRecPtr),
+                        errormsg);
+        else
+            fatal_error("error in WAL record at %X/%X: %s",
+                        LSN_FORMAT_ARGS(xlogreader_state->EndRecPtr),
+                        errormsg);
+    }
 
     XLogReaderFree(xlogreader_state);
 
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index 477f0efe26..3eeba220a1 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -174,6 +174,7 @@ struct XLogReaderState
      */
     XLogRecPtr    ReadRecPtr;        /* start of last record read */
     XLogRecPtr    EndRecPtr;        /* end+1 of last record read */
+    bool        EndOfWAL;        /* the last attempt was EOW? */
 
     /*
      * Set at the end of recovery: the start point of a partial record at the
diff --git a/src/test/recovery/t/011_crash_recovery.pl b/src/test/recovery/t/011_crash_recovery.pl
index 3892aba3e5..67d264df26 100644
--- a/src/test/recovery/t/011_crash_recovery.pl
+++ b/src/test/recovery/t/011_crash_recovery.pl
@@ -10,9 +10,11 @@ use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
 use Config;
+use IPC::Run;
 
-plan tests => 3;
+plan tests => 11;
 
+my $reached_eow_pat = "reached end of WAL at ";
 my $node = PostgreSQL::Test::Cluster->new('primary');
 $node->init(allows_streaming => 1);
 $node->start;
@@ -50,7 +52,15 @@ is($node->safe_psql('postgres', qq[SELECT pg_xact_status('$xid');]),
 
 # Crash and restart the postmaster
 $node->stop('immediate');
+my $logstart = get_log_size($node);
 $node->start;
+my $max_attempts = 360;
+while ($max_attempts-- >= 0)
+{
+    last if (find_in_log($node, $reached_eow_pat, $logstart));
+    sleep 0.5;
+}
+ok ($max_attempts >= 0, "end-of-wal is logged");
 
 # Make sure we really got a new xid
 cmp_ok($node->safe_psql('postgres', 'SELECT pg_current_xact_id()'),
@@ -62,3 +72,101 @@ is($node->safe_psql('postgres', qq[SELECT pg_xact_status('$xid');]),
 
 $stdin .= "\\q\n";
 $tx->finish;    # wait for psql to quit gracefully
+
+my $segsize = $node->safe_psql('postgres',
+       qq[SELECT setting FROM pg_settings WHERE name = 'wal_segment_size';]);
+
+# make sure no records afterwards go to the next segment
+$node->safe_psql('postgres', qq[
+                 SELECT pg_switch_wal();
+                 CHECKPOINT;
+                 CREATE TABLE t();
+]);
+$node->stop('immediate');
+
+# identify REDO WAL file
+my $cmd = "pg_controldata -D " . $node->data_dir();
+my $chkptfile;
+$cmd = ['pg_controldata', '-D', $node->data_dir()];
+$stdout = '';
+$stderr = '';
+IPC::Run::run $cmd, '>', \$stdout, '2>', \$stderr;
+ok($stdout =~ /^Latest checkpoint's REDO WAL file:[ \t] *(.+)$/m,
+   "checkpoint file is identified");
+my $chkptfile = $1;
+
+# identify the last record
+my $walfile = $node->data_dir() . "/pg_wal/$chkptfile";
+$cmd = ['pg_waldump', $walfile];
+$stdout = '';
+$stderr = '';
+my $lastrec;
+IPC::Run::run $cmd, '>', \$stdout, '2>', \$stderr;
+foreach my $l (split(/\r?\n/, $stdout))
+{
+    $lastrec = $l;
+}
+ok(defined $lastrec, "last WAL record is extracted");
+ok($stderr =~ /end of WAL at ([0-9A-F\/]+): .* at \g1/,
+   "pg_waldump emits the correct ending message");
+
+# read the last record LSN excluding leading zeroes
+ok ($lastrec =~ /, lsn: 0\/0*([1-9A-F][0-9A-F]+),/,
+    "LSN of the last record identified");
+my $lastlsn = $1;
+
+# corrupt the last record
+my $offset = hex($lastlsn) % $segsize;
+open(my $segf, '+<', $walfile) or die "failed to open $walfile\n";
+seek($segf, $offset, 0);  # halfway break the last record
+print $segf "\0\0\0\0";
+close($segf);
+
+# pg_waldump complains about the corrupted record
+$stdout = '';
+$stderr = '';
+IPC::Run::run $cmd, '>', \$stdout, '2>', \$stderr;
+ok($stderr =~ /fatal: error in WAL record at 0\/$lastlsn: .* at 0\/$lastlsn/,
+   "pg_waldump emits the correct error message");
+
+# also server complains
+$logstart = get_log_size($node);
+$node->start;
+$max_attempts = 360;
+while ($max_attempts-- >= 0)
+{
+    last if (find_in_log($node, "WARNING:  invalid record length at 0/$lastlsn: wanted [0-9]+, got 0",
+                         $logstart));
+    sleep 0.5;
+}
+ok($max_attempts >= 0, "header error is logged at $lastlsn");
+
+# and the end-of-wal messages shouldn't be seen
+# the same message has been confirmed in the past
+ok(!find_in_log($node, $reached_eow_pat, $logstart),
+   "false log message is not emitted");
+
+$node->stop('immediate');
+
+#### helper routines
+# return the size of logfile of $node in bytes
+sub get_log_size
+{
+    my ($node) = @_;
+
+    return (stat $node->logfile)[7];
+}
+
+# find $pat in logfile of $node after $off-th byte
+sub find_in_log
+{
+    my ($node, $pat, $off) = @_;
+
+    $off = 0 unless defined $off;
+    my $log = PostgreSQL::Test::Utils::slurp_file($node->logfile);
+    return 0 if (length($log) <= $off);
+
+    $log = substr($log, $off);
+
+    return $log =~ m/$pat/;
+}
-- 
2.27.0


pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: decoupling table and index vacuum
Next
From: Peter Smith
Date:
Subject: Re: row filtering for logical replication