Re: standby promotion can create unreadable WAL - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: standby promotion can create unreadable WAL
Date
Msg-id CAFiTN-t7umki=PK8dT1tcPV=mOUe2vNhHML6b3T7W7qqvvajjg@mail.gmail.com
In response to standby promotion can create unreadable WAL  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: standby promotion can create unreadable WAL
List pgsql-hackers
On Tue, Aug 23, 2022 at 12:06 AM Robert Haas <robertmhaas@gmail.com> wrote:

> Nothing that uses xlogreader is going to be able to bridge the gap
> between file #4 and file #5. In this case it doesn't matter very much,
> because we immediately write a checkpoint record into file #5, so if
> we crash we won't try to replay file #4 anyway. However, if anything
> did try to look at file #4 it would get confused. Maybe that can
> happen if this is a streaming standby, where we only write an
> end-of-recovery record upon promotion, rather than a checkpoint, or
> maybe if there are cascading standbys someone could try to actually
> use the 000000020000000000000004 file for something. I'm not sure. But
> unless I'm missing something, that file is bogus, and our only hope of
> not having problems is that perhaps no one will ever look at it.

Yeah, this analysis looks correct to me.

> I think that the cause of this problem is this code right here:
>
>     /*
>      * Actually, if WAL ended in an incomplete record, skip the parts that
>      * made it through and start writing after the portion that persisted.
>      * (It's critical to first write an OVERWRITE_CONTRECORD message, which
>      * we'll do as soon as we're open for writing new WAL.)
>      */
>     if (!XLogRecPtrIsInvalid(missingContrecPtr))
>     {
>         Assert(!XLogRecPtrIsInvalid(abortedRecPtr));
>         EndOfLog = missingContrecPtr;
>     }

Yeah, this statement, as well as the other statement that creates the
overwrite contrecord.  After changing these two conditions, the problem
is fixed for me, although I haven't yet thought through all the
scenarios to confirm that it is safe in every case.  I agree that after
a timeline change we are pointing to the end of the last valid record,
so we can start writing the next record from that point onward.  But I
think we need to think hard about whether this will break any case for
which the overwrite contrecord was actually introduced.

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7602fc8..3d38613 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5491,7 +5491,7 @@ StartupXLOG(void)
         * (It's critical to first write an OVERWRITE_CONTRECORD message, which
         * we'll do as soon as we're open for writing new WAL.)
         */
-       if (!XLogRecPtrIsInvalid(missingContrecPtr))
+       if (newTLI == endOfRecoveryInfo->lastRecTLI && !XLogRecPtrIsInvalid(missingContrecPtr))
        {
                Assert(!XLogRecPtrIsInvalid(abortedRecPtr));
                EndOfLog = missingContrecPtr;
@@ -5589,7 +5589,7 @@ StartupXLOG(void)
        LocalSetXLogInsertAllowed();

        /* If necessary, write overwrite-contrecord before doing anything else */
-       if (!XLogRecPtrIsInvalid(abortedRecPtr))
+       if (newTLI == endOfRecoveryInfo->lastRecTLI && !XLogRecPtrIsInvalid(abortedRecPtr))
        {
                Assert(!XLogRecPtrIsInvalid(missingContrecPtr));
                CreateOverwriteContrecordRecord(abortedRecPtr, missingContrecPtr, newTLI);
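
To make the intent of the two changed conditions easier to reason about,
here is a minimal standalone sketch of the decision. It is not PostgreSQL
code: choose_end_of_log() is a hypothetical helper (in the patch the logic
is inline in StartupXLOG()), the typedefs are simplified stand-ins for the
real ones, and the LSN values used with it are made up. It only illustrates
that the skip-ahead to missingContrecPtr happens when the new timeline is
the same as the timeline of the last record, and not after a timeline switch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL definitions. */
typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;

#define InvalidXLogRecPtr       ((XLogRecPtr) 0)
#define XLogRecPtrIsInvalid(r)  ((r) == InvalidXLogRecPtr)

/*
 * Hypothetical helper mirroring the patched condition.
 *
 * endOfLog is the end of the last valid record.  If WAL ended in an
 * incomplete record AND we stay on the same timeline, skip to
 * missingContrecPtr (an overwrite contrecord will be written there).
 * If promotion switched timelines, keep writing from endOfLog.
 */
static XLogRecPtr
choose_end_of_log(XLogRecPtr endOfLog,
                  TimeLineID newTLI,
                  TimeLineID lastRecTLI,
                  XLogRecPtr missingContrecPtr,
                  XLogRecPtr abortedRecPtr)
{
    if (newTLI == lastRecTLI && !XLogRecPtrIsInvalid(missingContrecPtr))
    {
        /* Same invariant as the Assert in the original code. */
        assert(!XLogRecPtrIsInvalid(abortedRecPtr));
        return missingContrecPtr;
    }
    return endOfLog;
}
```

With this sketch, a call with newTLI = 2 and lastRecTLI = 1 returns endOfLog
unchanged even when missingContrecPtr is set, which is the behavior the patch
is aiming for; whether that is safe in every scenario is exactly the open
question above.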

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


