Re: standby promotion can create unreadable WAL - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: standby promotion can create unreadable WAL
Date
Msg-id CAFiTN-v_0PYOEdO6X2SRyJ4JQS05RPcpC7FrSmVDdNatsZkaKQ@mail.gmail.com
Whole thread Raw
In response to standby promotion can create unreadable WAL  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: standby promotion can create unreadable WAL
Re: standby promotion can create unreadable WAL
List pgsql-hackers
On Tue, Aug 23, 2022 at 12:06 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
However, if anything
> did try to look at file #4 it would get confused. Maybe that can
> happen if this is a streaming standby, where we only write an
> end-of-recovery record upon promotion, rather than a checkpoint, or
> maybe if there are cascading standbys someone could try to actually
> use the 000000020000000000000004 file for something. I'm not sure. But
> unless I'm missing something, that file is bogus, and our only hope of
> not having problems is that perhaps no one will ever look at it.

I tried in streaming mode, but it seems in the streaming mode we will
never create this bogus file because of this check [1].  So if the
StandbyMode is true then we are never setting "abortedRecPtr" and
"missingContrecPtr" which means we will never create that 0-filled gap
in the WAL file that we are discussing in this thread.

Do we need to set it?  I feel we don't.  Why? because on this thread
we are also discussing that if the timeline switches then we don’t
need to create that 0-filled gap and that is the actual problem we are
discussing here.  And we know that if we are coming out of the
StandbyMode then we will always switch the timeline so we don’t create
that 0-filled gap.  OTOH if we are coming out of the archive recovery
then also we will switch the timeline so in that case also we do not
need that.  So practically we need to 0 fill that partial record only
when we are doing the crash recovery is that understanding correct?
If so then we can simply avoid setting these variables if
ArchiveRecoveryRequested is true.  So in the below check[1] instead of
(!StandbyMode), we can just put (! ArchiveRecoveryRequested), and then
we don't need any other fix.  Am I missing anything?

[1]
ReadRecord{
..record = XLogPrefetcherReadRecord(xlogprefetcher, &errormsg);
        if (record == NULL)
        {
            /*
             * When not in standby mode we find that WAL ends in an incomplete
             * record, keep track of that record.  After recovery is done,
             * we’ll write a record to indicate to downstream WAL readers that
             * that portion is to be ignored.
             */
            if (!StandbyMode &&
                !XLogRecPtrIsInvalid(xlogreader->abortedRecPtr))
            {
                abortedRecPtr = xlogreader->abortedRecPtr;
                missingContrecPtr = xlogreader->missingContrecPtr;
            }

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Andrey Lepikhov
Date:
Subject: Re: Removing unneeded self joins
Next
From: Dave Page
Date:
Subject: Re: Tracking last scan time