Re: Simplify standby state machine a bit in WaitForWALToBecomeAvailable() - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Simplify standby state machine a bit in WaitForWALToBecomeAvailable()
Date
Msg-id CALj2ACXLFcAGap=wVUehfM9pwKaZWLhsmdbq2taK=G4OBQR3aQ@mail.gmail.com
Whole thread Raw
In response to Re: Simplify standby state machine a bit in WaitForWALToBecomeAvailable()  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Simplify standby state machine a bit in WaitForWALToBecomeAvailable()  (Michael Paquier <michael@paquier.xyz>)
Re: Simplify standby state machine a bit in WaitForWALToBecomeAvailable()  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
On Wed, Mar 1, 2023 at 1:46 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Jan 03, 2023 at 02:53:10PM +0530, Bharath Rupireddy wrote:
> > The proposed patch makes the inherent state change to pg_wal after
> > failure to read from archive in XLogFileReadAnyTLI() to explicit by
> > setting currentSource to XLOG_FROM_PG_WAL in the state machine. I
> > think it doesn't alter the existing state machine or add any new extra
> > lookups in pg_wal.
>
> Well, did you notice 4d894b41?  It introduced this change:
>
> -               readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2, currentSource);
> +               readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2,
> +                       currentSource == XLOG_FROM_ARCHIVE ? XLOG_FROM_ANY :
> +                                        currentSource);
>
> And this patch basically undoes that, meaning that we would basically
> look at the archives first for all the expected TLIs, but only if no
> files were found in pg_wal/.
>
> The change is subtle, see XLogFileReadAnyTLI().  On HEAD we go through
> each timeline listed and check both archives and then pg_wal/ after
> the last source that failed was the archives.  The patch does
> something different: it checks all the timelines for the archives,
> then all the timelines in pg_wal/ with two separate calls to
> XLogFileReadAnyTLI().

Thanks. Yeah, the patch proposed here just reverts that commit [1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4d894b41cd12179b710526eba9dc62c2b99abc4d.

That commit fixes an issue - "If there is a WAL segment with same ID
but different TLI present in both the WAL archive and pg_xlog, prefer
the one with higher TLI.".

I will withdraw the patch proposed in this thread.

[1]
commit 4d894b41cd12179b710526eba9dc62c2b99abc4d
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date:   Fri Feb 14 15:15:09 2014 +0200

    Change the order that pg_xlog and WAL archive are polled for WAL segments.

    If there is a WAL segment with same ID but different TLI present in both
    the WAL archive and pg_xlog, prefer the one with higher TLI. Before this
    patch, the archive was polled first, for all expected TLIs, and only if no
    file was found was pg_xlog scanned. This was a change in behavior from 9.3,
    which first scanned archive and pg_xlog for the highest TLI, then archive
    and pg_xlog for the next highest TLI and so forth. This patch reverts the
    behavior back to what it was in 9.2.

    The reason for this is that if for example you try to do archive recovery
    to timeline 2, which branched off timeline 1, but the WAL for timeline 2 is
    not archived yet, we would replay past the timeline switch point on
    timeline 1 using the archived files, before even looking timeline 2's files
    in pg_xlog

    Report and patch by Kyotaro Horiguchi. Backpatch to 9.3 where the behavior
    was changed.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Anthonin Bonnefoy
Date:
Subject: Re: Flush SLRU counters in checkpointer process
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher