Re: Simplify standby state machine a bit in WaitForWALToBecomeAvailable() - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Simplify standby state machine a bit in WaitForWALToBecomeAvailable()
Msg-id CALj2ACVHQstDq4HbahLZ8m16swCMDMNDNKgnU_Jn7_AE9DvG5w@mail.gmail.com
In response to Re: Simplify standby state machine a bit in WaitForWALToBecomeAvailable()  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Simplify standby state machine a bit in WaitForWALToBecomeAvailable()
Re: Simplify standby state machine a bit in WaitForWALToBecomeAvailable()
List pgsql-hackers
On Tue, Jan 3, 2023 at 7:47 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Dec 30, 2022 at 10:32:57AM -0800, Nathan Bossart wrote:
> > This looks correct to me.  The only thing that stood out to me was the loop
> > through 'tles' in XLogFileReadyAnyTLI.  With this change, we'd loop through
> > the timelines for both XLOG_FROM_PG_ARCHIVE and XLOG_FROM_PG_WAL, whereas
> > now we only loop through the timelines once.  However, I doubt this makes
> > much difference in practice.  You'd only do the extra loop whenever
> > restoring from the archives failed.
>
>         case XLOG_FROM_ARCHIVE:
> +
> +           /*
> +            * After failing to read from archive, we try to read from
> +            * pg_wal.
> +            */
> +           currentSource = XLOG_FROM_PG_WAL;
> +           break;
> In standby mode, the priority lookup order is pg_wal -> archive ->
> stream.  With this change, we would do pg_wal -> archive -> pg_wal ->
> stream, meaning that it could influence some recovery scenarios while
> involving more lookups than necessary to the local pg_wal/ directory?
>
> See, on failure where the current source is XLOG_FROM_ARCHIVE, we
> would not switch anymore directly to XLOG_FROM_STREAM.

I think there's a bit of a disconnect here. Here's what I understand:

When a standby starts, it either enters crash recovery (if it is
restarting after a crash) or enters archive recovery directly.

The standby, when in crash recovery:
currentSource is set to XLOG_FROM_PG_WAL in
WaitForWALToBecomeAvailable(), and the standby replays all the WAL
available in the pg_wal directory.
Once the WAL in pg_wal is exhausted during crash recovery,
currentSource is set to XLOG_FROM_ANY in ReadRecord() and the standby
enters archive recovery mode (see below).

The standby, when in archive recovery:
In WaitForWALToBecomeAvailable(), currentSource is set to
XLOG_FROM_ARCHIVE and the standby calls XLogFileReadAnyTLI(), which
first tries to fetch WAL from the archive and returns if that
succeeds; otherwise it tries to fetch from pg_wal and returns if that
succeeds; otherwise it returns failure.
If XLogFileReadAnyTLI() returns failure, currentSource is changed to
XLOG_FROM_STREAM.
If XLOG_FROM_STREAM fails too, currentSource is set back to
XLOG_FROM_ARCHIVE and XLogFileReadAnyTLI() is called again.

Note that the standby exits this WaitForWALToBecomeAvailable() state
machine when a promotion signal is detected, but only after all the
WAL from the archive and then from pg_wal has been exhausted.

Note that currentSource is set to XLOG_FROM_PG_WAL in
WaitForWALToBecomeAvailable() only after the server exits archive
recovery, i.e. after InArchiveRecovery is set to false in
FinishWalRecovery(). However, reading pg_wal to exhaustion during
recovery is already built into XLogFileReadAnyTLI().

In summary:
The flow when the standby is in crash recovery is pg_wal -> [archive
-> pg_wal -> stream] -> [archive -> pg_wal -> stream] -> [] -> [] ...
The flow when the standby is in archive recovery is [archive -> pg_wal
-> stream] -> [archive -> pg_wal -> stream] -> [] -> [] ...
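
As a rough illustration of the two flows summarized above, here is a
toy C model. It is not the code in xlogrecovery.c: the Source enum and
trace_lookups() are made-up names, and the model assumes every lookup
fails so that the full cycle is visible.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the WAL sources discussed in this mail. */
typedef enum { SRC_ARCHIVE, SRC_PG_WAL, SRC_STREAM } Source;

/*
 * Record the order in which sources are probed, assuming every probe
 * fails.  Writes at most 'max' entries to 'out' and returns the number
 * of entries written.
 */
size_t
trace_lookups(bool start_in_crash_recovery, int cycles,
              Source *out, size_t max)
{
    size_t n = 0;

    /* Crash recovery first replays whatever is in pg_wal. */
    if (start_in_crash_recovery && n < max)
        out[n++] = SRC_PG_WAL;

    /* Archive recovery: repeat [archive -> pg_wal -> stream]. */
    for (int i = 0; i < cycles; i++)
    {
        /* One call in the archive state probes archive, then pg_wal. */
        if (n < max)
            out[n++] = SRC_ARCHIVE;
        if (n < max)
            out[n++] = SRC_PG_WAL;

        /* Both failed, so the next state is streaming. */
        if (n < max)
            out[n++] = SRC_STREAM;

        /* Streaming failed too, so the loop goes back to archive. */
    }

    return n;
}
```

Under those assumptions, trace_lookups(true, 2, buf, 16) fills buf with
pg_wal, archive, pg_wal, stream, archive, pg_wal, stream, and the same
call with false simply omits the leading pg_wal entry.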

The proposed patch takes the switch to pg_wal that currently happens
implicitly inside XLogFileReadAnyTLI() after a failed read from the
archive, and makes it explicit by setting currentSource to
XLOG_FROM_PG_WAL in the state machine. I don't think it alters the
existing state machine or adds any extra lookups in pg_wal.
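
To make that claim concrete, here is a simplified sketch in C, under
the assumption that every lookup fails. The names (WalSource,
order_implicit(), order_explicit()) are invented for illustration and
none of this is taken from the patch or from xlogrecovery.c. One
function models the pg_wal fallback hidden inside the archive state,
the other models it as a state of its own.

```c
#include <stddef.h>

/* Hypothetical model of the sources; not PostgreSQL's XLogSource. */
typedef enum { FROM_ARCHIVE, FROM_PG_WAL, FROM_STREAM } WalSource;

/* Append one probed source to the trace if there is room left. */
static void
record(WalSource s, WalSource *out, size_t max, size_t *n)
{
    if (*n < max)
        out[(*n)++] = s;
}

/* pg_wal fallback happens inside the archive state (implicit). */
size_t
order_implicit(int cycles, WalSource *out, size_t max)
{
    WalSource state = FROM_ARCHIVE;
    size_t    n = 0;
    int       done = 0;

    while (done < cycles)
    {
        if (state == FROM_ARCHIVE)
        {
            record(FROM_ARCHIVE, out, max, &n);
            record(FROM_PG_WAL, out, max, &n);  /* hidden fallback */
            state = FROM_STREAM;
        }
        else
        {
            record(FROM_STREAM, out, max, &n);
            state = FROM_ARCHIVE;
            done++;
        }
    }
    return n;
}

/* pg_wal fallback is a separate state (explicit). */
size_t
order_explicit(int cycles, WalSource *out, size_t max)
{
    WalSource state = FROM_ARCHIVE;
    size_t    n = 0;
    int       done = 0;

    while (done < cycles)
    {
        record(state, out, max, &n);
        switch (state)
        {
            case FROM_ARCHIVE:
                state = FROM_PG_WAL;
                break;
            case FROM_PG_WAL:
                state = FROM_STREAM;
                break;
            case FROM_STREAM:
                state = FROM_ARCHIVE;
                done++;
                break;
        }
    }
    return n;
}
```

In this model both functions produce the same trace for any number of
cycles, which is the sense in which the explicit state should add no
extra pg_wal lookups. Whether the real code behaves the same way
depends on details the model leaves out, such as the timeline loop
Nathan mentioned.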

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


