Re: Switching XLog source from archive to streaming when primary available - Mailing list pgsql-hackers

From John H
Subject Re: Switching XLog source from archive to streaming when primary available
Date
Msg-id CA+-JvFvw4Qz+Jo16cUbAAWsT__nLzg2MHQQYaEQ+xX-AMBnxEQ@mail.gmail.com
In response to Re: Switching XLog source from archive to streaming when primary available  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Switching XLog source from archive to streaming when primary available
List pgsql-hackers
Hi,

I took a brief look at the patch.

From a motivation standpoint, I can see this being useful for
synchronous replicas if you have commit set to flush mode.
So +1 on the feature for easier configurability, although thinking about it
more, you could probably make the restore script smarter and have it return
non-zero exit codes periodically.

The patch needs to be rebased, but I tested it against an older 17 build.

> + ereport(DEBUG1,
> + errmsg_internal("switched WAL source from %s to %s after %s",
> + xlogSourceNames[oldSource],

Not sure if you're intentionally changing this to DEBUG1 from DEBUG2.

> * standby and increase the replication lag on primary.

Do you mean "increase replication lag on standby"?
nit: reading from archive *could* be faster, since in theory
it's not limited to a single process/thread.

> However,
> + * exhaust all the WAL present in pg_wal before switching. If successful,
> + * the state machine moves to XLOG_FROM_STREAM state, otherwise it falls
> + * back to XLOG_FROM_ARCHIVE state.

I think I'm missing how this happens, or what "successful" means. If I'm
reading it right, no matter what happens we will always move to
XLOG_FROM_STREAM based on how the state machine works?

I tested this in a basic RR setup without replication slots (e.g. log
shipping), where the WAL is available in the archive but the primary
always has the WAL rotated out, and with
'streaming_replication_retry_interval = 1'. This leads the RR to become
stuck: it stops fetching from the archive and loops between
XLOG_FROM_PG_WAL and XLOG_FROM_STREAM.

When 'streaming_replication_retry_interval' is breached, we transition
from {currentSource, wal_source_switch_state}

{XLOG_FROM_ARCHIVE, SWITCH_TO_STREAMING_NONE} -> {XLOG_FROM_ARCHIVE,
SWITCH_TO_STREAMING_PENDING} with readFrom = XLOG_FROM_PG_WAL.

That reads the last record successfully in pg_wal and then fails to
read the next one because it doesn't exist, transitioning to

{XLOG_FROM_STREAM, SWITCH_TO_STREAMING_PENDING}.

XLOG_FROM_STREAM fails because the WAL is no longer there on the primary,
so it sets the state back to {XLOG_FROM_ARCHIVE, SWITCH_TO_STREAMING_PENDING}.

> last_fail_time = now;
> currentSource = XLOG_FROM_ARCHIVE;
> break;

Since the state is still SWITCH_TO_STREAMING_PENDING from the previous
loops, it forces:

>                   Assert(currentSource == XLOG_FROM_ARCHIVE);
>                   readFrom = XLOG_FROM_PG_WAL;
>                   ...
>                   readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2, readFrom);

And this readFile call seems to always succeed, since it can read the
latest WAL record but not the next one, which is in the archive. That leads
to a transition back to XLOG_FROM_STREAM, and the cycle repeats.
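
To make the loop above concrete, here is a toy model of it in C. It is only
a sketch of my reading of the transitions, not PostgreSQL code: the enum
names, count_archive_reads(), and the clear_pending_on_stream_failure flag
are all hypothetical. The flag exists only to show the contrast when the
pending state is cleared after a streaming failure; it is not something the
patch does, and I'm not claiming it is the right fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the names in this mail; not PostgreSQL's definitions. */
typedef enum { TOY_FROM_ARCHIVE, TOY_FROM_STREAM } ToySource;
typedef enum { TOY_SWITCH_NONE, TOY_SWITCH_PENDING } ToySwitchState;

/*
 * Run `iterations` passes of the toy state machine and return how many
 * times the archive was actually read.
 *
 * Assumptions (from the test setup described above): the retry interval
 * is always breached after one archive read, the next segment is never in
 * pg_wal, and streaming always fails because the primary rotated the WAL.
 */
int
count_archive_reads(int iterations, bool clear_pending_on_stream_failure)
{
    ToySource currentSource = TOY_FROM_ARCHIVE;
    ToySwitchState switchState = TOY_SWITCH_NONE;
    int archiveReads = 0;

    for (int i = 0; i < iterations; i++)
    {
        if (currentSource == TOY_FROM_ARCHIVE)
        {
            if (switchState == TOY_SWITCH_NONE)
            {
                /* Normal archive fetch; then the retry interval is breached. */
                archiveReads++;
                switchState = TOY_SWITCH_PENDING;
            }
            else
            {
                /*
                 * PENDING forces readFrom = pg_wal. The last record reads
                 * fine, the next one is missing, so we move to streaming
                 * without ever touching the archive.
                 */
                currentSource = TOY_FROM_STREAM;
            }
        }
        else
        {
            /* Streaming fails: fall back to archive as current source. */
            assert(currentSource == TOY_FROM_STREAM);
            currentSource = TOY_FROM_ARCHIVE;
            if (clear_pending_on_stream_failure)
                switchState = TOY_SWITCH_NONE;  /* hypothetical contrast */
        }
    }
    return archiveReads;
}
```

Calling count_archive_reads(30, false) models the stuck case: after the
first fetch the archive is never read again, however many iterations run.
With the flag set to true, an archive read happens on every third pass.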


>  /*
>  * Nope, not found in archive or pg_wal.
> */
> lastSourceFailed = true;

I don't think this gets triggered for the XLOG_FROM_PG_WAL case, which
means the safety check you added doesn't actually kick in.

>               if (wal_source_switch_state == SWITCH_TO_STREAMING_PENDING)
>               {
>                     wal_source_switch_state = SWITCH_TO_STREAMING;
>                     elog(LOG, "SWITCH_TO_STREAMING_PENDING TO SWITCH_TO_STREAMING");
>                }

Thanks
-- 
John Hsu - Amazon Web Services


