Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery - Mailing list pgsql-hackers

From	Xuneng Zhou
Subject	Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery
Date	March 17 04:04:16
Msg-id	CABPTF7UEudN4OAifnORwX3A0OSeZaAA5i0xDRTj97NCuiQMCyg@mail.gmail.com
In response to	Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery (Marco Nenciarini <marco.nenciarini@enterprisedb.com>)
Responses	Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery
List	pgsql-hackers

Hi,

Thanks for the patch.

On Tue, Mar 17, 2026 at 5:49 AM Marco Nenciarini
<marco.nenciarini@enterprisedb.com> wrote:
>
> Attached is a v2 patch that implements the "handshake clamp" approach
> Xuneng suggested.  Rather than tracking lastStreamedFlush in
> process-local state (which doesn't survive a cascade restart, as
> Fujii-san demonstrated), it uses the WAL flush position already
> returned by IDENTIFY_SYSTEM.
>
> The walreceiver now checks the upstream's flush position before issuing
> START_REPLICATION.  If the requested startpoint is ahead (on the same
> timeline), it waits for wal_retrieve_retry_interval and retries.  This
> works across restarts since it queries the upstream's live position on
> every connection attempt, and requires no new state variables.
>
> When timelines differ, we let START_REPLICATION handle the timeline
> negotiation as before.
>
> The patch includes a TAP test (053_cascade_reconnect.pl) that
> reproduces the scenario and verifies the fix.
>

I haven’t looked into it in detail yet, but it looks good overall.
I’ll test it further and verify that the issue has been resolved.
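
As I read the description, the check made before START_REPLICATION
amounts to something like the sketch below. This is only my reading of
the text above, not code from the patch: the struct, the function name
and the simplified typedefs are hypothetical stand-ins, and the real
walreceiver code will differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's WAL position and timeline types. */
typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;

/* Hypothetical holder for the values reported by IDENTIFY_SYSTEM. */
typedef struct UpstreamIdentity
{
    TimeLineID  tli;        /* upstream's current timeline */
    XLogRecPtr  flush_lsn;  /* upstream's current WAL flush position */
} UpstreamIdentity;

/*
 * Return true if the walreceiver should sleep (for
 * wal_retrieve_retry_interval) and retry instead of issuing
 * START_REPLICATION right away.
 */
static bool
must_wait_for_upstream(const UpstreamIdentity *up,
                       TimeLineID start_tli,
                       XLogRecPtr startpoint)
{
    assert(up != NULL);

    /* Different timelines: leave the negotiation to START_REPLICATION. */
    if (up->tli != start_tli)
        return false;

    /* Same timeline: wait only while we are ahead of the upstream's flush. */
    return startpoint > up->flush_lsn;
}
```

Because the upstream's position is fetched on every connection attempt,
nothing here needs to be remembered across a restart.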

--
Best,
Xuneng
