Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery

From Xuneng Zhou
Subject Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery
Msg-id CABPTF7Wp7YuH9=qyM8ESq1QpGGAkq3=nL+F=WmKv7gHpq1XPWQ@mail.gmail.com
In response to Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery  (Marco Nenciarini <marco.nenciarini@enterprisedb.com>)
List pgsql-hackers
On Tue, Mar 17, 2026 at 5:31 PM Marco Nenciarini
<marco.nenciarini@enterprisedb.com> wrote:
>
> Since this bug dates back to 9.3, the fix will likely need backpatching.
> The v2 patch changes the walrcv_identify_system() signature, which would
> be an ABI break on stable branches (walrcv_identify_system_fn is a
> function pointer in the WalReceiverFunctionsType struct).
>
> Attached is a backpatch-compatible variant that avoids the API change.
> Instead of adding a parameter, libpqrcv_identify_system() stores the
> flush position in a new global variable (WalRcvIdentifySystemLsn), and
> the walreceiver reads it directly.  The fix logic and TAP test are
> otherwise identical.
>
> For master I'd still prefer the v2 approach with the extended signature,
> since it's cleaner and there's no ABI constraint.
>
> Best regards,
> Marco

I think the ABI concern for backpatching is valid, and the proposed
workaround looks reasonable to me. Resetting WalRcvIdentifySystemLsn
before calling walrcv_identify_system() seems like a sensible
defensive move, so I’ve added it in v3. The TAP test has been
updated as well.
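
For readers following along, here is a minimal standalone sketch of the
pattern discussed above: the callback keeps its existing signature,
reports the flush position through a global, and the caller resets that
global before the call, presumably so that a value left over from an
earlier call cannot be mistaken for a fresh one. This is not the patch
itself. The typedefs are simplified stand-ins, and every demo_* /
Demo* name is hypothetical; only WalRcvIdentifySystemLsn is taken from
the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL types (assumption for this sketch). */
typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Side channel named in the thread; the callback fills it in. */
XLogRecPtr WalRcvIdentifySystemLsn = InvalidXLogRecPtr;

/*
 * Hypothetical callback table, reduced to one member. Its layout and the
 * member's signature stay unchanged, which is the point of the workaround.
 */
typedef struct DemoWalReceiverFunctions
{
    const char *(*identify_system) (TimeLineID *primary_tli);
} DemoWalReceiverFunctions;

/* Hypothetical upstream state reported by the mock callback below. */
static XLogRecPtr demo_upstream_flush_lsn = InvalidXLogRecPtr;
static int demo_upstream_reports_lsn = 1;

/* Mock callback: returns a system identifier, stores the LSN in the global. */
static const char *
demo_identify_system(TimeLineID *primary_tli)
{
    *primary_tli = 1;
    if (demo_upstream_reports_lsn)
        WalRcvIdentifySystemLsn = demo_upstream_flush_lsn;
    return "demo-sysid";
}

static const DemoWalReceiverFunctions demo_functions = {demo_identify_system};

/* Caller side: reset the global, invoke the callback, then read the result. */
static XLogRecPtr
demo_fetch_upstream_flush_lsn(const DemoWalReceiverFunctions *funcs)
{
    TimeLineID  tli;

    assert(funcs != NULL && funcs->identify_system != NULL);

    /* Defensive reset: a value from an earlier call must not leak through. */
    WalRcvIdentifySystemLsn = InvalidXLogRecPtr;
    (void) funcs->identify_system(&tli);

    return WalRcvIdentifySystemLsn;
}
```

In this sketch, if the callback does not store a position, the caller
sees InvalidXLogRecPtr rather than whatever the previous call left
behind, which is the case the reset is meant to cover.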

--
Best,
Xuneng

