On 2021-Jul-28, Bossart, Nathan wrote:
> On 7/27/21, 6:05 PM, "Alvaro Herrera" <alvherre@alvh.no-ip.org> wrote:
> > I'm not sure I understand what's the reason not to store this value in
> > pg_control; I feel like I'm missing something. Can you please explain?
>
> Thanks for taking a look.
>
> The only reason I can think of is that it could make back-patching
> difficult. I don't mind working on a version of the patch that uses
> pg_control. Back-patching this fix might be a stretch, anyway.

Hmm ... I'm not sure we're prepared to backpatch this kind of change.
It seems a bit too disruptive to how replay works. I think we should be
focusing solely on patch 0001 to surgically fix the precise bug you see.
Does patch 0002 exist because you think that a system with only 0001
will not correctly deal with a crash at the right time?

Now, the reason I'm looking at this patch series is that we're seeing a
related problem with walsender/walreceiver, which apparently are capable
of creating a file in the replica that ends up not existing in the
primary after a crash, for a reason closely related to what you
describe for WAL archival. I'm not sure what is going on just yet, so
I'm not going to try and explain because I'm likely to get it wrong.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/