RE: Fix slot synchronization with two_phase decoding enabled - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: Fix slot synchronization with two_phase decoding enabled
Date
Msg-id OS0PR01MB5716ECCE98DD9B185ADA8C5894AE2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Fix slot synchronization with two_phase decoding enabled  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Fix slot synchronization with two_phase decoding enabled
List pgsql-hackers
On Thu, Apr 3, 2025 at 3:30 AM Masahiko Sawada wrote:

> 
> On Wed, Apr 2, 2025 at 6:33 AM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Wed, Apr 2, 2025 at 3:45 PM Masahiko Sawada wrote:
> >
> > Hi,
> >
> > >
> > > On Mon, Mar 31, 2025 at 4:3 AM Zhijie Hou (Fujitsu)
> > > <houzj.fnst@fujitsu.com> wrote:
> > > > After further analysis, I was able to reproduce the same issue [1] in
> > > > PG 17.
> > > >
> > > > [1]
> > > > - pub: created a slot 'sub' with two_phase=false, then prepared a
> > > > transaction
> > > > - pub: after some activity, advanced the confirmed_flush_lsn of 'sub', so
> it is
> > > >   greater than prepared txn lsn.
> > > > - sub: create subscription with (slot_name='sub', create_slot=false,
> > > > failover =  true, two_phase=true, copy_data=false); two_phase_at will
> > > > be set to the same  as confirmed_flush_lsn which is greater than the
> > > prepared transaction.
> > > > - stop the primary and promote the standby.
> > > > - commit the prepared transaction on standby, the following error will be
> > > >   reported on subscriber:
> > >
> > > It seems to require elaborate steps to reproduce this issue in v17. I wonder
> if we
> > > could somehow narrow down the cases that we want to prohibit. The patch
> for
> > > v17 disallows CREATE SUBSCRIPTION to enable both two_phase and
> failover,
> > > but I guess that it's still safe if it also creates the replication slot (e.g.,
> > > create_slot is true). If my understanding is right, we can allow users to
> specify
> > > both fields if CRETE SUBSCRIPTION creates the slot, and we don't need to
> > > disallow that in ReplicationSlotCreate().
> >
> > Thanks for reviewing the steps.
> >
> > The current reproducer aims for simplicity; however, I think it's possible to
> reproduce
> > the issue even with create_slot = true, although it requires the help of a
> > debugger and additional steps. But as long as there are transactions
> prepared
> > before the two_phase_at position, and they are skipped due to checks in
> > ReorderBufferFinishPrepared() (refer to comments[1] for why we skip
> sending
> > prepared transaction), the issue can be reproduced.
> >
> > For instance, when creating a subscription with (copy_data=true,
> failover=true,
> > two_phase=true), the slot's two_phase setting starts as false and shifts to
> > true after table sync (refer to [2] for related comments). During this period,
> > if a user prepares a transaction where the prepare LSN is less than the
> > two_phase_at, the same problem could happen.
> >
> > Similarly, when setting up a subscription with (copy_data=false,
> failover=true,
> > two_phase=true), although two_phase is initially set to true and we wait for
> > running transactions to finish when building consistent snapshot, a race
> > condition may still exist: If the snapshot state reaches FULL_SNAPSHOT, it
> > won't check running transactions further (see SnapBuildFindSnapshot() for
> > specifics), if a user prepares a transaction at this point, it's possible for
> > the prepare LSN to be less than the LSN marking the snapshot's consistent
> > state, causing the same issue.
> 
> Thank you for the explanation! I agree that the issue happens in these cases.
> 
> As another idea, I wonder if we could somehow defer to make the synced
> slot as 'sync-ready' until we can ensure that the slot doesn't have
> any transactions that are prepared before the point of enabling
> two_phase. For example, when the slotsync worker fetches the remote
> slot, it remembers the confirmed_flush_lsn (say LSN-1) if the local
> slot's two_phase becomes true or the local slot is newly created with
> enabling two_phase, and then it makes the slot 'sync-ready' once it
> confirmed that the slot's restart_lsn passed LSN-1. Does it work?

Thanks for the idea!

We considered a similar approach in [1] to confirm there is no prepared
transactions before two_phase_at, but the issue is that when the two_phase flag
is switched from 'false' to 'true' (as in the case with (copy_data=true,
failover=true, two_phase=true)). In this case, the slot may have already been
marked as sync-ready before the two_phase flag is enabled, as slotsync is
unaware of potential future changes to the two_phase flag. Then the slot have
to revert to sync-not-ready after the two_phase flag change, which could be
difficult for users to understand.

What do you think ?

[1]
https://www.postgresql.org/message-id/OS0PR01MB57161D9BB5409F229564957994AD2%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Best Regards,
Hou zj

pgsql-hackers by date:

Previous
From: "Gurjeet Singh"
Date:
Subject: Re: [PATCH] Re: Proposal to Enable/Disable Index using ALTER INDEX
Next
From: Nathan Bossart
Date:
Subject: Re: Statistics Import and Export