From shveta malik
Subject Re: Fix slot synchronization with two_phase decoding enabled
Msg-id CAJpy0uDuGz0B+zACCiWt9GOGA2jH_+r_zM=9c2h2MKacd+qYkQ@mail.gmail.com
In response to Fix slot synchronization with two_phase decoding enabled  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
List pgsql-hackers
On Tue, Jun 17, 2025 at 5:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jun 11, 2025 at 9:50 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Jun 11, 2025 at 11:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > >
> > > BTW have we addressed the point Amit mentioned before[1]?
> > >
> > > > The one more combination to consider is when someone takes a dump of
> > > > an older version and loads it into a newer version. For example, where
> > > > users dump from 17.5 and then restore in a newer version, say 17.6
> > > > (which has our fix), the restore will fail due to newer restrictions
> > > > added by this patch. Do we need to do anything about it?
> > >
> > > I think it could be a significant side-effect and we need to do
> > > something about that.
> > >
> >
> > After giving it more thought, we have an opinion that this
> > side-effect/issue is unlikely to occur if users follow our
> > documentation properly.
> >
> > The recommended approach for upgrading between minor versions is to
> > shut down the server and replace the binaries. See 'To update between
> > compatible versions' in [1].
> >
> > Also it is recommended in docs that we use pg_dump from the newer
> > version of PostgreSQL. See 'It is recommended that you use the
> > pg_dump' in [2]. This particular recommendation is in the Upgrade doc.
> > If needed, we can make a similar recommendation in any of our failover
> > specific docs as well, mentioning this particular case.
> >
> > In brief, our overall understanding is that a) pg_dump is mainly used
> > for major versions upgrade b) pg_dump of higher version is used.
> > Please let us know if your understanding is different here.
>
> I agree that the main use case of pg_dump is major version upgrading
> but it's not limited to that use case. There might be some users who
> have taken backups using pg_dump for truly backup purposes. I'm not
> sure we've had such compatibility breakage in a minor version release.
>
> > Beyond these steps, we could not find any better solution for the
> > pointed case. But we are open to exploring and implementing any
> > alternative solutions you may have. Feedback is most welcome here.
>
> I'll share alternative ideas if I come up.
>

Another idea was proposed at the beginning of this thread in [1]; quoting it here:

-------------
Another idea considered is to prevent the slot that enables two-phase
decoding from being synced to standby. IOW, this means displaying the
failover field as false in the view, if there is any possibility that
transactions prepared before the two_phase_at position exist (e.g., if
restart_lsn is less than two_phase_at).
-------------

I believe this approach has fewer limitations and maintains
compatibility with the existing pg_dump workflow.

The proposal is to show failover as false in pg_replication_slots when
restart_lsn is earlier than two_phase_at. This scenario can only occur
during the activation of two_phase. In PG17, since enabling two_phase
via ALTER SUBSCRIPTION is not permitted, this situation effectively
happens only at the creation of the subscription. At that point, if
slot synchronization is attempted, it will not synchronize the slot.
Synchronization (and creation of the new slot) will begin only once
restart_lsn surpasses two_phase_at.
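
To make the proposed rule concrete, here is a minimal sketch of the
visibility check in Python. It is not taken from any patch: the function
and parameter names are hypothetical, LSNs are modelled as plain integers,
and treating restart_lsn equal to two_phase_at as safe is an assumption.

```python
# Illustrative sketch only: names are hypothetical, not PostgreSQL source.

def parse_lsn(text: str) -> int:
    """Convert a pg_lsn string such as '0/3000060' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def displayed_failover(failover: bool, two_phase: bool,
                       restart_lsn: int, two_phase_at: int) -> bool:
    """Return the failover value the view would show under the proposal.

    A slot is hidden from synchronization (failover shown as false) while
    transactions prepared before two_phase_at might still exist, which is
    approximated here as restart_lsn < two_phase_at.
    """
    if not failover:
        return False
    # Assumption: the boundary case restart_lsn == two_phase_at is safe.
    if two_phase and restart_lsn < two_phase_at:
        return False
    return True


# Right after subscription creation: restart_lsn is still behind two_phase_at.
early = displayed_failover(True, True,
                           parse_lsn("0/3000000"), parse_lsn("0/3000100"))
# Later: restart_lsn has moved past two_phase_at, so the slot can be synced.
later = displayed_failover(True, True,
                           parse_lsn("0/3000200"), parse_lsn("0/3000100"))
print(early, later)  # prints: False True
```

In this sketch a slot without two_phase is unaffected, so its failover
value is shown unchanged.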

The only limitation I can identify here is that if a failover occurs
during that brief window, the standby might not yet have the slot
created. However, I believe the likelihood of this happening is quite
low. Overall, this looks like a safer approach.

Please let me know your thoughts on this.

 [1]:
https://www.postgresql.org/message-id/OS0PR01MB57161D9BB5409F229564957994AD2%40OS0PR01MB5716.jpnprd01.prod.outlook.com

thanks
Shveta