On Mon, Mar 31, 2025 at 5:04 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Thu, Mar 27, 2025 at 2:29 PM Amit Kapila wrote:
>
> >
> > I suspect that this can happen in PG17 as well, but I need to think
> > more about it to make a reproducible test case.
>
> After further analysis, I was able to reproduce the same issue [1] in
> PG 17.
>
> However, since the proposed fix requires catalog changes and the issue is not a
> security risk significant enough to justify changing the catalog in back
> branches, we cannot back-patch the same solution.
>
Agreed. In the past, as in commit b6e39ca92e, we have backpatched a
catalog-modifying commit, but that was for a CVE. Users then had to
follow manual steps, as explained in the 9.6.4 release notes [1], which
is cumbersome for them. This is not a security issue, so, in keeping
with past practice, we shouldn't backpatch a catalog-modifying commit.

> Following off-list
> discussions with Amit and Kuroda-san, we are considering disallowing enabling
> failover and two-phase decoding together for a replication slot, as suggested
> in attachment 0002.
>
> Another idea considered is to prevent the slot that enables two-phase decoding
> from being synced to standby. IOW, this means displaying the failover field as
> false in the view, if there is any possibility that transactions prepared
> before the two_phase_at position exist (e.g., if restart_lsn is less than
> two_phase_at). However, implementing this change would require additional
> explanations to users for this new behavior, which seems tricky.
>
I also find this tricky to explain to users. We would need to say that
sometimes the slots won't be synced even if failover is set to true.
Users can verify that by checking the slot properties on the publisher.
Also, on the subscriber, the failover flag in the subscription may
still be true unless we do more engineering to make it false. So, I
prefer to simply disallow setting failover and two_phase together. In
the release notes for 17, we need to recommend that users either
disable failover for subscriptions where two_phase is enabled, or
re-create those subscriptions with two_phase=false and failover=true.
Users may not like it, but I think it is better than getting complaints
that, after promotion of the standby, data is not being replicated to
the subscriber.

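To make the proposed restriction concrete, here is a minimal standalone
sketch of such a check. It is not the code in 0002; SlotOptions and
check_slot_options are hypothetical names used only for illustration,
and the message text is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the options requested when a slot is
 * created or altered; not a PostgreSQL structure.
 */
typedef struct SlotOptions
{
    bool        two_phase;
    bool        failover;
} SlotOptions;

/*
 * Return NULL if the requested combination is acceptable, otherwise a
 * message explaining why it is rejected. The only rule sketched here is
 * that failover and two_phase cannot both be enabled.
 */
static const char *
check_slot_options(const SlotOptions *opts)
{
    if (opts->two_phase && opts->failover)
        return "cannot enable failover for a slot that has two_phase enabled";

    return NULL;
}
```

A real check would report the error through the server's usual error
reporting rather than returning a string.
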
[1] - https://www.postgresql.org/docs/9.6/release-9-6-4.html
--
With Regards,
Amit Kapila.