Re: Fix slot synchronization with two_phase decoding enabled - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Fix slot synchronization with two_phase decoding enabled
Msg-id CAA4eK1LqWncUOqKijiafe+Ypt1gQAQRjctKLMY953J79xDBgAg@mail.gmail.com
In response to Re: Fix slot synchronization with two_phase decoding enabled  (Amit Kapila <amit.kapila16@gmail.com>)
Responses RE: Fix slot synchronization with two_phase decoding enabled
List pgsql-hackers
On Fri, Apr 18, 2025 at 9:58 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 17, 2025 at 6:14 PM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> >
> > -----
> > Fix
> > -----
> >
> > I think we should keep the confirmed_flush even if the previous synced
> > restart_lsn/catalog_xmin is newer. Attachments include a patch for the same.
> >
>
> This will fix the case we are facing but adds a new rule for slot
> synchronization. Can we think of a simpler way to fix this by avoiding
> updating other slot fields (like two_phase, two_phase_at) if
> restart_lsn or catalog_xmin of the local slot is ahead of the remote
> slot?
>

Thinking more about this problem, it seems to me that if the
catalog_xmin of the synced slot is allowed to be ahead of the remote
slot's while there is still an open (prepared) transaction, it
could cause data loss.  I mean that after the promotion, some of the
required catalog rows could be removed, and decoding the corresponding
changes (changes from tables affected by DDL) could give unexpected
results. Those rows would be protected on the primary/publisher because
the catalog_xmin on it was still accurate and behind. If this theory
turns out to be true, then this is a drawback/bug of the existing
fast_forward mode code.
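
To make the condition concrete, here is a minimal standalone sketch of
the check being discussed. It is not taken from the slot sync code: the
names xid_t, SlotSketch, xid_precedes and catalog_rows_at_risk are
hypothetical, and only the modulo-2^32 comparison mirrors how
PostgreSQL orders normal transaction IDs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's definitions. */
typedef uint32_t xid_t;

typedef struct SlotSketch
{
    xid_t catalog_xmin;   /* oldest XID whose catalog rows must be kept */
} SlotSketch;

/*
 * Wraparound-aware "a is older than b", using modulo-2^32 arithmetic in
 * the same spirit as the comparison PostgreSQL applies to normal XIDs.
 */
static bool
xid_precedes(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Returns true when the synced (local) slot's catalog_xmin is ahead of
 * the remote slot's. In that state the standby only protects catalog
 * rows from local->catalog_xmin onward, so rows still needed to decode
 * a prepared transaction that started earlier could be removed after
 * promotion.
 */
static bool
catalog_rows_at_risk(const SlotSketch *local, const SlotSketch *remote)
{
    return xid_precedes(remote->catalog_xmin, local->catalog_xmin);
}
```

For example, with a remote catalog_xmin of 700 and a synced
catalog_xmin of 750, catalog_rows_at_risk() returns true: catalog rows
needed by transactions 700 through 749 are protected on the primary but
not on the standby.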

--
With Regards,
Amit Kapila.


