Re: Potential data loss due to race condition during logical replication slot creation - Mailing list pgsql-bugs

From Masahiko Sawada
Subject Re: Potential data loss due to race condition during logical replication slot creation
Date
Msg-id CAD21AoBkbFZ8Fr51DY9fBWtpP-+3RGVrMtH_O96V8vgSs5eqkw@mail.gmail.com
Whole thread Raw
In response to Re: Potential data loss due to race condition during logical replication slot creation  (Michael Paquier <michael@paquier.xyz>)
List pgsql-bugs
On Mon, Jul 8, 2024 at 8:56 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Jul 05, 2024 at 11:52:52PM +0900, Masahiko Sawada wrote:
> > I've attached updated patches for HEAD and pg17 for now (I will create
> > the patch for other backbranches).
> >
> > In the patches, I used a different approach in between HEAD and
> > backbranches. On HEAD, we store a flag indicating whether or not we
> > should skip snapshot restores into the SnapBuild struct and set it
> > only while finding the start point. Therefore we have to bump
> > SNAPBUILD_VERSION. On backbranches, I used the approach we discussed
> > above; store the flag in LogicalDecodingContext and set it when
> > creating the LogicalDecodingContext for a new logical slot. A possible
> > downside of the approach taken for backbranches is that we implicitly
> > require for users to use the same LogicalDecodingContext for  both
> > initializing the context for a new slot and finding its start point.
> > IIUC it was not strictly required. This restriction would not be a
> > problem at least in the core, but I'm not sure if there are no
> > external extensions that create a logical slot in that way. This is
> > the main reason why I used a different approach on HEAD and
> > backbranches. Therefore, if it does not matter, I think we can use the
> > same approach on all branches, which is better in terms of
> > maintainability.
>
> --- a/src/backend/replication/logical/snapbuild.c
> +++ b/src/backend/replication/logical/snapbuild.c
> @@ -189,6 +189,9 @@ struct SnapBuild
>         /* Indicates if we are building full snapshot or just catalog one. */
>         bool            building_full_snapshot;
>
> +       /* Indicates if we are finding the start point to extract changes */
> +       bool            finding_start_point;
> +
>
> FYI, I think that it is still OK to bump SNAPBUILD_VERSION on
> REL_17_STABLE.  That will reduce by 1 year the time window required to
> maintain the tweaks implemented for the versions in the back-branches.
> So I'd suggest to do what the v17 version of the patch does for ~16,
> and use the snapshot format changes in 17~.

Thank you for the comment.

Agreed. If we get consensus on this approach to go, I'll use the patch
changing the SnapBuild struct for 17 and HEAD.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-bugs by date:

Previous
From: Richard Guo
Date:
Subject: Re: BUG #18522: Wrong results with Merge Right Anti Join, inconsistent with Merge Anti Join
Next
From:
Date:
Subject: AW: Error when installing PostgreSQL 16.3.2 under system context NT AUTHORITY\SYSTEM