Re: Potential data loss due to race condition during logical replication slot creation - Mailing list pgsql-bugs

From Amit Kapila
Subject Re: Potential data loss due to race condition during logical replication slot creation
Date
Msg-id CAA4eK1LbpagXYw6eP+qBz2SYjQ3x26ZdVCJBkp033aExqA2MbQ@mail.gmail.com
In response to Re: Potential data loss due to race condition during logical replication slot creation  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Potential data loss due to race condition during logical replication slot creation
List pgsql-bugs
On Mon, Jun 24, 2024 at 10:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jun 24, 2024 at 12:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jun 21, 2024 at 12:16 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Approach (a) has a downside: it will lead to tracking more
> > > > (non-catalog) transactions than required, without any benefit for the
> > > > user. Given that, I wouldn't prefer that approach.
> > >
> > > Yes, it will lead to tracking non-catalog-change transactions as well.
> > > If there are many subtransactions, the overhead could be noticeable.
> > > But it happens only once when creating a slot.
> > >
> >
> > True, but it doesn't seem advisable to add such an overhead even
> > during create time without any concrete reason.
> >
> > > Another variant of (a) is that we skip snapshot restores if the
> > > initial_xmin_horizon is a valid transaction id. The
> > > initial_xmin_horizon is always set to a valid transaction id when
> > > initializing the decoding context, e.g. during
> > > CreateInitDecodingContext(). That way, we don't need to track
> > > non-catalog-change transactions. A downside is that this approach
> > > assumes that DecodingContextFindStartpoint() is called with the
> > > decoding context created by CreateInitDecodingContext(), which is true
> > > in the core code, but might not be true in third-party extensions.
> > >
> >
> > I think it is better to be explicit in this case rather than relying
> > on initial_xmin_horizon. So, we can store an in_create/create_in_progress
> > flag in SnapBuild in HEAD and store it in LogicalDecodingContext
> > in back branches.
>
> I think we cannot access the flag in LogicalDecodingContext from
> snapbuild.c, at least in back branches. I've discussed adding such a
> flag in snapbuild.c as a global variable, but I'm slightly hesitant to
> add a global variable besides InitialRunningXacts.
>

I agree that adding a global variable is not advisable. Can we pass
the flag stored in LogicalDecodingContext to snapbuild.c? That might
not be elegant but I don't have any better ideas.
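
To make the idea concrete, here is a minimal, self-contained sketch of
passing the flag as an argument instead of using a global. Everything in
it is a mock-up for illustration: MockSnapBuild, MockDecodingContext,
mock_snapbuild_try_restore() and mock_process_running_xacts() are
hypothetical names, not the real PostgreSQL structs or functions, and the
real signatures would differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the snapshot builder state. */
typedef struct MockSnapBuild
{
    bool restored_from_disk;    /* did we reuse a serialized snapshot? */
} MockSnapBuild;

/* Hypothetical stand-in for the decoding context, which owns the flag. */
typedef struct MockDecodingContext
{
    MockSnapBuild *snapshot_builder;
    bool in_create;             /* true while the slot is being created */
} MockDecodingContext;

/*
 * Stand-in for a snapbuild.c routine. It cannot see the context, so the
 * caller hands the flag over explicitly. While a slot is being created we
 * refuse to restore a serialized snapshot.
 */
static bool
mock_snapbuild_try_restore(MockSnapBuild *builder, bool in_create)
{
    assert(builder != NULL);

    if (in_create)
        return false;           /* skip the restore during slot creation */

    builder->restored_from_disk = true;
    return true;
}

/* Stand-in for the caller that does have the context at hand. */
static bool
mock_process_running_xacts(MockDecodingContext *ctx)
{
    return mock_snapbuild_try_restore(ctx->snapshot_builder, ctx->in_create);
}
```

The cost is an extra parameter on the call path into snapbuild.c, but no
new file-level state is needed.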

> > I think changing SnapBuild means we have to update
> > SNAPBUILD_VERSION, right? Is it a good idea to do that at this point in
> > time, or shall we wait for a new branch to open and change it there? Anyway,
> > it would be a few days away and in the meantime, we can review and
> > keep the patches ready.
>
> I think we should wait to add such changes that break on-disk
> compatibility until a new branch opens. On HEAD, I think we can add a
> new flag in SnapBuild and set it during, say,
> DecodingContextFindStartpoint().
>

Fair enough.
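
For reference, a rough sketch of why a new SnapBuild field goes together
with a version bump. This is a simplified mock-up under my own
assumptions, not the actual snapbuild.c code: the struct layouts,
MOCK_SNAPBUILD_VERSION, and the mock_* functions are all hypothetical
names. The point is only that a serialized snapshot embeds the struct, so
a reader has to reject files written with a different layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical version, bumped because the struct below gained a field. */
#define MOCK_SNAPBUILD_VERSION 2

typedef struct MockSnapBuild
{
    uint32_t state;
    bool in_slot_creation;      /* new flag, changes the serialized layout */
} MockSnapBuild;

/* Simplified on-disk image: a version stamp followed by the struct. */
typedef struct MockSnapBuildOnDisk
{
    uint32_t version;
    MockSnapBuild builder;
} MockSnapBuildOnDisk;

/* Stand-in for the find-startpoint step, where the flag would be set. */
static void
mock_find_startpoint(MockSnapBuild *builder)
{
    assert(builder != NULL);
    builder->in_slot_creation = true;
}

static void
mock_serialize(const MockSnapBuild *builder, MockSnapBuildOnDisk *out)
{
    memset(out, 0, sizeof(*out));
    out->version = MOCK_SNAPBUILD_VERSION;
    out->builder = *builder;
}

/* Returns false when the image was written with another struct layout. */
static bool
mock_restore(const MockSnapBuildOnDisk *in, MockSnapBuild *builder)
{
    if (in->version != MOCK_SNAPBUILD_VERSION)
        return false;

    *builder = in->builder;
    return true;
}
```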

--
With Regards,
Amit Kapila.


