On Monday, February 24, 2025 10:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
(The previous email was blocked, here is another attempt)
>
> On Sat, Feb 22, 2025 at 11:26 AM Shlok Kyal <shlok.kyal.oss@gmail.com>
> wrote:
> >
> > Adding Amit to the thread.
> >
>
> Thanks.
>
> Looking at the original complaint:
>
> > It says
> > * To avoid potential issues with the slot synchronization
> where the
> > * restart_lsn of a replication slot can go backward, we set
> the
> > * failover option to false here. This situation occurs when
> a slot
> > * on the primary server is dropped and immediately
> replaced with a
> > * new slot of the same name, created by copying from
> another existing
> > * slot. However, the slot synchronization will only observe
> the
> > * restart_lsn of the same slot going backward.
>
> It would be better to update the comments as well to make the potential issues
> with slot synchronization clear or mention the reference of other place where
> we have comments related to this race condition. Also, I think it is better to
> write about two_phase in the comments as well as in docs unless already
> mentioned. If you agree with updating the comments as well, shall we redirect
> this to hackers?
(Redirect to -hackers)
>
> > >
> > > IIUC the two_phase field is also not copied from the source slot. I
> > > think we can clarify in the doc that these two fields are not copied.
> > >
>
> +1.
Here is the new V3 patch set which updated the comments to make the
issue clearer.
After thinking more on the two_phase option, I didn't find an issue that prevent
us from copying its option value. So, it’s more intuitive to me to just
copy its value instead of adding doc for it. The 0002 includes the same and I will
keep testing to ensure there are no other issues missed.
Best Regards,
Hou zj