Re: Newly created replication slot may be invalidated by checkpoint - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Newly created replication slot may be invalidated by checkpoint
Msg-id CAA4eK1KEZ8b+MMD_vVXPe3AA_eSkrfknFaed5Yhn-sysCuZEsQ@mail.gmail.com
In response to RE: Newly created replication slot may be invalidated by checkpoint  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
List pgsql-hackers
On Thu, Nov 20, 2025 at 4:07 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Thursday, November 20, 2025 4:26 PM Vitaly Davydov <v.davydov@postgrespro.ru> wrote:
>
> >
> > Concerning reserve_wal_for_local_slot: it seems to be used in the
> > synchronization of failover logical slots. To me, it is tricky to change the
> > restart_lsn of a synced logical slot to RedoRecPtr, because it may lead to
> > problems with logical replication using such a slot after the replica is
> > promoted. But this seems to be an architectural problem, and it is not
> > related to the problems solved by the patch.
>
> I think this is not an issue because if we use the redo pointer instead of the
> remote restart_lsn as the initial value, the synced slot won't be marked as
> sync-ready, so the user cannot use it after promotion (this is also well
> documented). This is also the existing behavior before the patch, e.g., if the
> required WALs were removed, the oldest available WAL was used as the initial
> value, similarly resulting in the slot not being sync-ready.
>

Would it be better to discuss this in a separate thread? Although this
is related to the original problem, it is in a separate part of the code
(slotsync), which I think can have a separate fix, especially since the
fix is also somewhat different.

> >
> > Changing the lock mode to EXCLUSIVE in
> > ReplicationSlotsComputeRequiredLSN may affect performance when a lot of
> > slots are advanced within a short period of time. It may affect walsender
> > performance: the walsender advances logical or physical slots when it
> > receives a confirmation from the replica. I guess slot advancement may be
> > a pretty frequent operation.
>
> Yes, I had the same thought and considered a simple alternative (similar to your
> suggestion below): use an exclusive lock only when updating the slot.restart_lsn
> during WAL reservation, while continuing to use a shared lock in the computation
> function. Additionally, place XLogSetReplicationSlotMinimumLSN() under the lock.
> This approach will also help serialize the process.
>

Can we discuss this as well in a separate thread?

--
With Regards,
Amit Kapila.
