Re: Improve error handling for invalid slots and ensure a same 'inactive_since' time for inactive slots - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Improve error handling for invalid slots and ensure a same 'inactive_since' time for inactive slots
Date
Msg-id CAA4eK1LvBbe_iaFQMSkG8Q0OmyELu0fKZ8_aUgeb6mXVFhgkOQ@mail.gmail.com
Whole thread Raw
In response to Re: Improve error handling for invalid slots and ensure a same 'inactive_since' time for inactive slots  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Wed, Jan 29, 2025 at 7:50 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Some review comments for patch v1-0001.
>
> ======
> src/backend/replication/logical/slotsync.c
>
> ReplSlotSyncWorkerMain:
>
> 1.
> + /* Use same inactive_since time for all slots */
> + now = GetCurrentTimestamp();
> +
>   LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
>
>   for (int i = 0; i < max_replication_slots; i++)
> @@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
>   /* The slot must not be acquired by any process */
>   Assert(s->active_pid == 0);
>
> - /* Use the same inactive_since time for all the slots. */
> - if (now == 0)
> - now = GetCurrentTimestamp();
> -
>
> AFAICT, this code was *already* ensuring to use the same
> 'inactive_since' even before your patch. The only difference is now
> you are getting the timestamp value up-front instead of a deferred
> assignment.
>

I find the code without a patch better as it may sometimes skip to
call GetCurrentTimestamp().

> So why did you change this (and the code of RestoreSlotFromDisk) to do
> the up-front assignment? Instead, you could have chosen to just leave
> this code as-is, and then modify the RestoreSlotFromDisk code to match
> it.
>
> FWIW, I do prefer what you have done here because it is simpler, but I
> just wondered about the choice because I think some people worry about
> GetCurrentTimestamp overheads and try to avoid calling that wherever
> possible.
>
> ======
> src/backend/replication/slot.c
>
> 2. What about other loops?
>
> AFAICT there are still some other loops where the inactive_since
> timestamps might differ.
>
> e.g. How about this logic in slot.c:
>
> InvalidateObsoleteReplicationSlots:
>
> LOOP:
> for (int i = 0; i < max_replication_slots; i++)
> {
>   ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
>
>   calls InvalidatePossiblyObsoleteSlot(...)
>       which calls ReplicationSlotRelease(...)
>           which assigns now = GetCurrentTimestamp();
>               then slot->inactive_since = now;
> }
>
> ~
>
> So, should you also assign a 'now' value outside this loop and pass
> that timestamp down the calls so they eventually all get assigned the
> same? I don't know, but I guess at least that would require much fewer
> unnecessary calls to GetCurrentTimestamp so that may be a good thing.
>

I don't see this as an optimization worth the effort of changing the
code. This gets called infrequently enough to matter. The same is true
for the code in RestoreSlotFromDisk().

So, overall, I think we should just reject the 0001 patch and focus on 0002.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Next
From: vignesh C
Date:
Subject: Re: CREATE PUBLICATION and foreign partitions