Re: Question about InvalidatePossiblyObsoleteSlot() - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Question about InvalidatePossiblyObsoleteSlot()
Date
Msg-id CAD21AoAFn_uPGDXsmg42wUXRZ7_yhiUyDAa8WyCW14DRCPZqtg@mail.gmail.com
Whole thread Raw
In response to Re: Question about InvalidatePossiblyObsoleteSlot()  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Question about InvalidatePossiblyObsoleteSlot()
List pgsql-hackers
On Wed, Oct 29, 2025 at 12:02 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Oct 28, 2025 at 11:53:26AM -0700, Masahiko Sawada wrote:
> > On Tue, Oct 28, 2025 at 1:58 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
>
> Apologies for not following closely what has been happening on this
> thread.  I have just not been paying much attention, and Sawada-san
> has just poked at me about its existence today, also regarding the
> fact that my fingerprints are all over the place (implying that I'm an
> owner here, even in light of 105b2cb3361 that has introduced the trick
> to bypass the creation of the annoying standby snapshot records that
> distabilized the tests).
>
> >> I see, you mean that the tests are stable now (thanks to 105b2cb3361) but
> >> that we should still do something for "production" cases? (i.e not making use
> >> of injection points).
> >
> > Yes. While it seems we might want to review the past discussion, if
> > we've concluded such behavior is problematic behavior and could
> > confuse users, we can do something like improving the
> > invalidation/termination reports. Or we can do nothing if the current
> > reporting is fine.
>
> What do you mean exactly here?  An improvement in the reports done
> when the invalidations are kicked sounds separate to me, that may not
> be something to touch when it comes to the back-branches.

I thought the commit 818fefd8fd4 originally came from the complaint
that we reported the cause of terminating the process could differ
from the cause of invalidating the slot (or even we didn't report
anything if we skipped the slot invalidation), so I proposed to report
explicitly that the slot's restart_lsn gets recovered and we therefore
skipped to invalidate it[1]. But I agree it's a separate discussion.

>
> Anyway, coming back to the patch, the argument of relying on the
> slot's restart_lsn and the two xmins to restore the pre-818fefd8fd4
> sounds good to me in light of 105b2cb3361.

BTW do you think restoring the pre-818fefd8fd4 behavior is going to be
backpatched?

>
> Extra argument: Is the fact that RS_INVAL_WAL_REMOVED (and optionally
> RS_INVAL_IDLE_TIMEOUT) is used only by the checkpointer something that
> we may want to document in the shape of an assertion at the beginning
> of InvalidateObsoleteReplicationSlots() with an extra condition based
> on the backend type?  Based on the fact that restart_lsn could be
> updated across two loops of the invalidation logic done by the
> checkpointer, the answer to my own question is "no" and we may want to
> trigger this reason from other contexts.  Just wondering if that's
> worth doing, as it has been mentioned upthread.

I don't think we need to put the assertion there, but it would be good
if we could mention somewhere as comments why it's safe to skip the
slot invalidation when the slot's xmin or restart_lsn get advanced
across two loops.

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoBvL8teXPEXK67_DaUsrft16QQ6XyGvuZVCnT-o8nH2ZA%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Improve docs syntax checking and enable it in the meson build
Next
From: Arne Roland
Date:
Subject: Re: apply_scanjoin_target_to_paths and partitionwise join