Re: Question about InvalidatePossiblyObsoleteSlot() - Mailing list pgsql-hackers
| From | Masahiko Sawada |
|---|---|
| Subject | Re: Question about InvalidatePossiblyObsoleteSlot() |
| Date | |
| Msg-id | CAD21AoAFn_uPGDXsmg42wUXRZ7_yhiUyDAa8WyCW14DRCPZqtg@mail.gmail.com Whole thread Raw |
| In response to | Re: Question about InvalidatePossiblyObsoleteSlot() (Michael Paquier <michael@paquier.xyz>) |
| Responses |
Re: Question about InvalidatePossiblyObsoleteSlot()
|
| List | pgsql-hackers |
On Wed, Oct 29, 2025 at 12:02 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Oct 28, 2025 at 11:53:26AM -0700, Masahiko Sawada wrote: > > On Tue, Oct 28, 2025 at 1:58 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Apologies for not following closely what has been happening on this > thread. I have just not been paying much attention, and Sawada-san > has just poked at me about its existence today, also regarding the > fact that my fingerprints are all over the place (implying that I'm an > owner here, even in light of 105b2cb3361 that has introduced the trick > to bypass the creation of the annoying standby snapshot records that > distabilized the tests). > > >> I see, you mean that the tests are stable now (thanks to 105b2cb3361) but > >> that we should still do something for "production" cases? (i.e not making use > >> of injection points). > > > > Yes. While it seems we might want to review the past discussion, if > > we've concluded such behavior is problematic behavior and could > > confuse users, we can do something like improving the > > invalidation/termination reports. Or we can do nothing if the current > > reporting is fine. > > What do you mean exactly here? An improvement in the reports done > when the invalidations are kicked sounds separate to me, that may not > be something to touch when it comes to the back-branches. I thought the commit 818fefd8fd4 originally came from the complaint that we reported the cause of terminating the process could differ from the cause of invalidating the slot (or even we didn't report anything if we skipped the slot invalidation), so I proposed to report explicitly that the slot's restart_lsn gets recovered and we therefore skipped to invalidate it[1]. But I agree it's a separate discussion. > > Anyway, coming back to the patch, the argument of relying on the > slot's restart_lsn and the two xmins to restore the pre-818fefd8fd4 > sounds good to me in light of 105b2cb3361. BTW do you think restoring the pre-818fefd8fd4 behavior is going to be backpatched? > > Extra argument: Is the fact that RS_INVAL_WAL_REMOVED (and optionally > RS_INVAL_IDLE_TIMEOUT) is used only by the checkpointer something that > we may want to document in the shape of an assertion at the beginning > of InvalidateObsoleteReplicationSlots() with an extra condition based > on the backend type? Based on the fact that restart_lsn could be > updated across two loops of the invalidation logic done by the > checkpointer, the answer to my own question is "no" and we may want to > trigger this reason from other contexts. Just wondering if that's > worth doing, as it has been mentioned upthread. I don't think we need to put the assertion there, but it would be good if we could mention somewhere as comments why it's safe to skip the slot invalidation when the slot's xmin or restart_lsn get advanced across two loops. Regards, [1] https://www.postgresql.org/message-id/CAD21AoBvL8teXPEXK67_DaUsrft16QQ6XyGvuZVCnT-o8nH2ZA%40mail.gmail.com -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: