Re: Question about InvalidatePossiblyObsoleteSlot() - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Question about InvalidatePossiblyObsoleteSlot()
Date
Msg-id aPCriekyLghnKeNI@ip-10-97-1-34.eu-west-3.compute.internal
In response to Re: Question about InvalidatePossiblyObsoleteSlot()  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Hi,

On Wed, Oct 15, 2025 at 04:24:03PM -0700, Masahiko Sawada wrote:
> On Tue, Oct 14, 2025 at 9:51 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > We don't really report an "invalidation", what we report is:
> >
> > LOG:  terminating process 3998707 to release replication slot "logical_slot"
> > DETAIL:  The slot's restart_lsn 0/00842480 exceeds the limit by 2874240 bytes.
> > HINT:  You might need to increase "max_slot_wal_keep_size".
> >
> > and we terminate the process:
> >
> > FATAL:  terminating connection due to administrator command
> >
> > We are not reporting:
> >
> > DETAIL:  This replication slot has been invalidated due to "wal_removed".
> >
> > and the slot is still valid.
> >
> > That's the pre 818fefd8fd4 behavior.
> 
> Thank you for the clarification! Understood.
> 
> > Ideally, I think that we should not report anything and not terminate the
> > process. I did not look at it, maybe we could look at it as a second step (first
> > step being to restore the pre 818fefd8fd4 behavior)?
> 
> I find that reporting of terminating a process having a possibly
> obsolete slot is fine, but reading some related threads[1][2] it seems
> to me that a problem we want to avoid is that we report "terminated"
> without leading to an "obsolete" message.

Right, the focus at that time was around the invalidations related to the
xmin conflicts (mainly because it felt unsafe to "ignore" the invalidation
for them). Then later the restart_lsn was added to the discussion [1].

But after more thought, I do think it's safe for the restart_lsn case
(for the reasons mentioned above).
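To make the restart_lsn scenario concrete, here is a toy model in plain
Python. It is only a sketch of the behavior described in this thread, not the
slot.c code: the Slot class, invalidate_possibly_obsolete() and the terminate
callback are hypothetical names, and the limit is assumed to be the logged
restart_lsn plus the 2874240 bytes from the DETAIL line.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Slot:
    """Hypothetical stand-in for a replication slot (not a PostgreSQL struct)."""
    name: str
    restart_lsn: int
    active_pid: Optional[int] = None
    invalidated: bool = False


def invalidate_possibly_obsolete(slot: Slot, oldest_lsn: int,
                                 terminate: Callable[[Slot], None]) -> List[str]:
    """Toy model: terminate the owner, then re-check before invalidating."""
    events: List[str] = []
    while True:
        # Re-evaluated on every pass: the slot may have caught up meanwhile.
        if slot.invalidated or slot.restart_lsn >= oldest_lsn:
            return events
        if slot.active_pid is None:
            # Nobody holds the slot and it is still behind: invalidate it.
            slot.invalidated = True
            events.append("invalidated")
            return events
        # Slot is held by a process: ask it to go away and loop to re-check.
        events.append("terminated")
        terminate(slot)


RESTART_LSN = 0x00842480          # 0/00842480 from the log above
LIMIT = RESTART_LSN + 2874240     # assumption: "exceeds the limit by" distance


def exit_after_advancing(slot: Slot) -> None:
    """Simulated owner: advances restart_lsn, then releases the slot."""
    slot.restart_lsn = LIMIT
    slot.active_pid = None


def exit_without_advancing(slot: Slot) -> None:
    """Simulated owner: releases the slot without advancing restart_lsn."""
    slot.active_pid = None


recovered = Slot("logical_slot", RESTART_LSN, active_pid=3998707)
recovered_events = invalidate_possibly_obsolete(recovered, LIMIT, exit_after_advancing)

stale = Slot("logical_slot", RESTART_LSN, active_pid=3998707)
stale_events = invalidate_possibly_obsolete(stale, LIMIT, exit_without_advancing)
```

In the first run only "terminated" is recorded and the slot stays valid, which
is the case discussed here. In the second run the slot is still behind after
the owner exits, so it gets invalidated.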

> Does it make sense to report
> explicitly that the slot's restart_lsn gets recovered and we therefore
> skipped to invalidate it?

I'm not sure. The existing logging shows when we invalidate a slot, so the
absence of that message indicates we skipped it. Users can also check the
slot's status in the pg_replication_slots view if needed.
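
As an illustration of "absence of that message", a small Python sketch that
scans server log lines and lists slots whose owner was terminated without a
later invalidation message. The "terminating process" pattern is taken from
the log quoted above. The "invalidating obsolete replication slot" pattern is
my assumption about the wording of the invalidation message, and
skipped_invalidations() is a hypothetical helper.

```python
import re
from typing import Dict, Iterable

# Pattern taken from the LOG line quoted earlier in this message.
TERMINATE_RE = re.compile(
    r'terminating process (\d+) to release replication slot "([^"]+)"')
# Assumed wording of the invalidation message; adjust to the actual log text.
INVALIDATE_RE = re.compile(
    r'invalidating obsolete replication slot "([^"]+)"')


def skipped_invalidations(log_lines: Iterable[str]) -> Dict[str, int]:
    """Return {slot_name: pid} for terminations not followed by an invalidation."""
    pending: Dict[str, int] = {}
    for line in log_lines:
        m = TERMINATE_RE.search(line)
        if m:
            pending[m.group(2)] = int(m.group(1))
            continue
        m = INVALIDATE_RE.search(line)
        if m:
            # The slot was invalidated after all, so it was not skipped.
            pending.pop(m.group(1), None)
    return pending


sample_log = [
    'LOG:  terminating process 3998707 to release replication slot "logical_slot"',
    "DETAIL:  The slot's restart_lsn 0/00842480 exceeds the limit by 2874240 bytes.",
    'HINT:  You might need to increase "max_slot_wal_keep_size".',
    "FATAL:  terminating connection due to administrator command",
]
skipped = skipped_invalidations(sample_log)
```

For the sample above, the helper reports logical_slot with pid 3998707, since
no invalidation line follows the termination.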

Regards,

[1]: https://www.postgresql.org/message-id/CALj2ACUo_rDTonZCkqdSnsc3tT5_cFcJQHSQrsyAYyO5MLO52A%40mail.gmail.com

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


