Re: Introduce XID age and inactive timeout based replication slot invalidation - Mailing list pgsql-hackers

From vignesh C
Subject Re: Introduce XID age and inactive timeout based replication slot invalidation
Date
Msg-id CALDaNm0X_vgAxKPT+c14yqKcgE5-x4XBdXsCAVqD6_aa-QYUvg@mail.gmail.com
Whole thread Raw
In response to Re: Introduce XID age and inactive timeout based replication slot invalidation  (vignesh C <vignesh21@gmail.com>)
Responses Re: Introduce XID age and inactive timeout based replication slot invalidation
Re: Introduce XID age and inactive timeout based replication slot invalidation
List pgsql-hackers
On Tue, 4 Feb 2025 at 19:56, Nisha Moond <nisha.moond412@gmail.com> wrote:
>
> Here is v69 patch set addressing above and Kuroda-san's comments in [1].

Few minor suggestions:
1) In the slot invalidation reporting below:
+               case RS_INVAL_IDLE_TIMEOUT:
+                       Assert(inactive_since > 0);
+
+                       /* translator: second %s is a GUC variable name */
+                       appendStringInfo(&err_detail, _("The slot's
idle time %s exceeds the configured \"%s\" duration."),
+
timestamptz_to_str(inactive_since),
+
"idle_replication_slot_timeout");
+                       /* translator: %s is a GUC variable name */
+                       appendStringInfo(&err_hint, _("You might need
to increase \"%s\"."),
+
"idle_replication_slot_timeout");

It is logged like:
2025-02-05 10:04:11.616 IST [330567] DETAIL:  The slot's idle time
2025-02-05 10:02:49.131631+05:30 exceeds the configured
"idle_replication_slot_timeout" duration.

Here even though we tell idle time, we are logging the inactive_since
value which kind of gives a wrong meaning.

How about we change it to:
The slot has been inactive since 2025-02-05 10:02:49.131631+05:30,
which exceeds the configured "idle_replication_slot_timeout" duration.

2) Here we have mentioned about invalidation happens only for a)
released slots b) inactive slots replication slots c) slot where
communication between pub and sub is down
+                * XXX: Slot invalidation due to 'idle_timeout' applies only to
+                * released slots, and is based on the
'idle_replication_slot_timeout'
+                * GUC. Active slots currently in use for replication
are excluded to
+                * prevent accidental invalidation. Slots where
communication between
+                * the publisher and subscriber is down are also
excluded, as they are
+                * managed by the 'wal_sender_timeout'.
+                */
+               InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+
            0,
+
            InvalidOid,
+
            InvalidTransactionId);
a) Can we include about slots which does not reserve WAL are also not
considered.
b) Could we present this in a bullet-point format like the following:
+                * XXX: Slot invalidation due to 'idle_timeout' applies only to:
+                * 1) released slots, and is based on the
'idle_replication_slot_timeout'
+                * GUC. 2) Active slots currently in use for
replication are excluded to
+                * prevent accidental invalidation. 3) Slots where
communication between
+                * the publisher and subscriber is down are also
excluded, as they are
+                * managed by the 'wal_sender_timeout'.
+                */
c) While I was initially reviewing the patch I also had the similar
thoughts on my mind, if we could mention the one like "Slots where
communication between the publisher and subscriber is down are also
excluded, as they are managed by the 'wal_sender_timeout'" in the
documentation it might be good.

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: new commitfest transition guidance
Next
From: Shlok Kyal
Date:
Subject: Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.