Re: Introduce XID age and inactive timeout based replication slot invalidation - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Introduce XID age and inactive timeout based replication slot invalidation
Date
Msg-id CAA4eK1+C3QxAZ6UtpSn9umGB33YjtYJgxA_xVXCacATEfZG4YQ@mail.gmail.com
In response to Re: Introduce XID age and inactive timeout based replication slot invalidation  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Introduce XID age and inactive timeout based replication slot invalidation
List pgsql-hackers
On Fri, Mar 8, 2024 at 10:42 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > You might want to consider its interaction with sync slots on standby.
> > Say, there is no activity on slots in terms of processing the changes
> > for slots. Now, we won't perform sync of such slots on standby showing
> > them inactive as per your new criteria, whereas the same slots could still
> > be valid on primary as the walsender is still active. This may be more
> > of a theoretical point as in a running system there will probably be
> > some activity, but I think this needs some thought.
>
> I believe the xmin and catalog_xmin of the sync slots on the standby
> keep advancing depending on the slots on the primary, no? If yes, the
> XID age based invalidation shouldn't be a problem.
>
> I believe there are no walsenders started for the sync slots on the
> standbys, right? If yes, the inactive timeout based invalidation also
> shouldn't be a problem. Because, the inactive timeouts for a slot are
> tracked only for walsenders because they are the ones that typically
> hold replication slots for longer durations and for real replication
> use. We did a similar thing in a recent commit [1].
>
> Is my understanding right?
>

Yes, your understanding is correct. I wanted us to consider making new
parameters like 'inactive_replication_slot_timeout' slot-level options
instead of GUCs. I don't think this new parameter is similar to
'max_slot_wal_keep_size', which leads to truncation of WAL at the global
level and then invalidates the appropriate slots. OTOH,
'inactive_replication_slot_timeout' doesn't appear to have a similar
global effect. The other thing we should consider is what happens if
checkpoints occur at an interval greater than
'inactive_replication_slot_timeout'. Should we consider doing it via
some other background process, or do we think the checkpointer is the
best we can have?
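
To make the checkpoint-interval concern concrete, here is a rough
sketch in Python (not PostgreSQL code). It assumes the timeout is only
checked when a checkpoint runs and that checkpoints fire at fixed
multiples of the interval; the function name and arguments are
hypothetical and exist only for illustration.

```python
def invalidation_time(inactive_since, timeout, checkpoint_interval):
    """Time (in seconds) at which a checkpoint-driven check would first
    notice that a slot has exceeded its inactive timeout.

    Assumption: checkpoints run at 0, checkpoint_interval,
    2 * checkpoint_interval, ... and nothing else checks the timeout.
    """
    deadline = inactive_since + timeout
    # Integer ceiling division: index of the first checkpoint that
    # happens at or after the deadline.
    n = -(-deadline // checkpoint_interval)
    return n * checkpoint_interval


# Slot goes inactive at t=10s with a 60s timeout, so the deadline is t=70s.
late = invalidation_time(10, 60, 300)   # checkpoints every 300s
early = invalidation_time(10, 60, 30)   # checkpoints every 30s
print(late - 70, early - 70)            # extra delay past the deadline
```

Under these assumptions, a checkpoint interval larger than the timeout
means the slot can outlive its timeout by most of a checkpoint cycle,
which is the case for considering a more frequent background check.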

>
> Do you still see any problems with it?
>

Sorry, I haven't done any detailed review yet, so I can't say with
confidence whether there is any problem w.r.t. sync slots.

--
With Regards,
Amit Kapila.


