Re: Introduce XID age and inactive timeout based replication slot invalidation - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Introduce XID age and inactive timeout based replication slot invalidation
Msg-id ZfvLbCfp4oH9iFEL@ip-10-97-1-34.eu-west-3.compute.internal
In response to Re: Introduce XID age and inactive timeout based replication slot invalidation  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Introduce XID age and inactive timeout based replication slot invalidation
List pgsql-hackers
Hi,

On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote:
> On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote:
> > >
> > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk
> > > replication slot data structure.
> >
> > Should last_inactive_at be tracked on disk? Say the engine is down for a period
> > of time > inactive_timeout then the slot will be invalidated after the engine
> > re-start (if no activity before we invalidate the slot). Should the time the
> > engine is down be counted as "inactive" time? I've the feeling it should not, and
> > that we should only take into account inactive time while the engine is up.
> >
> 
> Good point. The question is how do we achieve this without persisting
> the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot
> had some valid value before we shut down but it still didn't cross the
> configured 'inactive_timeout' value, so, we won't be able to
> invalidate it. Now, after the restart, as we don't know the
> last_inactive_at's value before the shutdown, we will initialize it
> with 0 (this is what Bharath seems to have done in the latest
> v13-0002* patch). After this, even if walsender or backend never
> acquires the slot, we won't invalidate it. OTOH, if we track
> 'last_inactive_at' on the disk, after, restart, we could initialize it
> to the current time if the value is non-zero. Do you have any better
> ideas?
> 

I think that setting last_inactive_at at restart makes sense if the slot
has been active previously. I think the rationale is that such a slot is holding
xmin/catalog_xmin, and that we don't want to prevent row removal for longer than
the timeout.

So what about relying on xmin/catalog_xmin instead, in the following way?

- For physical slots, if xmin is set, then set last_inactive_at to the current
time at restart (else zero).

- For logical slots, it's not the same, as the catalog_xmin is set at slot
creation time. So what about setting last_inactive_at to the current time at
restart, but also at creation time, for logical slots? (Setting it to zero at
creation time (as we do in v13) does not look right, given that the slot is
"already" holding a catalog_xmin).

That way, we'd ensure that we are not holding rows for longer than the timeout,
and we don't need to persist last_inactive_at.
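
To make the rule above concrete, here is a minimal sketch as a toy Python model.
It is not slot.c code and not part of any patch: Slot, on_create, on_restart and
timed_out are hypothetical names, timestamps are plain seconds, and the ">="
comparison and the "zero means never timed out" behaviour are assumptions made
only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    """Toy stand-in for a replication slot; not PostgreSQL's ReplicationSlot."""
    kind: str                            # "physical" or "logical"
    xmin: Optional[int] = None           # set once a physical slot holds back xmin
    catalog_xmin: Optional[int] = None   # set at creation time for logical slots
    last_inactive_at: float = 0.0        # kept in memory only; 0 means "not set"


def on_create(slot: Slot, now: float) -> None:
    # A logical slot already holds a catalog_xmin at creation,
    # so its inactivity clock starts right away.
    slot.last_inactive_at = now if slot.kind == "logical" else 0.0


def on_restart(slot: Slot, now: float) -> None:
    # Nothing is read back from disk: the value is derived from the slot state.
    if slot.kind == "logical" or slot.xmin is not None:
        slot.last_inactive_at = now
    else:
        slot.last_inactive_at = 0.0


def timed_out(slot: Slot, now: float, inactive_timeout: float) -> bool:
    # Assumption: a slot whose clock is unset (0) is never invalidated here.
    return (slot.last_inactive_at > 0
            and now - slot.last_inactive_at >= inactive_timeout)
```

With this model, a physical slot holding an xmin that comes back at t=10000 with
a 500 second timeout is not timed out at t=10100, however long the engine was
down, but is timed out at t=10600 if nothing acquires it in between.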

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


