Re: Introduce XID age and inactive timeout based replication slot invalidation - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Introduce XID age and inactive timeout based replication slot invalidation
Date
Msg-id CALj2ACX0FVrx9f0c2g19Rb-WAJhm3wdyttde3YQtkotL6bUveA@mail.gmail.com
Whole thread Raw
In response to Re: Introduce XID age and inactive timeout based replication slot invalidation  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Introduce XID age and inactive timeout based replication slot invalidation
Re: Introduce XID age and inactive timeout based replication slot invalidation
List pgsql-hackers
Hi,

On Sat, Apr 13, 2024 at 9:36 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> There was a point raised by Amit
> https://www.postgresql.org/message-id/CAA4eK1K8wqLsMw6j0hE_SFoWAeo3Kw8UNnMfhsWaYDF1GWYQ%2Bg%40mail.gmail.com
> on when to do the XID age based invalidation - whether in checkpointer
> or when vacuum is being run or whenever ComputeXIDHorizons gets called
> or in autovacuum process. For now, I've chosen the design to do these
> new invalidation checks in two places - 1) whenever the slot is
> acquired and the slot acquisition errors out if invalidated, 2) during
> checkpoint. However, I'm open to suggestions on this.

Here are my thoughts on when to do the XID age invalidation. In all
the patches sent so far, the XID age invalidation happens in two
places - one during the slot acquisition, and another during the
checkpoint. As the suggestion is to do it during the vacuum (manual
and auto), so that even if the checkpoint isn't happening in the
database for whatever reasons, a vacuum command or autovacuum can
invalidate the slots whose XID is aged.

An idea is to check for XID age based invalidation for all the slots
in ComputeXidHorizons() before it reads replication_slot_xmin and
replication_slot_catalog_xmin, and obviously before the proc array
lock is acquired. A potential problem with this approach is that the
invalidation check can become too aggressive as XID horizons are
computed from many places.

Another idea is to check for XID age based invalidation for all the
slots in higher levels than ComputeXidHorizons(), for example in
vacuum() which is an entry point for both vacuum command and
autovacuum. This approach seems similar to vacuum_failsafe_age GUC
which checks each relation for the failsafe age before vacuum gets
triggered on it.

Does anyone see any issues or risks with the above two approaches or
have any other ideas? Thoughts?

I attached v40 patches here. I reworded some of the ERROR messages,
and did some code clean-up. Note that I haven't implemented any of the
above approaches yet.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Using LibPq in TAP tests via FFI
Next
From: Chapman Flack
Date:
Subject: Re: ON ERROR in json_query and the like