On Mon, Jun 17, 2024 at 05:55:04PM +0530, Bharath Rupireddy wrote:
> Here are my thoughts on when to do the XID age invalidation. In all
> the patches sent so far, the XID age invalidation happens in two
> places - one during the slot acquisition, and another during the
> checkpoint. As the suggestion is to do it during the vacuum (manual
> and auto), so that even if the checkpoint isn't happening in the
> database for whatever reasons, a vacuum command or autovacuum can
> invalidate the slots whose XID is aged.
+1. IMHO this is a principled choice. The similar max_slot_wal_keep_size
parameter is considered where it arguably matters most: when we are trying
to remove/recycle WAL segments. Since this parameter is intended to
prevent the server from running out of space, it makes sense that we'd
apply it at the point where we are trying to free up space. The proposed
max_slot_xid_age parameter is intended to prevent the server from running
out of transaction IDs, so it follows that we'd apply it at the point where
we reclaim them, which happens to be vacuum.
> An idea is to check for XID age based invalidation for all the slots
> in ComputeXidHorizons() before it reads replication_slot_xmin and
> replication_slot_catalog_xmin, and obviously before the proc array
> lock is acquired. A potential problem with this approach is that the
> invalidation check can become too aggressive as XID horizons are
> computed from many places.
>
> Another idea is to check for XID age based invalidation for all the
> slots in higher levels than ComputeXidHorizons(), for example in
> vacuum() which is an entry point for both vacuum command and
> autovacuum. This approach seems similar to vacuum_failsafe_age GUC
> which checks each relation for the failsafe age before vacuum gets
> triggered on it.
I don't presently have any strong opinion on where this logic should go,
but in general, I think we should only invalidate slots if invalidating
them would allow us to advance the vacuum cutoff. If the cutoff is held
back by something else, I don't see a point in invalidating slots because
we'll just be breaking replication in return for no additional reclaimed
transaction IDs.
--
nathan