Re: Introduce XID age based replication slot invalidation - Mailing list pgsql-hackers
| From | Bharath Rupireddy |
|---|---|
| Subject | Re: Introduce XID age based replication slot invalidation |
| Date | |
| Msg-id | CALj2ACX_o+dKeAaK76mpAtG646UnDHpGUWziUkCvicVz8mz6=A@mail.gmail.com |
| In response to | Re: Introduce XID age based replication slot invalidation (SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com>) |
| Responses | Re: Introduce XID age based replication slot invalidation |
| List | pgsql-hackers |
Hi,
On Fri, Mar 20, 2026 at 11:29 PM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:
>
> Do you think we need different GUCs for catalog_xmin and xmin? If table bloat is a concern (not catalog bloat), then
> logical slots are not required to invalidate unless the cluster is close to wraparound.
IMO the main purpose of max_slot_xid_age is to prevent XID wraparound.
For bloat, I still think max_slot_wal_keep_size is the better choice.
Where max_slot_xid_age is really useful is when vacuum can't
freeze because a replication slot (physical or logical) is holding
back the XID horizon and the system is getting close to wraparound.
Invalidating such a slot clears the way for vacuum. Setting
max_slot_xid_age above vacuum_failsafe_age allows vacuum to waste
cycles scanning tables it cannot freeze. Keeping max_slot_xid_age <=
vacuum_failsafe_age (default 1.6B) prevents this by invalidating the
slot before vacuum effort is wasted.
As far as XID wraparound is concerned, both xmin and catalog_xmin need
to be treated similarly. Either one can hold back freezing and push
the system toward wraparound. So I don't think we need separate GUCs
for xmin and catalog_xmin unless I'm missing something. One GUC
covering both keeps things simple.
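To illustrate the single-GUC idea, here is a rough sketch (not the patch code) of one threshold applied to whichever of the two horizons is older. The helper names slot_oldest_horizon and slot_exceeds_xid_age are made up for this example, and the age arithmetic ignores the special XIDs below FirstNormalTransactionId for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)

/* Age of xid relative to next_xid, using modulo-2^32 arithmetic. */
static int32_t
xid_age(TransactionId xid, TransactionId next_xid)
{
    return (int32_t) (next_xid - xid);
}

/*
 * Hypothetical helper: return the older of the slot's two horizons,
 * ignoring one that is not set. Returns InvalidTransactionId if
 * neither is set.
 */
TransactionId
slot_oldest_horizon(TransactionId xmin, TransactionId catalog_xmin,
                    TransactionId next_xid)
{
    if (xmin == InvalidTransactionId)
        return catalog_xmin;
    if (catalog_xmin == InvalidTransactionId)
        return xmin;

    /* The horizon with the larger age is the older one. */
    return (xid_age(xmin, next_xid) >= xid_age(catalog_xmin, next_xid))
        ? xmin : catalog_xmin;
}

/*
 * Hypothetical helper: one threshold covers both xmin and catalog_xmin,
 * since either one can hold back freezing.
 */
bool
slot_exceeds_xid_age(TransactionId xmin, TransactionId catalog_xmin,
                     TransactionId next_xid, int32_t max_slot_xid_age)
{
    TransactionId oldest = slot_oldest_horizon(xmin, catalog_xmin, next_xid);

    if (oldest == InvalidTransactionId)
        return false;

    return xid_age(oldest, next_xid) > max_slot_xid_age;
}
```

With this shape, a logical slot that only has catalog_xmin set and a physical slot that only has xmin set go through the same check, which is the simplification argued for above.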
>> I made the following design choice: try invalidating only once per
>> vacuum cycle, not per table. While this keeps the cost of checking
>> (incl. the XidGenLock contention) for invalidation to a minimum when
>> there are a large number of tables and replication slots, it can be
>> less effective when individual tables/indexes are large. Invalidating
>> during checkpoints can help to some extent with the large table/index
>> cases. But I'm open to thoughts on this.
>
> It may not solve the intent when the vacuum cycle is longer, which one can expect on a large database particularly
> when there is heavy bloat.
This design choice boils down to the following: a database instance
having either 1/ a large number of small tables or 2/ large tables.
From my experience, I have seen both cases but mostly case 2 (others
can correct me). In this context, having an XID age based slot
invalidation check once per relation makes sense. However, I'm open to
more thoughts here.
>> Please find the attached patch for further review. I fixed the XID age
>> calculation in ReplicationSlotIsXIDAged and adjusted the code
>> comments.
>
> I applied the patch and all the tests passed. A few comments:
Thank you for reviewing the patch.
> @@ -495,7 +525,7 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg
> MemoryContext vac_context, bool isTopLevel)
> {
> static bool in_vacuum = false;
> -
> + static bool first_time = true;
>
> first_time variable is not self explanatory, maybe something like try_replication_slot_invalidation and add comments
> that it will be set to false after the first check?
+1. Changed the variable name and simplified the surrounding comments.
> + if (TransactionIdIsValid(xmin))
> + appendStringInfo(&err_detail, _("The slot's xmin %u exceeds the maximum xid age %d specified by \"max_slot_xid_age\"."),
> + xmin,
> + max_slot_xid_age);
>
> Slot invalidates even when the age is max_slot_xid_age, isn't it?
Nice catch! I changed it to use TransactionIdPrecedes so that the
check matches the error message above, like two of the existing XID
age GUCs (autovacuum_freeze_max_age, vacuum_failsafe_age).
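For anyone following along, the boundary case can be sketched as below. This is a standalone illustration, not the code in the patch: slot_xid_aged and its strict flag are hypothetical, and xid_precedes / xid_precedes_or_equals only mimic the modulo-2^32 comparison that TransactionIdPrecedes and TransactionIdPrecedesOrEquals perform for normal XIDs.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId     ((TransactionId) 0)
#define FirstNormalTransactionId ((TransactionId) 3)

/* True if a is logically older than b (modulo-2^32, normal XIDs only). */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* True if a is logically older than or equal to b. */
static bool
xid_precedes_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) <= 0;
}

/*
 * Hypothetical stand-in for the slot age check. With strict = true the
 * slot counts as aged only when its age exceeds max_age; with
 * strict = false it already counts as aged when the age equals max_age.
 */
bool
slot_xid_aged(TransactionId slot_xmin, TransactionId next_xid,
              int32_t max_age, bool strict)
{
    TransactionId cutoff;

    if (slot_xmin == InvalidTransactionId)
        return false;

    /* Oldest XID still within the allowed age; clamp to a normal XID. */
    cutoff = next_xid - (TransactionId) max_age;
    if (cutoff < FirstNormalTransactionId)
        cutoff = FirstNormalTransactionId;

    return strict ? xid_precedes(slot_xmin, cutoff)
                  : xid_precedes_or_equals(slot_xmin, cutoff);
}
```

With next_xid = 1000 and max_age = 100, a slot xmin of 900 has an age of exactly 100: the strict variant leaves it alone, while the inclusive variant reports it as aged, which is the mismatch with the word "exceeds" pointed out above.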
Please find the attached v2 patch for further review. Thank you!
--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com