RE: Introduce XID age and inactive timeout based replication slot invalidation - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: Introduce XID age and inactive timeout based replication slot invalidation
Date
Msg-id OS0PR01MB5716474D71D8A637B9BB082A94F02@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Introduce XID age and inactive timeout based replication slot invalidation  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Friday, February 7, 2025 9:06 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> 
> Attached v72 patches, addressed the above comments as well as Vignesh's
> comments in [2].
>  - There are no new changes in patch-002.

Thanks for updating the patch, I have few review comments:

1.
> InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,

I think the type of first parameter 'cause' is not appropriate anymore since
it's now a bitmap flag instead of an enum.

2.
> -StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
> -                 "array length mismatch");
> +#define    RS_INVAL_MAX_CAUSES (sizeof(InvalidationCauses) / sizeof(InvalidationCauses[0]))

I'd like to confirm if the current value of the RS_INVAL_MAX_CAUSES is correct.
Previously, the value is arrary_length - 1, while now it seems equal to the
arrary_length.

And ISTM we could directly call lengthof() here.

3.

+            if (cause & RS_INVAL_HORIZON)
+            {
+                if (!SlotIsLogical(s))
+                    goto invalidation_marked;

I am not sure if this logic is correct. Even if the slot would not be
invalidated due to RS_INVAL_HORIZON, we should continue to check other causes.

Besides, instead of using a goto, I personally prefer to move all these codes
into a separate function which would return a single invalidation cause.

4.
-    if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+    if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
                                            _logSegNo, InvalidOid,
                                            InvalidTransactionId))

I think this change could trigger an unnecessary WAL position re-calculation when
slots are invalidated only due to RS_INVAL_IDLE_TIMEOUT. But since it would not be
a frequent operation so I am OK to leave it unless we have better ideas.

Best Regards,
Hou zj

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Question about UpdateFullPageWrites() at the end of recovery.
Next
From: Julien Rouhaud
Date:
Subject: Re: proposal - plpgsql - support standard syntax for named arguments for cursors