RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From shiy.fnst@fujitsu.com
Subject RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id OSZPR01MB631004289955C167B110F674FDDD9@OSZPR01MB6310.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Önder Kalacı <onderkalaci@gmail.com>)
Responses RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
List pgsql-hackers
On Thu, Feb 2, 2023 4:34 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
>
>>
>> and if there's more
>> than one candidate index pick any one. Additionally, we can allow
>> disabling the use of an index scan for this particular case. If we are
>> too worried about API change for allowing users to specify the index
>> then we can do that later or as a separate patch.
>>
>
> On v23, I dropped the planner support for picking the index. Instead, it simply
> iterates over the indexes and picks the first one that is suitable.
>
> I'm currently thinking on how to enable users to override this decision.
> One option I'm leaning towards is to add a syntax like the following:
>
> ALTER SUBSCRIPTION .. ALTER TABLE ... SET INDEX ...
>
> Though, that should probably be a seperate patch. I'm going to work
> on that, but still wanted to share v23 given picking the index sounds
> complementary, not strictly required at this point.
>

Thanks for your patch. Here are some comments.

1.
I noticed that get_usable_indexoid() is called in apply_handle_update_internal()
and apply_handle_delete_internal() to get the usable index. Could usableIndexOid
be a parameter of these two functions? Because we have got the
LogicalRepRelMapEntry when calling them and if we do so, we can get
usableIndexOid without get_usable_indexoid(). Otherwise for partitioned tables,
logicalrep_partition_open() is called in get_usable_indexoid() and searching
the entry via hash_search() will increase cost.

2.
+             * This attribute is an expression, and
+             * SuitableIndexPathsForRepIdentFull() was called earlier when the
+             * index for subscriber was selected. There, the indexes
+             * comprising *only* expressions have already been eliminated.

The comment looks need to be updated:
SuitableIndexPathsForRepIdentFull
->
FindUsableIndexForReplicaIdentityFull

3.

     /* Build scankey for every attribute in the index. */
-    for (attoff = 0; attoff < IndexRelationGetNumberOfKeyAttributes(idxrel); attoff++)
+    for (index_attoff = 0; index_attoff < IndexRelationGetNumberOfKeyAttributes(idxrel);
+         index_attoff++)
     {

Should the comment be changed? Because we skip the attributes that are expressions.

4.
+            Assert(RelationGetReplicaIndex(rel) != RelationGetRelid(idxrel) &&
+                   RelationGetPrimaryKeyIndex(rel) != RelationGetRelid(idxrel));

Maybe we can call the new function idxIsRelationIdentityOrPK()?

Regards,
Shi Yu

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Helper functions for wait_for_catchup() in Cluster.pm
Next
From: David Rowley
Date:
Subject: Re: Can we do something to help stop users mistakenly using force_parallel_mode?