RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From shiy.fnst@fujitsu.com
Subject RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id OSZPR01MB6310C166D4E56B521C61984AFD709@OSZPR01MB6310.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Önder Kalacı <onderkalaci@gmail.com>)
Responses Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
List pgsql-hackers
On Sat, Aug 20, 2022 7:02 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
> Hi,
> 
> I'm a little late to catch up with your comments, but here are my replies:

Thanks for your patch. Here are some comments.

1.
In FilterOutNotSuitablePathsForReplIdentFull(), is "nonPartialIndexPathList" a
good name for the list? Indexes on only expressions are also be filtered.

+static List *
+FilterOutNotSuitablePathsForReplIdentFull(List *pathlist)
+{
+    ListCell   *lc;
+    List *nonPartialIndexPathList = NIL;

2.
+typedef struct LogicalRepPartMapEntry
+{
+    Oid            partoid;        /* LogicalRepPartMap's key */
+    LogicalRepRelMapEntry relmapentry;
+    Oid            usableIndexOid; /* which index to use? (Invalid when no index
+                                 * used) */
+} LogicalRepPartMapEntry;

For partition tables, is it possible to use relmapentry->usableIndexOid to mark
which index to use? Which means we don't need to add "usableIndexOid" to
LogicalRepPartMapEntry.

3.
It looks we should change the comment for FindReplTupleInLocalRel() in this
patch.

/*
 * Try to find a tuple received from the publication side (in 'remoteslot') in
 * the corresponding local relation using either replica identity index,
 * primary key or if needed, sequential scan.
 *
 * Local tuple, if found, is returned in '*localslot'.
 */
static bool
FindReplTupleInLocalRel(EState *estate, Relation localrel,

4.
@@ -2030,16 +2017,19 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
 {
     EState       *estate = edata->estate;
     Relation    localrel = relinfo->ri_RelationDesc;
-    LogicalRepRelation *remoterel = &edata->targetRel->remoterel;
+    LogicalRepRelMapEntry *targetRel = edata->targetRel;
+    LogicalRepRelation *remoterel = &targetRel->remoterel;
     EPQState    epqstate;
     TupleTableSlot *localslot;

Do we need this change? I didn't see any place to use the variable targetRel
afterwards.

5.
+        if (!AttributeNumberIsValid(mainattno))
+        {
+            /*
+             * There are two cases to consider. First, if the index is a primary or
+             * unique key, we cannot have any indexes with expressions. So, at this
+             * point we are sure that the index we deal is not these.
+             */
+            Assert(RelationGetReplicaIndex(rel) != RelationGetRelid(idxrel) &&
+                   RelationGetPrimaryKeyIndex(rel) != RelationGetRelid(idxrel));
+
+            /*
+             * For a non-primary/unique index with an expression, we are sure that
+             * the expression cannot be used for replication index search. The
+             * reason is that we create relevant index paths by providing column
+             * equalities. And, the planner does not pick expression indexes via
+             * column equality restrictions in the query.
+             */
+            continue;
+        }

Is it possible that it is a usable index with an expression? I think indexes
with an expression has been filtered in 
FilterOutNotSuitablePathsForReplIdentFull(). If it can't be a usable index with
an expression, maybe we shouldn't use "continue" here.

6.
In the following case, I got a result which is different from HEAD, could you
please look into it?

-- publisher
CREATE TABLE test_replica_id_full (x int); 
ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL; 
CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full;

-- subscriber
CREATE TABLE test_replica_id_full (x int, y int); 
CREATE INDEX test_replica_id_full_idx ON test_replica_id_full(x,y); 
CREATE SUBSCRIPTION tap_sub_rep_full_0 CONNECTION 'dbname=postgres port=5432' PUBLICATION tap_pub_rep_full;

-- publisher
INSERT INTO test_replica_id_full VALUES (1);
UPDATE test_replica_id_full SET x = x + 1 WHERE x = 1;

The data in subscriber:
on HEAD:
postgres=# select * from test_replica_id_full ;
 x | y
---+---
 2 |
(1 row)

After applying the patch:
postgres=# select * from test_replica_id_full ;
 x | y
---+---
 1 |
(1 row)

Regards,
Shi yu

pgsql-hackers by date:

Previous
From: Dong Wook Lee
Date:
Subject: pg_basebackup: add test about zstd compress option
Next
From: Thomas Munro
Date:
Subject: Re: sockaddr_un.sun_len vs. reality