Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From Önder Kalacı
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id CACawEhXgSkPQoivkVSb8LYTr+e8j+t9rO2V_tPWH5K-e7zdrBw@mail.gmail.com
Whole thread Raw
In response to RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  ("kuroda.hayato@fujitsu.com" <kuroda.hayato@fujitsu.com>)
Responses RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
List pgsql-hackers
Hi Hayato Kuroda, 

Thanks for the review, please see my reply below:


===
For execRelation.c

01. RelationFindReplTupleByIndex()

```
        /* Start an index scan. */
        InitDirtySnapshot(snap);
-       scan = index_beginscan(rel, idxrel, &snap,
-                                                  IndexRelationGetNumberOfKeyAttributes(idxrel),
-                                                  0);

        /* Build scan key. */
-       build_replindex_scan_key(skey, rel, idxrel, searchslot);
+       scankey_attoff = build_replindex_scan_key(skey, rel, idxrel, searchslot);

+       scan = index_beginscan(rel, idxrel, &snap, scankey_attoff, 0);
```

I think "/* Start an index scan. */" should be just above index_beginscan().

moved there
 

===
For worker.c

02. sable_indexoid_internal()

```
+ * Note that if the corresponding relmapentry has InvalidOid usableIndexOid,
+ * the function returns InvalidOid.
+ */
+static Oid
+usable_indexoid_internal(ApplyExecutionData *edata, ResultRelInfo *relinfo)
```

"InvalidOid usableIndexOid" should be "invalid usableIndexOid,"

makes sense, updated
 

03. check_relation_updatable()

```
         * We are in error mode so it's fine this is somewhat slow. It's better to
         * give user correct error.
         */
-       if (OidIsValid(GetRelationIdentityOrPK(rel->localrel)))
+       if (OidIsValid(rel->usableIndexOid))
        {
```

Shouldn't we change the above comment to? The check is no longer slow.

Hmm, I couldn't realize this comment earlier. So you suggest "slow" here refers to the additional function call "GetRelationIdentityOrPK"? If so, yes I'll update that.
 

===
For relation.c

04. GetCheapestReplicaIdentityFullPath()

```
+static Path *
+GetCheapestReplicaIdentityFullPath(Relation localrel)
+{
+       PlannerInfo *root;
+       Query      *query;
+       PlannerGlobal *glob;
+       RangeTblEntry *rte;
+       RelOptInfo *rel;
+       int     attno;
+       RangeTblRef *rt;
+       List *joinList;
+       Path *seqScanPath;
```

I think the part that constructs dummy-planner state can be move to another function
because that part is not meaningful for this.
Especially line 824-846 can.


Makes sense, simplified the function. Though, it is always hard to pick good names for these kinds of helper functions. I picked GenerateDummySelectPlannerInfoForRelation(), does that sound good to you as well? 
 

===
For 032_subscribe_use_index.pl

05. general

```
+# insert some initial data within the range 0-1000
+$node_publisher->safe_psql('postgres',
+       "INSERT INTO test_replica_id_full SELECT i%20 FROM generate_series(0,1000)i;"
+);
```

It seems that the range of initial data seems [0, 19].
Same mistake-candidates are found many place.

Ah, several copy & paste errors. Fixed (hopefully) all.
 

06. general

```
+# updates 1000 rows
+$node_publisher->safe_psql('postgres',
+       "UPDATE test_replica_id_full SET x = x + 1 WHERE x = 15;");
```

Only 50 tuples are modified here.
Same mistake-candidates are found many place.

Alright, yes there were several wrong comments in the tests. I went over the tests once more to fix those and improve comments.
 

07. general

```
+# we check if the index is used or not
+$node_subscriber->poll_query_until(
+       'postgres', q{select (idx_scan = 200) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+) or die "Timed out while waiting for check subscriber tap_sub_rep_full_3 updates 200 rows via index";
```
The query will be executed until the index scan is finished, but it may be not commented.
How about changing it to "we wait until the index used on the subscriber-side." or something?
Same comments are found in many place.

Makes sense, updated
 

08. test related with ANALYZE

```
+# Testcase start: SUBSCRIPTION CAN UPDATE THE INDEX IT USES AFTER ANALYZE - PARTITIONED TABLE
+# ====================================================================
```

"Testcase start:" should be "Testcase end:" here.

thanks, fixed
 

09. general

In some tests results are confirmed but in other test they are not.
I think you can make sure results are expected in any case if there are no particular reasons.


Alright, yes I also don't see a reason not to do that. Added to all cases.


I'll attach the patch with the next email as I also want to incorporate the other comments. Hope this is not going to be confusing.

Thanks,
Onder

pgsql-hackers by date:

Previous
From: Matthias van de Meent
Date:
Subject: Re: Pruning never visible changes
Next
From: Önder Kalacı
Date:
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher