Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From Önder Kalacı
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id CACawEhVujhCCsrDhNHPLfHr7tWzkj-Ar8q41+uQ+4DFjJKknYw@mail.gmail.com
Whole thread Raw
In response to RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  ("kuroda.hayato@fujitsu.com" <kuroda.hayato@fujitsu.com>)
Responses RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
List pgsql-hackers
Hi Hayato Kuroda,

Thanks for the review!


====
01 relation.c - general

Many files are newly included.
I was not sure but some codes related with planner may be able to move to src/backend/optimizer/plan.
How do you and any other one think?


My thinking on those functions is that they should probably stay in src/backend/replication/logical/relation.c. My main motivation is that those functions are so much tailored to the purposes of this file that I cannot see any use-case for these functions in any other context.

Still, at some point, I considered maybe doing something similar to src/backend/executor/execReplication.c, where I create a new file, say, src/backend/optimizer/plan/planReplication.c or such as you noted. I'm a bit torn on this. 

Does anyone have any strong opinions for moving to src/backend/optimizer/plan/planReplication.c? (or another name)
 
02 relation.c - FindLogicalRepUsableIndex

```
+/*
+ * Returns an index oid if we can use an index for the apply side. If not,
+ * returns InvalidOid.
+ */
+static Oid
+FindLogicalRepUsableIndex(Relation localrel, LogicalRepRelation *remoterel)
```

I grepped files, but I cannot find the word "apply side". How about "subscriber" instead?

Yes, it makes sense. I guess I made up the "apply side" as there is the concept of "apply worker". But, yes, subscribers sound better, updated.
 

03 relation.c - FindLogicalRepUsableIndex

```
+       /* Simple case, we already have an identity or pkey */
+       idxoid = GetRelationIdentityOrPK(localrel);
+       if (OidIsValid(idxoid))
+               return idxoid;
+
+       /*
+        * If index scans are disabled, use a sequential scan.
+        *
+        * Note that we still allowed index scans above when there is a primary
+        * key or unique index replica identity, but that is the legacy behaviour
+        * (even when enable_indexscan is false), so we hesitate to move this
+        * enable_indexscan check to be done earlier in this function.
+        */
+       if (!enable_indexscan)
+               return InvalidOid;
```

a.
I think "identity or pkey" should be "replica identity key or primary key" or "RI or PK"

Looking into other places, it seems "replica identity index" is favored over "replica identity key". So, I used that term.

You can see this pattern in RelationGetReplicaIndex()
 

b.
Later part should be at around GetRelationIdentityOrPK.

Hmm, I cannot follow this comment. Can you please clarify?
 


04 relation.c - FindUsableIndexForReplicaIdentityFull

```
+       MemoryContext usableIndexContext;
...
+       usableIndexContext = AllocSetContextCreate(CurrentMemoryContext,
+                                                                                          "usableIndexContext",
+                                                                                          ALLOCSET_DEFAULT_SIZES);
```

I grepped other sources, and I found that the name like "tmpcxt" is used for the temporary MemoryContext.

I think there are also several contextes that are named more specifically, such as new_pdcxt, perTupCxt, anl_context, cluster_context and many others.

So, I think it is better to have specific names, no?
 

05 relation.c - SuitablePathsForRepIdentFull

```
+                       indexRelation = index_open(idxoid, AccessShareLock);
+                       indexInfo = BuildIndexInfo(indexRelation);
+                       is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
+                       is_partial = (indexInfo->ii_Predicate != NIL);
+                       is_only_on_expression = IndexOnlyOnExpression(indexInfo);
+                       index_close(indexRelation, NoLock);
```

Why the index is closed with NoLock? AccessShareLock is acquired, so shouldn't same lock be released?

Hmm, yes you are right. Keeping the lock seems unnecessary and wrong. It could actually have prevented dropping an index. However, given that RelationFindReplTupleByIndex() also closes this index at the end, the apply worker releases the lock. Hence, no problem observed.

Anyway, I'm still changing it to releasing the lock.

Also note that as soon as any index is dropped on the relation, the cache is invalidated and suitable indexes are re-calculated. That's why it seems fine to release the lock. 
 


06 relation.c - GetCheapestReplicaIdentityFullPath

IIUC a query like "SELECT tbl WHERE attr1 = $1 AND attr2 = $2 ... AND attrN = $N" is emulated, right?
you can write explicitly it as comment


The inlined comment in the function has a similar comment. Is that clear enough?

/* * Generate restrictions for all columns in the form of col_1 = $1 AND * col_2 = $2 ... */
 
07 relation.c - GetCheapestReplicaIdentityFullPath

```
+               Path       *path = (Path *) lfirst(lc);
+               Oid                     idxoid = GetIndexOidFromPath(path);
+
+               if (!OidIsValid(idxoid))
+               {
+                       /* Unrelated Path, skip */
+                       suitableIndexList = lappend(suitableIndexList, path);
+               }
```

I was not clear about here. IIUC in the function we want to extract "good" scan plan and based on that the cheapest one is chosen.
GetIndexOidFromPath() seems to return InvalidOid when the input path is not index scan, so why is it appended to the suitable list?


It could be a sequential scan that we have fall-back. However, we already add the sequential scan at the end of the function. So, actually you are right, there is no need to keep any other paths here. Adjusted the comments.
 

===
08 worker.c - usable_indexoid_internal

I think this is not "internal" function, such name should be used for like "apply_handle_commit" - "apply_handle_commit_internal", or "apply_handle_insert" - "apply_handle_insert_internal".
How about "get_usable_index" or something?

Yeah, you are right. I use this function inside functions ending with _internal, but this one is clearly not an internal function. I used get_usable_indexoid().
 

09 worker.c - usable_indexoid_internal

```
+       Oid                     targetrelid = targetResultRelInfo->ri_RelationDesc->rd_rel->oid;
+       Oid                     localrelid = relinfo->ri_RelationDesc->rd_id;
+
+       if (targetrelid != localrelid)
```

I think these lines are very confusable.
IIUC targetrelid is corresponded to the "parent", and localrelid is corresponded to the "child", right?
How about changing name to "partitionedoid" and "leadoid" or something?

We do not know whether targetrelid is definitely a "parent". But, if that is a parent, this function fetches the relevant partition's usableIndexOid. So, I'm not convinced that "parent" is a good choice.

Though, I agree that we can improve the code a bit. I now use targetrelkind and dropped localrelid to check whether the target is a partitioned table. Is this better?
 
 

===
10 032_subscribe_use_index.pl

```
# create tables pub and sub
$node_publisher->safe_psql('postgres',
        "CREATE TABLE test_replica_id_full (x int)");
$node_publisher->safe_psql('postgres',
        "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;");
$node_subscriber->safe_psql('postgres',
        "CREATE TABLE test_replica_id_full (x int)");
$node_subscriber->safe_psql('postgres',
        "CREATE INDEX test_replica_id_full_idx ON test_replica_id_full(x)");
```

In many places same table is defined, altered as "REPLICA IDENTITY FULL", and index is created.
Could you combine them into function?

Well, I'm not sure if it is worth the complexity. There are only 4 usages of the same table, and these are all pretty simple statements, and all other tests seem to have a similar pattern. I have not seen any tests where these simple statements are done in a function even if they are repeated. I'd rather keep it so that this doesn't lead to other style discussions?
 

11 032_subscribe_use_index.pl

```
# wait until the index is used on the subscriber
$node_subscriber->poll_query_until(
        'postgres', q{select (idx_scan = 1) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
) or die "Timed out while waiting for check subscriber tap_sub_rep_full_0 updates one row via index";
```

In many places this check is done. Could you combine them into function?

I'm a little confused. Isn't that already inside a function (e.g., poll_query_until) ? Can you please clarify this suggestion a bit more? 
 

12 032_subscribe_use_index.pl

```
# create pub/sub
$node_publisher->safe_psql('postgres',
        "CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full");
$node_subscriber->safe_psql('postgres',
        "CREATE SUBSCRIPTION tap_sub_rep_full CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_rep_full"
);
```

Same as above

Well, again, I'm not sure if it is worth moving these simple statements to functions as an improvement here. One might tell that it is better to see the statements explicitly on the test -- which almost all the tests do. I want to avoid introducing some unusual pattern to the tests.
 

13 032_subscribe_use_index.pl

```
# cleanup pub
$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_rep_full");
$node_publisher->safe_psql('postgres', "DROP TABLE test_replica_id_full");
# cleanup sub
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_rep_full");
$node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
```

Same as above

Same as above :) 
 

14 032_subscribe_use_index.pl - SUBSCRIPTION USES INDEX

```
# make sure that the subscriber has the correct data
my $result = $node_subscriber->safe_psql('postgres',
        "SELECT sum(x) FROM test_replica_id_full");
is($result, qq(212), 'ensure subscriber has the correct data at the end of the test');

$node_subscriber->poll_query_until(
        'postgres', q{select sum(x)=212 AND count(*)=21 AND count(DISTINCT x)=20 FROM test_replica_id_full;}
) or die "ensure subscriber has the correct data at the end of the test";
```

I think first one is not needed.

I preferred to keep the second one because is($result, .. is needed for tests to show the progress while running.
 


15 032_subscribe_use_index.pl - SUBSCRIPTION USES INDEX UPDATEs MULTIPLE ROWS

```
# insert some initial data within the range 0-20
$node_publisher->safe_psql('postgres',
        "INSERT INTO test_replica_id_full SELECT i%20 FROM generate_series(0,1000)i;"
);
```

I think data is within the range 0-19.
(There are some mistakes)

Yes, I fixed it all.

 

===
16 test/subscription/meson.build

Your test 't/032_subscribe_use_index.pl' must be added in the 'tests' for meson build system.
(I checked on my env, and your test works well)

 
Oh, I didn't know about this, thanks!

Attached v14.

Thanks,
Onder
 
Attachment

pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: Avoid memory leaks during base backups
Next
From: Önder Kalacı
Date:
Subject: Re: A potential memory leak on Merge Join when Sort node is not below Materialize node