Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From Önder Kalacı
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id CACawEhXDxdFSC+FRLomS3u1=i4osSd8SrUj0xB1qvA63hbh+bQ@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (vignesh C <vignesh21@gmail.com>)
Responses Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Amit Kapila <amit.kapila16@gmail.com>)
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
Hi Vignesh,

Thanks for the review


1) We are currently calling RelationGetIndexList twice, once in
FindUsableIndexForReplicaIdentityFull function and in the caller too,
we could avoid one of the calls by passing the indexlist to the
function or removing the check here, index list check can be handled
in FindUsableIndexForReplicaIdentityFull.
+       if (remoterel->replident == REPLICA_IDENTITY_FULL &&
+               RelationGetIndexList(localrel) != NIL)
+       {
+               /*
+                * If we had a primary key or relation identity with a
unique index,
+                * we would have already found and returned that oid.
At this point,
+                * the remote relation has replica identity full and
we have at least
+                * one local index defined.
+                *
+                * We are looking for one more opportunity for using
an index. If
+                * there are any indexes defined on the local
relation, try to pick
+                * a suitable index.
+                *
+                * The index selection safely assumes that all the
columns are going
+                * to be available for the index scan given that
remote relation has
+                * replica identity full.
+                */
+               return FindUsableIndexForReplicaIdentityFull(localrel);
+       }
+

makes sense, done
 

2) Copyright year should be mentioned as 2023
diff --git a/src/test/subscription/t/032_subscribe_use_index.pl
b/src/test/subscription/t/032_subscribe_use_index.pl
new file mode 100644
index 0000000000..db0a7ea2a0
--- /dev/null
+++ b/src/test/subscription/t/032_subscribe_use_index.pl
@@ -0,0 +1,861 @@
+# Copyright (c) 2021-2022, PostgreSQL Global Development Group
+
+# Test logical replication behavior with subscriber uses available index
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+

I changed it to #Copyright (c) 2022-2023, but I'm not sure if it should be only 2023 or 
like this.
 

3) Many of the tests are using the same tables, we need not
drop/create publication/subscription for each of the team, we could
just drop and create required indexes and verify the update/delete
statements.
+# ====================================================================
+# Testcase start: SUBSCRIPTION USES INDEX
+#
+# Basic test where the subscriber uses index
+# and only updates 1 row and deletes
+# 1 other row
+#
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres',
+       "CREATE TABLE test_replica_id_full (x int)");
+$node_publisher->safe_psql('postgres',
+       "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;");
+$node_subscriber->safe_psql('postgres',
+       "CREATE TABLE test_replica_id_full (x int)");
+$node_subscriber->safe_psql('postgres',
+       "CREATE INDEX test_replica_id_full_idx ON test_replica_id_full(x)");

+# ====================================================================
+# Testcase start: SUBSCRIPTION CREATE/DROP INDEX WORKS WITHOUT ISSUES
+#
+# This test ensures that after CREATE INDEX, the subscriber can automatically
+# use one of the indexes (provided that it fulfils the requirements).
+# Similarly, after DROP index, the subscriber can automatically switch to
+# sequential scan
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres',
+       "CREATE TABLE test_replica_id_full (x int NOT NULL, y int)");
+$node_publisher->safe_psql('postgres',
+       "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;");
+$node_subscriber->safe_psql('postgres',
+       "CREATE TABLE test_replica_id_full (x int NOT NULL, y int)");

Well, not all the tables are exactly the same, there are 4-5 different
tables. Mostly the table names are the same.

Plus, the overhead does not seem to be large enough to complicate
the test. Many of the src/test/subscription/t files follow this pattern. 

Do you have strong opinions on changing this? 


4) These additional blank lines can be removed to keep it consistent:
4.a)
+# Testcase end: SUBSCRIPTION DOES NOT USE PARTIAL INDEX
+# ====================================================================
+
+
+# ====================================================================
+# Testcase start: SUBSCRIPTION DOES NOT USE INDEXES WITH ONLY EXPRESSIONS

4.b)
+# Testcase end: Unique index that is not primary key or replica identity
+# ====================================================================
+
+
+
+# ====================================================================
+# Testcase start: SUBSCRIPTION BEHAVIOR WITH ENABLE_INDEXSCAN
 
Thanks, fixed.

Attached v30

Attachment

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Allow tailoring of ICU locales with custom rules
Next
From: Bharath Rupireddy
Date:
Subject: Re: Improve WALRead() to suck data directly from WAL buffers when possible