Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From vignesh C
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id CALDaNm1uAuO2RnER+5GGR4wTUo3U2+ZvS1bfnTndmGmQzUyszg@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Önder Kalacı <onderkalaci@gmail.com>)
Responses Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Önder Kalacı <onderkalaci@gmail.com>)
List pgsql-hackers
On Fri, 3 Mar 2023 at 18:40, Önder Kalacı <onderkalaci@gmail.com> wrote:
>
> Hi Vignesh,
>
> Thanks for the review
>
>>
>> 1) We are currently calling RelationGetIndexList twice, once in
>> FindUsableIndexForReplicaIdentityFull function and in the caller too,
>> we could avoid one of the calls by passing the indexlist to the
>> function or removing the check here, index list check can be handled
>> in FindUsableIndexForReplicaIdentityFull.
>> +       if (remoterel->replident == REPLICA_IDENTITY_FULL &&
>> +               RelationGetIndexList(localrel) != NIL)
>> +       {
>> +               /*
>> +                * If we had a primary key or relation identity with a
>> unique index,
>> +                * we would have already found and returned that oid.
>> At this point,
>> +                * the remote relation has replica identity full and
>> we have at least
>> +                * one local index defined.
>> +                *
>> +                * We are looking for one more opportunity for using
>> an index. If
>> +                * there are any indexes defined on the local
>> relation, try to pick
>> +                * a suitable index.
>> +                *
>> +                * The index selection safely assumes that all the
>> columns are going
>> +                * to be available for the index scan given that
>> remote relation has
>> +                * replica identity full.
>> +                */
>> +               return FindUsableIndexForReplicaIdentityFull(localrel);
>> +       }
>> +
>
>
> makes sense, done
>
>>
>>
>> 2) Copyright year should be mentioned as 2023
>> diff --git a/src/test/subscription/t/032_subscribe_use_index.pl
>> b/src/test/subscription/t/032_subscribe_use_index.pl
>> new file mode 100644
>> index 0000000000..db0a7ea2a0
>> --- /dev/null
>> +++ b/src/test/subscription/t/032_subscribe_use_index.pl
>> @@ -0,0 +1,861 @@
>> +# Copyright (c) 2021-2022, PostgreSQL Global Development Group
>> +
>> +# Test logical replication behavior with subscriber uses available index
>> +use strict;
>> +use warnings;
>> +use PostgreSQL::Test::Cluster;
>> +use PostgreSQL::Test::Utils;
>> +use Test::More;
>> +
>
>
> I changed it to #Copyright (c) 2022-2023, but I'm not sure if it should be only 2023 or
> like this.
>
>>
>>
>> 3) Many of the tests are using the same tables, we need not
>> drop/create publication/subscription for each of the team, we could
>> just drop and create required indexes and verify the update/delete
>> statements.
>> +# ====================================================================
>> +# Testcase start: SUBSCRIPTION USES INDEX
>> +#
>> +# Basic test where the subscriber uses index
>> +# and only updates 1 row and deletes
>> +# 1 other row
>> +#
>> +
>> +# create tables pub and sub
>> +$node_publisher->safe_psql('postgres',
>> +       "CREATE TABLE test_replica_id_full (x int)");
>> +$node_publisher->safe_psql('postgres',
>> +       "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;");
>> +$node_subscriber->safe_psql('postgres',
>> +       "CREATE TABLE test_replica_id_full (x int)");
>> +$node_subscriber->safe_psql('postgres',
>> +       "CREATE INDEX test_replica_id_full_idx ON test_replica_id_full(x)");
>>
>> +# ====================================================================
>> +# Testcase start: SUBSCRIPTION CREATE/DROP INDEX WORKS WITHOUT ISSUES
>> +#
>> +# This test ensures that after CREATE INDEX, the subscriber can automatically
>> +# use one of the indexes (provided that it fulfils the requirements).
>> +# Similarly, after DROP index, the subscriber can automatically switch to
>> +# sequential scan
>> +
>> +# create tables pub and sub
>> +$node_publisher->safe_psql('postgres',
>> +       "CREATE TABLE test_replica_id_full (x int NOT NULL, y int)");
>> +$node_publisher->safe_psql('postgres',
>> +       "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;");
>> +$node_subscriber->safe_psql('postgres',
>> +       "CREATE TABLE test_replica_id_full (x int NOT NULL, y int)");
>
>
> Well, not all the tables are exactly the same, there are 4-5 different
> tables. Mostly the table names are the same.
>
> Plus, the overhead does not seem to be large enough to complicate
> the test. Many of the src/test/subscription/t files follow this pattern.
>
> Do you have strong opinions on changing this?

I felt that once you remove the create publication/subscription/wait
for sync steps, the test execution might become faster and save some
time in the local execution, cfbot and the various machines in
buildfarm. If the execution time will not reduce, then no need to
change.

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: Melih Mutlu
Date:
Subject: Re: Allow logical replication to copy tables in binary format
Next
From: vignesh C
Date:
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher