RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From kuroda.hayato@fujitsu.com
Subject RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id TYAPR01MB58666F55F422D8E89B33D453F5239@TYAPR01MB5866.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Önder Kalacı <onderkalaci@gmail.com>)
Responses Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Önder Kalacı <onderkalaci@gmail.com>)
List pgsql-hackers
Dear Önder,

Thanks for updating the patch! I checked yours and almost good.
Followings are just cosmetic comments.

===
01. relation.c - GetCheapestReplicaIdentityFullPath

```
     * The reason that the planner would not pick partial indexes and indexes
     * with only expressions based on the way currently baserestrictinfos are
     * formed (e.g., col_1 = $1 ... AND col_N = $2).
```

Is "col_N = $2" a typo? I think it should be "col_N = $N" or "attr1 = $1 ... AND attrN = $N".

===
02. 032_subscribe_use_index.pl

If a table has a primary key on the subscriber, it will be used even if enable_indexscan is false(legacy behavior).
Should we test it?

~~~
03. 032_subscribe_use_index.pl -  SUBSCRIPTION RE-CALCULATES INDEX AFTER CREATE/DROP INDEX

I think this test seems to be not trivial, so could you write down the motivation?

~~~
04. 032_subscribe_use_index.pl -  SUBSCRIPTION RE-CALCULATES INDEX AFTER CREATE/DROP INDEX

```
# wait until the index is created
$node_subscriber->poll_query_until(
    'postgres', q{select count(*)=1 from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
) or die "Timed out while waiting for check subscriber tap_sub_rep_full_0 updates one row via index";
```

CREATE INDEX is a synchronous behavior, right? If so we don't have to wait here.
...And the comment in case of die may be wrong.
(There are some cases like this)

~~~
05. 032_subscribe_use_index.pl - SUBSCRIPTION USES INDEX UPDATEs MULTIPLE ROWS

```
# Testcase start: SUBSCRIPTION USES INDEX UPDATEs MULTIPLE ROWS
#
# Basic test where the subscriber uses index
# and touches 50 rows with UPDATE
```

"touches 50 rows with UPDATE" -> "updates 50 rows", per other tests.

~~~
06. 032_subscribe_use_index.pl - SUBSCRIPTION CAN UPDATE THE INDEX IT USES AFTER ANALYZE

I think this test seems to be not trivial, so could you write down the motivation?
(Same as Re-calclate)

~~~
07. 032_subscribe_use_index.pl - SUBSCRIPTION CAN UPDATE THE INDEX IT USES AFTER ANALYZE

```
# show that index_b is not used
$node_subscriber->poll_query_until(
    'postgres', q{select idx_scan=0 from pg_stat_all_indexes where indexrelname = 'index_b';}
) or die "Timed out while waiting for check subscriber tap_sub_rep_full updates two rows via index scan with index on
highcardinality column-2";
 
```

I think we don't have to wait here, is() should be used instead. 
poll_query_until() should be used only when idx_scan>0 is checked.
(There are some cases like this)

~~~
08. 032_subscribe_use_index.pl - SUBSCRIPTION USES INDEX ON PARTITIONED TABLES

```
# make sure that the subscriber has the correct data
$node_subscriber->poll_query_until(
    'postgres', q{select sum(user_id+value_1+value_2)=550070 AND count(DISTINCT(user_id,value_1, value_2))=981 from
users_table_part;}
) or die "ensure subscriber has the correct data at the end of the test";
```

I think we can replace it to wait_for_catchup() and is()...
Moreover, we don't have to wait here because in above line we wait until the index is used on the subscriber.
(There are some cases like this)


Best Regards,
Hayato Kuroda
FUJITSU LIMITED


pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: create subscription - improved warning message
Next
From: Ankit Oza
Date:
Subject: PostgreSQL Logical decoding