Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id CAA4eK1+mwRNc+B-Aeo4rHc5oofSagCo+AcJxL_hLu8V1pkD6Gg@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Önder Kalacı <onderkalaci@gmail.com>)
Responses Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Peter Smith <smithpb2250@gmail.com>)
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Önder Kalacı <onderkalaci@gmail.com>)
List pgsql-hackers
On Wed, Mar 8, 2023 at 8:44 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
>
>>
>> I felt that once you remove the create publication/subscription/wait
>> for sync steps, the test execution might become faster and save some
>> time in the local execution, cfbot and the various machines in
>> buildfarm. If the execution time will not reduce, then no need to
>> change.
>>
>
> So, as I noted earlier, there are different schemas. As far as I count, there are at least
> 7 different table definitions. I think all tables having the same name are maybe confusing?
>
> Even if I try to group the same table definitions, and avoid create publication/subscription/wait
> for sync steps, the total execution time of the test drops only ~5%. As far as I test, that does not
> seem to be the bottleneck for the tests.
>
> Well, I'm really not sure if it is really worth doing that. I think having each test independent of each
> other is really much easier to follow.
>

This new test takes ~9s on my machine whereas most other tests in
subscription/t take roughly 2-5s. I feel we should try to reduce the
test timing without sacrificing much of the functionality or code
coverage.  I think if possible we should try to reduce setup/teardown
cost for each separate test by combining them where possible. I have a
few comments on tests which also might help to optimize these tests.

1.
+# Testcase start: SUBSCRIPTION USES INDEX
+#
+# Basic test where the subscriber uses index
+# and only updates 1 row and deletes
+# 1 other row
...
...
+# Testcase start: SUBSCRIPTION USES INDEX UPDATEs MULTIPLE ROWS
+#
+# Basic test where the subscriber uses index
+# and updates 50 rows

...
+# Testcase start: SUBSCRIPTION USES INDEX WITH MULTIPLE COLUMNS
+#
+# Basic test where the subscriber uses index
+# and deletes 200 rows

I think to a good extent these tests overlap each other. I think we
can have just one test for the index with multiple columns that
updates multiple rows and have both updates and deletes.

2.
+# Testcase start: SUBSCRIPTION DOES NOT USE PARTIAL INDEX
...
...
+# Testcase start: SUBSCRIPTION DOES NOT USE INDEXES WITH ONLY EXPRESSIONS

Instead of having separate tests where we do all setups again, I think
it would be better if we create both the indexes in one test and show
that none of them is used.

3.
+# now, the update could either use the test_replica_id_full_idy or
test_replica_id_full_idy index
+# it is not possible for user to control which index to use

The name of the second index is wrong in the above comment.

4.
+# Testcase start: SUBSCRIPTION BEHAVIOR WITH ENABLE_INDEXSCAN

As we have removed enable_indexscan check, you should remove this test.

5. In general, the line length seems to vary a lot for different
multi-line comments. Though we are not strict in that it will look
better if there is consistency in that (let's have ~80 cols line
length for each comment in a single line).

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: proposal - get_extension_version function
Next
From: Peter Smith
Date:
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher