Re: Introduce wait_for_subscription_sync for TAP tests - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Introduce wait_for_subscription_sync for TAP tests
Date
Msg-id CAA4eK1Jg9KQXp065efTkW1w-m9hEexMUTVf-7AHwdusTWVAncw@mail.gmail.com
Whole thread Raw
In response to Introduce wait_for_subscription_sync for TAP tests  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Introduce wait_for_subscription_sync for TAP tests
List pgsql-hackers
On Tue, Jul 26, 2022 at 7:07 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi,
>
> In tap tests for logical replication, we have the following code in many places:
>
> $node_publisher->wait_for_catchup('tap_sub');
> my $synced_query =
>   "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT
> IN ('r', 's');";
> $node_subscriber->poll_query_until('postgres', $synced_query)
>   or die "Timed out while waiting for subscriber to synchronize data";
>
> Also, we sometime forgot to check either one, like we fixed in commit
> 1f50918a6fb02207d151e7cb4aae4c36de9d827c.
>
> I think we can have a new function to wait for all subscriptions to
> synchronize data. The attached patch introduce a new function
> wait_for_subscription_sync(). With this function, we can replace the
> above code with this one function as follows:
>
> $node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub');
>

+1. This reduces quite some code in various tests and will make it
easier to write future tests.

Few comments/questions:
====================
1.
-$node_publisher->wait_for_catchup('mysub1');
-
-# Also wait for initial table sync to finish.
-my $synced_query =
-  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT
IN ('r', 's');";
-$node_subscriber->poll_query_until('postgres', $synced_query)
-  or die "Timed out while waiting for subscriber to synchronize data";
-
 # Also wait for initial table sync to finish.
-$node_subscriber->poll_query_until('postgres', $synced_query)
-  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'mysub1');

It seems to me without your patch there is an extra poll in the above
test. If so, we can probably remove that in a separate patch?

2.
+    # wait for the replication to catchup if required.
+    if (defined($publisher))
+    {
+ croak 'subscription name must be specified' unless defined($subname);
+ $publisher->wait_for_catchup($subname, 'replay');
+    }
+
+    # then, wait for all table states to be ready.
+    print "Waiting for all subscriptions in \"$name\" to synchronize data\n";
+    my $query = qq[SELECT count(1) = 0
+     FROM pg_subscription_rel
+     WHERE srsubstate NOT IN ('r', 's');];
+    $self->poll_query_until($dbname, $query)
+ or croak "timed out waiting for subscriber to synchronize data";

In the tests, I noticed that a few places did wait_for_catchup after
the subscription check, and at other places, we did that check before
as you have it here. Ideally, I think wait_for_catchup should be after
confirming the initial sync is over as without initial sync, the
publisher node won't be completely in sync with the subscriber. What
do you think?

3. In the code quoted in the previous point, why did you pass the
second parameter as 'replay' when we have not used that in the tests
otherwise?

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: fairywren hung in pg_basebackup tests
Next
From: Julien Rouhaud
Date:
Subject: Re: Allow file inclusion in pg_hba and pg_ident files