Re: Time delayed LR (WAS Re: logical replication restrictions) - Mailing list pgsql-hackers

From vignesh C
Subject Re: Time delayed LR (WAS Re: logical replication restrictions)
Date
Msg-id CALDaNm2d=+WBGye015DoACY_PzrRKdmwjeZwDa+Y9LXFqnT0Ow@mail.gmail.com
Whole thread Raw
In response to RE: Time delayed LR (WAS Re: logical replication restrictions)  ("Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com>)
Responses RE: Time delayed LR (WAS Re: logical replication restrictions)
List pgsql-hackers
On Thu, 12 Jan 2023 at 21:09, Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
>
> On Thursday, January 12, 2023 12:04 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> > At Wed, 11 Jan 2023 12:46:24 +0000, "Hayato Kuroda (Fujitsu)"
> > <kuroda.hayato@fujitsu.com> wrote in
> > > them. Which version is better?
> >
> >
> > Some comments by a quick loock, different from the above.
> Horiguchi-san, thanks for your review !
>
>
> > +         CONNECTION 'host=192.168.1.50 port=5432 user=foo
> > dbname=foodb'
> >
> > I understand that we (not PG people, but IT people) are supposed to use in
> > documents a certain set of special addresses that is guaranteed not to be
> > routed in the field.
> >
> > > TEST-NET-1 : 192.0.2.0/24
> > > TEST-NET-2 : 198.51.100.0/24
> > > TEST-NET-3 : 203.0.113.0/24
> >
> > (I found 192.83.123.89 in the postgres_fdw doc, but it'd be another issue..)
> Fixed. If necessary we can create another thread for this.
>
> > +                     if (strspn(tmp, "-0123456789 ") == strlen(tmp))
> >
> > Do we need to bother spending another memory block for apparent non-digits
> > here?
> Yes. The characters are necessary to handle an issue reported in [1].
> The issue happened if the user inputs a negative value,
> then the length comparison became different between strspn and strlen
> and the input value was recognized as seconds, when
> the unit wasn't described. This led to a wrong error message for the user.
>
> Those addition of such characters solve the issue.
>
> > +                                             errmsg(INT64_FORMAT " ms
> > is outside the valid range for parameter
> > +\"%s\"",
> >
> > We don't use INT64_FORMAT in translatable message strings. Cast then
> > use %lld instead.
> Thanks for teaching us. Fixed.
>
> > This message looks unfriendly as it doesn't suggest the valid range, and it
> > shows the input value in a different unit from what was in the input. A I think we
> > can spell it as "\"%s\" is outside the valid range for subsciription parameter
> > \"%s\" (0 .. <INT_MAX> in millisecond)"
> Makes sense. I incorporated the valid range with the aligned format of recovery_min_apply_delay.
> FYI, the physical replication's GUC doesn't write the unites for the range like below.
> I followed and applied this style.
>
> ---
> LOG:  -1 ms is outside the valid range for parameter "recovery_min_apply_delay" (0 .. 2147483647)
> FATAL:  configuration file "/home/k5user/new/pg/l/make_v15/slave/postgresql.conf" contains errors
> ---
>
> > +     int64           min_apply_delay;
> > ..
> > +                     if (ms < 0 || ms > INT_MAX)
> >
> > Why is the variable wider than required?
> You are right. Fixed.
>
> > +                                     errmsg("%s and %s are mutually
> > exclusive options",
> > +                                                "min_apply_delay > 0",
> > "streaming = parallel"));
> >
> > Mmm. Couldn't we refuse 0 as min_apply_delay?
> Sorry, the previous patch's behavior wasn't consistent with this error message.
>
> In the previous patch, if we conducted alter subscription
> with stream = parallel and min_apply_delay = 0 (from a positive value) at the same time,
> the alter command failed, although this should succeed by this time-delayed feature specification.
> We fixed this part accordingly by some more tests in AlterSubscription().
>
> By the way, we should allow users to change min_apply_dealy to 0
> whenever they want from different value. Then, we didn't restrict
> this kind of operation.
>
> > +                                             sub->minapplydelay > 0)
> > ...
> > +                                     if (opts.min_apply_delay > 0 &&
> >
> > Is there any reason for the differenciation?
> Yes. The former is the object for an existing subscription configuration.
> For example, if we alter subscription with setting streaming = 'parallel'
> for a subscription created with min_apply_delay = '1 day', we
> need to reject the alter command. The latter is new settings.
>
>
> > +
> >       errmsg("cannot set %s for subscription with %s",
> > +
> > "streaming = parallel", "min_apply_delay > 0"));
> >
> > I think that this shoud be more like human-speking. Say, "cannot enable
> > min_apply_delay for subscription in parallel streaming mode" or something..
> > The same is applicable to the nearby message.
> Reworded the error messages. Please check.
>
> > +static void maybe_delay_apply(TimestampTz ts);
> >
> >   apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
> > -                                        XLogRecPtr lsn)
> > +                                        XLogRecPtr lsn, TimestampTz ts)
> >
> > "ts" looks too generic. Couldn't it be more specific?
> > We need a explanation for the parameter in the function comment.
> Changed it to finish_ts, since it indicates commit/prepare time.
> This terminology should be aligned with finish lsn.
>
> > +     if (!am_parallel_apply_worker())
> > +     {
> > +             Assert(ts > 0);
> > +             maybe_delay_apply(ts);
> >
> > It seems to me better that the if condition and assertion are checked inside
> > maybe_delay_apply().
> Fixed.
>

Thanks for the updated patch, Few comments:
1) Since the min_apply_delay = 3, but you have specified 2s, there
might be a possibility that it can log delay as 1000ms due to
pub/sub/network delay and the test can fail randomly, If we cannot
ensure this log file value, check_apply_delay_time verification alone
should be sufficient.
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay", "2000");

2) I'm not sure if this will add any extra coverage as the altering
value of min_apply_delay is already tested in the regression, if so
this test can be removed:
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+       "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+       "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+check_apply_delay_log("logical replication apply delay", "80000000");

3) We generally keep the subroutines before the tests, it can be kept
accordingly:
3.a)
+sub check_apply_delay_log
+{
+       my ($message, $expected) = @_;
+       $expected = 0 unless defined $expected;
+
+       my $old_log_location = $log_location;

3.b)
+sub check_apply_delay_time
+{
+       my ($primary_key, $expected_diffs) = @_;
+
+       my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+               SELECT extract(epoch from c) FROM test_tab WHERE a =
$primary_key;
+       ]);
+

4) typo "more then once" should be "more than once"
+ regress_testsub | regress_subscription_user | f       |
{testpub,testpub1,testpub2} | f      | off       | d                |
f                | any    |                    0 | off
| dbname=regress_doesnotexist | 0/0
 (1 row)

 -- fail - publication used more then once
@@ -316,10 +316,10 @@ ERROR:  publication "testpub3" is not in
subscription "regress_testsub"
 -- ok - delete publications
 ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1,
testpub2 WITH (refresh = false);
 \dRs+

5) This can be changed to "Is it larger than expected?"
+               # Is it larger than expected ?
+               cmp_ok($logged_delay, '>', $expected,
+                       "The wait time of the apply worker is long
enough expectedly"
+               );

6) 2022 should be changed to 2023
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,170 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay

7) Termination full stop is not required for single line comments:
7.a)
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');

7.b) +
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');

7.c) +
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',

7.d) similarly in rest of the files

8) Is it possible to add one test for spooling also?

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Add BufFileRead variants with short read and EOF detection
Next
From: Amit Kapila
Date:
Subject: Re: backup.sgml typo