Re: logical replication restrictions - Mailing list pgsql-hackers

From Euler Taveira
Subject Re: logical replication restrictions
Date
Msg-id acfc1946-a73e-4e9d-86b3-b19cba225a41@www.fastmail.com
Whole thread Raw
In response to Re: logical replication restrictions  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Tue, Jul 5, 2022, at 5:41 AM, Peter Smith wrote:
Here are some review comments for your v4-0001 patch. I hope they are
useful for you.
Thanks for your review.

This thread name "logical replication restrictions" seems quite
unrelated to the patch here. Maybe it's better to start a new thread
otherwise nobody is going to recognise what this thread is really
about.
I agree that the $SUBJECT does not describe the proposal. I decided that it is
not worth creating a thread because (i) there are some interaction and they
could be monitoring this thread and (ii) the CF entry has the correct
description.

Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (specially to fix
errors that might cause data loss).
I changed the commit message a bit.

Maybe take some examples from the regression tests to show usage of
the new parameter
I don't think an example is really useful in a commit message. If you are
checking this commit, it is a matter of reading the regression tests or
documentation to obtain an example of how to use it.

I think this should say that the units are ms.
Unit included.

Is the "integer" type here correct? It might eventually be stored as
an integer, but IIUC (going by the tests) from the user point-of-view
this parameter is really "text" type for representing ms or interval,
right?
The internal representation is integer. The unit is correct. If you use units,
the format is text that what the section [1] calls "Numeric with Unit".  Even
if the user is unsure about its usage, an example might help here.

SUGGESTION
As with the physical replication feature (recovery_min_apply_delay),
it can be useful for logical replication to delay the data
replication.
It is not "data replication", it is applying changes. I reworded that sentence.

SUGGESTION
Time spent in logical decoding and in transferring the transaction may
reduce the actual wait time.
Changed.

If the system clocks on publisher and subscriber are not
+          synchronized, this may lead to apply changes earlier than expected.

Why just say "earlier than expected"? If the publisher's time is ahead
of the subscriber then the changes might also be *later* than
expected, right? So, perhaps it is better to just say "other than
expected".
This sentence is similar to another one in the recovery_min_apply_delay. I want
to emphasize the fact that even if you use a 30-minute delay, it might apply a
change that happened 29 minutes 55 seconds ago. The main reason for this
feature is to avoid modifying changes *earlier*. If it applies the change 30
minutes 5 seconds, it is fine.

Should there also be a big warning box about the impact if using
synchronous_commit (like the other streaming replication page has this
warning)?
Impact? Could you elaborate?

I think there should be some examples somewhere showing how to specify
this parameter. Maybe they are better added somewhere in "31.2
Subscription" and xrefed from here.
I added one example in the CREATE SUBSCRIPTION. We can add an example in the
section 31.2, however, since it is a new chapter I think it lacks examples for
the other options too (streaming, two_phase, copy_data, ...). It could be
submitted as a separate patch IMO.

I think there should be a default assignment to 0 (done where all the
other supported option defaults are set)
It could for completeness. the memset() takes care of it. Anyway, I added it to
the beginning of the parse_subscription_options().

+ if (opts->min_apply_delay < 0)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
+

I thought this check only needs to be do within scope of the preceding
if - (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
strcmp(defel->defname, "min_apply_delay") == 0)
Fixed.

+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)

I did not really understand the logic why should the min_apply_delay
override the enabled=false? It is a called *minimum* delay so if it
ends up being way over the parameter value (because the subscription
is disabled) then why does that matter?
It doesn't. The main point of this code (as I tried to explain in the comment)
is to kill the worker as soon as possible if you disable the subscription.
Isn't the comment clear?

Subscription *MySubscription = NULL;
static bool MySubscriptionValid = false;
+TimestampTz MySubscriptionMinApplyDelayUntil = 0;

Looking at the only usage of this variable (in apply_delay) and how it
is used I did see why this cannot just be a local member of the
apply_delay function?
Good catch. A previous patch used that variable outside that function scope.

+/*
+ * Apply the informed delay for the transaction.
+ *
+ * A regular transaction uses the commit time to calculate the delay. A
+ * prepared transaction uses the prepare time to calculate the delay.
+ */
+static void
+apply_delay(TimestampTz ts)

I didn't think it needs to mention here about the different kinds of
transactions because where it comes from has nothing really to do with
this function's logic.
Fixed.

Refer to comment #14 about MySubscriptionMinApplyDelayUntil.
Fixed.

It seems to rely on the spooling happening at the end. But won't this
cause a problem later when/if the "parallel apply" patch [1] is pushed
and the stream bgworkers are doing stuff on the fly instead of
spooling at the end?

Or are you expecting that the "parallel apply" feature should be
disabled if there is any min_apply_delay parameter specified?
I didn't read the "parallel apply" patch yet.

Let's keep the alphabetical order of the parameters in COMPLETE_WITH, as per [2]
Fixed.

+ int64 subapplydelay; /* Replication apply delay */
+

IMO the comment should mention the units "(ms)"
I'm not sure. It should be documented in the catalogs. It is an important
information for user-visible interface. There are a few places in the
documentation that the unit is mentioned.

There are some test cases for CREATE SUBSCRIPTION but there are no
test cases for ALTER SUBSCRIPTION changing this new parameter.
I added a test to cover ALTER SUBSCRIPTION and also for the disabling a
subscription that contains a delay set.

I received the following error when trying to run these 'subscription' tests:
Fixed.


--
Euler Taveira

Attachment

pgsql-hackers by date:

Previous
From: torikoshia
Date:
Subject: Re: Should fix a comment referring to stats collector?
Next
From: John Naylor
Date:
Subject: Re: NAMEDATALEN increase because of non-latin languages