Re: Time delayed LR (WAS Re: logical replication restrictions) - Mailing list pgsql-hackers

From Takamichi Osumi (Fujitsu)
Subject Re: Time delayed LR (WAS Re: logical replication restrictions)
Date
Msg-id TYCPR01MB8373775ECC6972289AF8CB30ED0F9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: logical replication restrictions  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: Time delayed LR (WAS Re: logical replication restrictions)
Re: Time delayed LR (WAS Re: logical replication restrictions)
Re: Time delayed LR (WAS Re: logical replication restrictions)
Re: Time delayed LR (WAS Re: logical replication restrictions)
List pgsql-hackers
On Wednesday, October 5, 2022 6:42 PM Peter Smith <smithpb2250@gmail.com> wrote:
> Hi Euler, a long time ago you ask me a few questions about my previous review
> [1].
> 
> Here are my replies, plus a few other review comments for patch v7-0001.
Hi, thank you for your comments.


> ======
> 
> 1. doc/src/sgml/catalogs.sgml
> 
> +      <para>
> +       Application delay of changes by a specified amount of time. The
> +       unit is in milliseconds.
> +      </para></entry>
> 
> The wording sounds a bit strange still. How about below
> 
> SUGGESTION
> The length of time (ms) to delay the application of changes.
Fixed.


> =======
> 
> 2. Other documentation?
> 
> Maybe should say something on the Logical Replication Subscription page
> about this? (31.2 Subscription)
Added.

 
> =======
> 
> 3. doc/src/sgml/ref/create_subscription.sgml
> 
> +          synchronized, this may lead to apply changes earlier than expected.
> +          This is not a major issue because a typical setting of this parameter
> +          are much larger than typical time deviations between servers.
> 
> Wording?
> 
> SUGGESTION
> ... than expected, but this is not a major issue because this parameter is
> typically much larger than the time deviations between servers.
Fixed.



> ~~~
> 
> 4. Q/A
> 
> From [2] you asked:
> 
> > Should there also be a big warning box about the impact if using
> > synchronous_commit (like the other streaming replication page has this
> > warning)?
> Impact? Could you elaborate?
> 
> ~
> 
> I noticed the streaming replication docs for recovery_min_apply_delay has a big
> red warning box saying that setting this GUC may block the synchronous
> commits. So I was saying won’t a similar big red warning be needed also for
> this min_apply_delay parameter if the delay is used in conjunction with a
> publisher wanting synchronous commit because it might block everything?
I agree with you. Fixed.


 
> ~~~
> 
> 4. Example
> 
> +<programlisting>
> +CREATE SUBSCRIPTION foo
> +         CONNECTION 'host=192.168.1.50 port=5432 user=foo
> dbname=foodb'
> +        PUBLICATION baz
> +               WITH (copy_data = false, min_apply_delay = '4h');
> +</programlisting></para>
> 
> If the example named the subscription/publication as ‘mysub’ and ‘mypub’ I
> think it would be more consistent with the existing examples.
Fixed.


 
> ======
> 
> 5. src/backend/commands/subscriptioncmds.c - SubOpts
> 
> @@ -89,6 +91,7 @@ typedef struct SubOpts
>   bool disableonerr;
>   char    *origin;
>   XLogRecPtr lsn;
> + int64 min_apply_delay;
>  } SubOpts;
> 
> I feel it would be better to be explicit about the storage units. So call this
> member ‘min_apply_delay_ms’. E.g. then other code in
> parse_subscription_options will be more natural when you are converting using
> and assigning them to this member.
I don't think we use such names including units explicitly.
Could you please tell me a similar example for this ?



> ~~~
> 
> 6. - parse_subscription_options
> 
> + /*
> + * If there is no unit, interval_in takes second as unit. This
> + * parameter expects millisecond as unit so add a unit (ms) if
> + * there isn't one.
> + */
> 
> The comment feels awkward. How about below
> 
> SUGGESTION
> If no unit was specified, then explicitly add 'ms' otherwise the interval_in
> function would assume 'seconds'
Fixed.


> ~~~
> 
> 7. - parse_subscription_options
> 
> (This is a repeat of [1] review comment #12)
> 
> + if (opts->min_apply_delay < 0 && IsSet(supported_opts,
> SUBOPT_MIN_APPLY_DELAY))
> + ereport(ERROR,
> + errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
> + errmsg("option \"%s\" must not be negative", "min_apply_delay"));
> 
> Why is this code here instead of inside the previous code block where the
> min_apply_delay was assigned in the first place?
Changed.


> ======
> 
> 8. src/backend/replication/logical/worker.c  - apply_delay
> 
> + * When min_apply_delay parameter is set on subscriber, we wait long
> + enough to
> + * make sure a transaction is applied at least that interval behind the
> + * publisher.
> 
> "on subscriber" -> "on the subscription"
Fixed.



> ~~~
> 
> 9.
> 
> + * Apply delay only after all tablesync workers have reached READY
> + state. A
> + * tablesync worker are kept until it reaches READY state. If we allow
> + the
> 
> 
> Wording ??
> 
> "A tablesync worker are kept until it reaches READY state." ??
I removed the sentence.


> ~~~
> 
> 10.
> 
> 10a.
> + /* nothing to do if no delay set */
> 
> Uppercase comment
> /* Nothing to do if no delay set */
> 
> ~
> 
> 10b.
> + /* set apply delay */
> 
> Uppercase comment
> /* Set apply delay */
Both are fixed.


 
> ~~~
> 
> 11. - apply_handle_stream_prepare / apply_handle_stream_commit
> 
> The previous concern about incompatibility with the "Parallel Apply"
> work (see [1] review comments #17, #18) is still a pending issue, isn't it?
Yes, I think so.
Kindly have a look at [1].


> ======
> 
> 12. src/backend/utils/adt/timestamp.c interval_to_ms
> 
> +/*
> + * Given an Interval returns the number of milliseconds.
> + */
> +int64
> +interval_to_ms(const Interval *interval)
> 
> SUGGESTION
> Returns the number of milliseconds in the specified Interval.
Fixed.



> ~~~
> 
> 13.
> 
> 
> + /* adds portion time (in ms) to the previous result. */
> 
> Uppercase comment
> /* Adds portion time (in ms) to the previous result. *
Fixed.


> ======
> 
> 14. src/bin/pg_dump/pg_dump.c - getSubscriptions
> 
> + {
> + appendPQExpBufferStr(query, " s.suborigin,\n");
> + appendPQExpBufferStr(query, " s.subapplydelay\n"); }
> 
> This could be done using just a single appendPQExpBufferStr if you want to
> have 1 call instead of 2.
Made them together.


> ======
> 
> 15. src/bin/psql/describe.c - describeSubscriptions
> 
> + /* origin and min_apply_delay are only supported in v16 and higher */
> 
> Uppercase comment
> /* Origin and min_apply_delay are only supported in v16 and higher */
Fixed.


> ======
> 
> 16. src/include/catalog/pg_subscription.h
> 
> + int64 subapplydelay; /* Replication apply delay */
> +
> 
> Consider renaming this as 'subapplydelayms' to make the units perfectly clear.
Similar to the 5th comments, I can't find any examples for this.
I'd like to keep it general, which makes me feel it is more aligned with
existing codes.


> ======
> 
> 17. src/test/regress/sql/subscription.sql
> 
> Is [1] review comment 21 (There are some test cases for CREATE
> SUBSCRIPTION but there are no test cases for ALTER SUBSCRIPTION
> changing this new parameter.) still a pending item?
Added one test case for alter subscription.


Also, I removed the function of logicalrep_worker_wakeup()
that was trigged by AlterSubscription only when disabling the subscription.
This is achieved and replaced by another patch proposed in [2] in a general manner.

There are still some pending comments for this patch,
but I'll share the current patch once.

Lastly, thank you so much, Kuroda-san for giving me many advice and
suggestion for some modification of this patch.


[1] - https://www.postgresql.org/message-id/CAA4eK1JJFpgqE0ehAb7C9YFkJ-Xe-W1ZUPZptEfYjNJM4G-sLA%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/20221122004119.GA132961%40nathanxps13



Best Regards,
    Takamichi Osumi


Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: indentation in _hash_pgaddtup()
Next
From: "Takamichi Osumi (Fujitsu)"
Date:
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)