Re: Time delayed LR (WAS Re: logical replication restrictions) - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Time delayed LR (WAS Re: logical replication restrictions)
Date
Msg-id CAHut+Puwrb_DEV_7ANSKrQr23-dAmL9vrNV8ww7XsAP+kej0uw@mail.gmail.com
Whole thread Raw
In response to RE: Time delayed LR (WAS Re: logical replication restrictions)  ("Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com>)
Responses RE: Time delayed LR (WAS Re: logical replication restrictions)
List pgsql-hackers
Hi Osumi-san, here are my review comments for the latest patch v17-0001.

======
Commit Message

1.
Prohibit the combination of this feature and parallel streaming mode.

SUGGESTION (using the same wording as in the code comments)
The combination of parallel streaming mode and min_apply_delay is not allowed.

======
doc/src/sgml/ref/create_subscription.sgml

2.
+         <para>
+          By default, the subscriber applies changes as soon as possible. This
+          parameter allows the user to delay the application of changes by a
+          specified amount of time. If the value is specified without units, it
+          is taken as milliseconds. The default is zero (no delay).
+         </para>

Looking at this again, it seemed a  bit strange to repeat "specified"
twice in 2 sentences. Maybe change one of them.

I’ve also suggested using the word "interval" because I don’t think
docs yet mentioned anywhere (except in the example) that using
intervals is possible.

SUGGESTION (for the 2nd sentence)
This parameter allows the user to delay the application of changes by
a given time interval.

~~~

3.
+         <para>
+          Any delay occurs only on WAL records for transaction begins after all
+          initial table synchronization has finished. The delay is calculated
+          between the WAL timestamp as written on the publisher and the current
+          time on the subscriber. Any overhead of time spent in
logical decoding
+          and in transferring the transaction may reduce the actual wait time.
+          It is also possible that the overhead already execeeds the requested
+          <literal>min_apply_delay</literal> value, in which case no additional
+          wait is necessary. If the system clocks on publisher and subscriber
+          are not synchronized, this may lead to apply changes earlier than
+          expected, but this is not a major issue because this parameter is
+          typically much larger than the time deviations between servers. Note
+          that if this parameter is set to a long delay, the replication will
+          stop if the replication slot falls behind the current LSN
by more than
+          <link
linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+         </para>

3a.
Typo "execeeds" (I think Vignesh reported this already)

~

3b.
SUGGESTION (for the 2nd sentence)
BEFORE
The delay is calculated between the WAL timestamp...
AFTER
The delay is calculated as the difference between the WAL timestamp...

~~~

4.
+         <warning>
+           <para>
+            Delaying the replication can mean there is a much longer
time between making
+            a change on the publisher, and that change being
committed on the subscriber.
+            v
+            See <xref linkend="guc-synchronous-commit"/>.
+           </para>
+         </warning>

IMO maybe there is a better way to express the 2nd sentence:

BEFORE
This can have a big impact on synchronous replication.
AFTER
This can impact the performance of synchronous replication.

======
src/backend/commands/subscriptioncmds.c

5. parse_subscription_options

@@ -324,6 +328,43 @@ parse_subscription_options(ParseState *pstate,
List *stmt_options,
  opts->specified_opts |= SUBOPT_LSN;
  opts->lsn = lsn;
  }
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char    *val,
+    *tmp;
+ Interval   *interval;
+ int64 ms;

IMO 'delay_ms' (or similar) would be a friendlier variable name than just 'ms'

~~~

6.
@@ -404,6 +445,20 @@ parse_subscription_options(ParseState *pstate,
List *stmt_options,
  "slot_name = NONE", "create_slot = false")));
  }
  }
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+    "min_apply_delay > 0", "streaming = parallel"));
+ }

This could be expressed as a single condition using &&, maybe also
with the brackets eliminated. (Unless you feel the current code is
more readable)

~~~

7.

+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming
== LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream ==
LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+    "min_apply_delay", "streaming = parallel"));

These nested ifs could instead be a single "if" with && condition.
(Unless you feel the current code is more readable)


======
src/backend/replication/logical/worker.c

8. maybe_delay_apply

+ * Hence, it's not appropriate to apply a delay at the time.
+ */
+static void
+maybe_delay_apply(TimestampTz finish_ts)

That last sentence "Hence,... delay at the time" does not sound
correct. Is there a typo or missing words here?

Maybe it meant to say "... at the STREAM START time."?

~~~

9.
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }

I was unsure why did you make a special mention of
'wal_receiver_status_interval' here. I mean, Aren't there also other
GUCs that might change and affect something here so was there some
special reason only this one was mentioned?

======
src/test/subscription/t/032_apply_delay.pl

10.
+
+# Compare inserted time on the publisher with applied time on the subscriber to
+# confirm the latter is applied after expected time.
+sub check_apply_delay_time

Maybe the comment could also mention that the time is automatically
stored in the table column 'c'.

~~~

11.
+# Confirm the suspended record doesn't get applied expectedly by the ALTER
+# DISABLE command.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check if the delayed transaction doesn't get
applied expectedly");

The use of "doesn't get applied expectedly" (in 2 places here) seemed
strange. Maybe it's better to say like

SUGGESTION
# Confirm disabling the subscription by ALTER DISABLE did not cause
the delayed transaction to be applied.
$result = $node_subscriber->safe_psql('postgres',
"SELECT count(a) FROM test_tab WHERE a = 0;");
is($result, qq(0), "check the delayed transaction was not applied");

------
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: David Geier
Date:
Subject: Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?
Next
From: "wangw.fnst@fujitsu.com"
Date:
Subject: RE: Logical replication timeout problem