Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id CAHut+Pu8LCtzdd_PMbUoueZ9W8D7-Hd6nTH30kFEGzhfTgtFEg@mail.gmail.com
Whole thread Raw
In response to RE: Perform streaming logical transactions by background workers and parallel apply  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses RE: Perform streaming logical transactions by background workers and parallel apply
List pgsql-hackers
Here are my review comments for patch v87-0002.

======
doc/src/sgml/config.sgml

1.
        <para>
-        Allows streaming or serializing changes immediately in
logical decoding.
         The allowed values of <varname>logical_replication_mode</varname> are
-        <literal>buffered</literal> and <literal>immediate</literal>. When set
-        to <literal>immediate</literal>, stream each change if
+        <literal>buffered</literal> and <literal>immediate</literal>.
The default
+        is <literal>buffered</literal>.
+       </para>

I didn't think it was necessary to say “of logical_replication_mode”.
IMO that much is already obvious because this is the first sentence of
the description for logical_replication_mode.

(see also review comment #4)

~~~

2.
+       <para>
+        On the publisher side, it allows streaming or serializing changes
+        immediately in logical decoding.  When set to
+        <literal>immediate</literal>, stream each change if
         <literal>streaming</literal> option (see optional parameters set by
         <link linkend="sql-createsubscription"><command>CREATE
SUBSCRIPTION</command></link>)
         is enabled, otherwise, serialize each change.  When set to
-        <literal>buffered</literal>, which is the default, decoding will stream
-        or serialize changes when <varname>logical_decoding_work_mem</varname>
-        is reached.
+        <literal>buffered</literal>, decoding will stream or serialize changes
+        when <varname>logical_decoding_work_mem</varname> is reached.
        </para>

2a.
"it allows" --> "logical_replication_mode allows"

2b.
"decoding" --> "the decoding"

~~~

3.
+       <para>
+        On the subscriber side, if <literal>streaming</literal> option is set
+        to <literal>parallel</literal>, this parameter also allows the leader
+        apply worker to send changes to the shared memory queue or to serialize
+        changes.  When set to <literal>buffered</literal>, the leader sends
+        changes to parallel apply workers via shared memory queue.  When set to
+        <literal>immediate</literal>, the leader serializes all changes to
+        files and notifies the parallel apply workers to read and apply them at
+        the end of the transaction.
+       </para>

"this parameter also allows" --> "logical_replication_mode also allows"

~~~

4.
        <para>
         This parameter is intended to be used to test logical decoding and
         replication of large transactions for which otherwise we need to
         generate the changes till <varname>logical_decoding_work_mem</varname>
-        is reached.
+        is reached. Moreover, this can also be used to test the transmission of
+        changes between the leader and parallel apply workers.
        </para>

"Moreover, this can also" --> "It can also"

I am wondering would this sentence be better put at the top of the GUC
description. So then the first paragraph becomes like this:


SUGGESTION (I've also added another sentence "The effect of...")

The allowed values are buffered and immediate. The default is
buffered. This parameter is intended to be used to test logical
decoding and replication of large transactions for which otherwise we
need to generate the changes till logical_decoding_work_mem is
reached. It can also be used to test the transmission of changes
between the leader and parallel apply workers. The effect of
logical_replication_mode is different for the publisher and
subscriber:

On the publisher side...

On the subscriber side...
======
.../replication/logical/applyparallelworker.c

5.
+ /*
+ * In immeidate mode, directly return false so that we can switch to
+ * PARTIAL_SERIALIZE mode and serialize remaining changes to files.
+ */
+ if (logical_replication_mode == LOGICAL_REP_MODE_IMMEDIATE)
+ return false;

Typo "immediate"

Also, I felt "directly" is not needed. "return false" and "directly
return false" is the same.

SUGGESTION
Using ‘immediate’ mode returns false to cause a switch to
PARTIAL_SERIALIZE mode so that the remaining changes will be
serialized.

======
src/backend/utils/misc/guc_tables.c

6.
  {
  {"logical_replication_mode", PGC_USERSET, DEVELOPER_OPTIONS,
- gettext_noop("Allows streaming or serializing each change in logical
decoding."),
- NULL,
+ gettext_noop("Controls the behavior of logical replication publisher
and subscriber"),
+ gettext_noop("If set to immediate, on the publisher side, it "
+ "allows streaming or serializing each change in "
+ "logical decoding. On the subscriber side, in "
+ "parallel streaming mode, it allows the leader apply "
+ "worker to serialize changes to files and notifies "
+ "the parallel apply workers to read and apply them at "
+ "the end of the transaction."),
  GUC_NOT_IN_SAMPLE
  },

6a. short description

User PoV behaviour should be the same. Instead, maybe say "controls
the internal behavior" or something like that?

~

6b. long description

IMO the long description shouldn’t mention ‘immediate’ mode first as it does.

BEFORE
If set to immediate, on the publisher side, ...

AFTER
On the publisher side, ...

------
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: Mutable CHECK constraints?
Next
From: Amin
Date:
Subject: Getting relations accessed by a query using the raw query string