RE: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From kuroda.hayato@fujitsu.com
Subject RE: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id TYAPR01MB5866B1CC4687B3A16634EB5EF5719@TYAPR01MB5866.jpnprd01.prod.outlook.com
Whole thread Raw
In response to RE: Perform streaming logical transactions by background workers and parallel apply  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
Responses Re: Perform streaming logical transactions by background workers and parallel apply
RE: Perform streaming logical transactions by background workers and parallel apply
RE: Perform streaming logical transactions by background workers and parallel apply
List pgsql-hackers
Dear Wang,

Thank you for updating the patch! Followings are comments about v23-0001 and v23-0005.

v23-0001

01. logical-replication.sgml

+  <para>
+   When the streaming mode is <literal>parallel</literal>, the finish LSN of
+   failed transactions may not be logged. In that case, it may be necessary to
+   change the streaming mode to <literal>on</literal> and cause the same
+   conflicts again so the finish LSN of the failed transaction will be written
+   to the server log. For the usage of finish LSN, please refer to <link
+   linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ...
+   SKIP</command></link>.
+  </para>

I was not sure about streaming='off' mode. Is there any reasons that only ON mode is focused?

02. protocol.sgml

+      <varlistentry>
+       <term>Int64 (XLogRecPtr)</term>
+       <listitem>
+        <para>
+         The LSN of the abort. This field is available since protocol version
+         4.
+        </para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
+       <term>Int64 (TimestampTz)</term>
+       <listitem>
+        <para>
+         Abort timestamp of the transaction. The value is in number
+         of microseconds since PostgreSQL epoch (2000-01-01). This field is
+         available since protocol version 4.
+        </para>
+       </listitem>
+      </varlistentry>
+

It seems that changes are in the variablelist for stream commit.
I think these are included in the stream abort message, so it should be moved.

03. decode.c

-                       ReorderBufferForget(ctx->reorder, parsed->subxacts[i], buf->origptr);
+                       ReorderBufferForget(ctx->reorder, parsed->subxacts[i], buf->origptr,
+                                                               commit_time);
                }
-               ReorderBufferForget(ctx->reorder, xid, buf->origptr);
+               ReorderBufferForget(ctx->reorder, xid, buf->origptr, commit_time);

'commit_time' has been passed as argument 'abort_time', I think it may be confusing.
How about adding a comment above, like:
"In case of streamed transactions, they are regarded as being aborted at commit_time"

04. launcher.c

04.a

+       worker->main_worker_pid = is_subworker ? MyProcPid : 0;

You can use InvalidPid instead of 0.
(I thought pid should be represented by the datatype pid_t, but in some codes it is defined as int...) 

04.b

+       worker->main_worker_pid = 0;

You can use InvalidPid instead of 0, same as above.

05. origin.c

 void
-replorigin_session_setup(RepOriginId node)
+replorigin_session_setup(RepOriginId node, int acquired_by)

IIUC the same slot can be used only when the apply main worker has already acquired the slot
and the subworker for the same subscription tries to acquire, but it cannot understand from comments.
How about adding comments, or an assertion that acquired_by is same as session_replication_state->acquired_by ?
Moreover acquired_by should be compared with InvalidPid, based on above comments.

06. proto.c

 void
 logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
-                                                         TransactionId subxid)
+                                                         ReorderBufferTXN *txn, XLogRecPtr abort_lsn,
+                                                         bool write_abort_lsn

I think write_abort_lsn may be not needed,
because abort_lsn can be used for controlling whether abort_XXX fields should be filled or not.

07. worker.c

+/*
+ * The number of changes during one streaming block (only for apply background
+ * workers)
+ */
+static uint32 nchanges = 0;

This variable is used only by the main apply worker, so the comment seems not correct.
How about "...(only for SUBSTREAM_PARALLEL case)"?

v23-0005

08. monitoring.sgml

I cannot decide which option proposed in [1] is better, but followings descriptions are needed in both cases.
(In [2] I had intended to propose something like option 2)

08.a

You can add a description that the field 'relid' will be NULL even for apply background worker.

08.b

You can add a description that fields 'received_lsn', 'last_msg_send_time', 'last_msg_receipt_time',
'latest_end_lsn', 'latest_end_time' will be NULL for apply background worker.


[1]:
https://www.postgresql.org/message-id/CAHut%2BPuPwdwZqXBJjtU%2BR9NULbOpxMG%3Di2hmqgg%2B7p0rmK0hrw%40mail.gmail.com
[2]:
https://www.postgresql.org/message-id/TYAPR01MB58660B4732E7F80B322174A3F5629%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


pgsql-hackers by date:

Previous
From: Aleksander Alekseev
Date:
Subject: [PATCH] Tab completion for SET COMPRESSION
Next
From: Nikita Malakhov
Date:
Subject: Re: [PATCH] ALTER TABLE ... SET STORAGE default