RE: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id OS0PR01MB5716DEF013064BCAAA5054A194739@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to RE: Perform streaming logical transactions by background workers and parallel apply  ("kuroda.hayato@fujitsu.com" <kuroda.hayato@fujitsu.com>)
List pgsql-hackers
On Mon, Aug 22, 2022 20:50 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
> Dear Wang,
> 
> Thank you for updating the patch! Followings are comments about 
> v23-0001 and v23-0005.

Thanks for your comments.

> v23-0001
> 
> 01. logical-replication.sgml
> 
> +  <para>
> +   When the streaming mode is <literal>parallel</literal>, the finish LSN of
> +   failed transactions may not be logged. In that case, it may be necessary to
> +   change the streaming mode to <literal>on</literal> and cause the same
> +   conflicts again so the finish LSN of the failed transaction will be written
> +   to the server log. For the usage of finish LSN, please refer to <link
> +   linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ...
> +   SKIP</command></link>.
> +  </para>
> 
> I was not sure about streaming='off' mode. Is there any reasons that 
> only ON mode is focused?

Added off.

> 02. protocol.sgml
> 
> +      <varlistentry>
> +       <term>Int64 (XLogRecPtr)</term>
> +       <listitem>
> +        <para>
> +         The LSN of the abort. This field is available since protocol version
> +         4.
> +        </para>
> +       </listitem>
> +      </varlistentry>
> +
> +      <varlistentry>
> +       <term>Int64 (TimestampTz)</term>
> +       <listitem>
> +        <para>
> +         Abort timestamp of the transaction. The value is in number
> +         of microseconds since PostgreSQL epoch (2000-01-01). This field is
> +         available since protocol version 4.
> +        </para>
> +       </listitem>
> +      </varlistentry>
> +
> 
> It seems that changes are in the variablelist for stream commit.
> I think these are included in the stream abort message, so it should be moved.

Fixed.

> 03. decode.c
> 
> -                       ReorderBufferForget(ctx->reorder, parsed->subxacts[i], buf-
> >origptr);
> +                       ReorderBufferForget(ctx->reorder, 
> + parsed->subxacts[i], buf-
> >origptr,
> +                                                               
> + commit_time);
>                 }
> -               ReorderBufferForget(ctx->reorder, xid, buf->origptr);
> +               ReorderBufferForget(ctx->reorder, xid, buf->origptr, 
> + commit_time);
> 
> 'commit_time' has been passed as argument 'abort_time', I think it may 
> be confusing.
> How about adding a comment above, like:
> "In case of streamed transactions, they are regarded as being aborted 
> at commit_time"

IIRC, I free the comment above the loop might be more clear about this,
but I will think about it again. 

> 04. launcher.c
> 
> 04.a
> 
> +       worker->main_worker_pid = is_subworker ? MyProcPid : 0;
> 
> You can use InvalidPid instead of 0.
> (I thought pid should be represented by the datatype pid_t, but in 
> some codes it is defined as int...)
> 
> 04.b
> 
> +       worker->main_worker_pid = 0;
> 
> You can use InvalidPid instead of 0, same as above.

Improved

> 05. origin.c
> 
>  void
> -replorigin_session_setup(RepOriginId node)
> +replorigin_session_setup(RepOriginId node, int acquired_by)
> 
> IIUC the same slot can be used only when the apply main worker has 
> already acquired the slot and the subworker for the same subscription 
> tries to acquire, but it cannot understand from comments.
> How about adding comments, or an assertion that acquired_by is same as 
> session_replication_state->acquired_by ?
> Moreover acquired_by should be compared with InvalidPid, based on 
> above comments.

I think we have tried to check if 'acquired_by' and acquired_by of
slot are equal inside this function.

I am not sure if it's a good idea to use InvalidPid here ,as we set
session_replication_state->acquired_by(int) to 0(instead of -1) to indicate
that no worker acquire it.

> 06. proto.c
> 
>  void
>  logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
> -                                                         TransactionId subxid)
> +                                                         ReorderBufferTXN *txn, XLogRecPtr abort_lsn,
> +                                                         bool 
> + write_abort_lsn
> 
> I think write_abort_lsn may be not needed, because abort_lsn can be 
> used for controlling whether abort_XXX fields should be filled or not.

I think if the subscriber's version is lower than 16 (which won't handle the abort_XXX fields),
then we don't need to send the abort_XXX fields either.

> 07. worker.c
> 
> +/*
> + * The number of changes during one streaming block (only for apply
> background
> + * workers)
> + */
> +static uint32 nchanges = 0;
> 
> This variable is used only by the main apply worker, so the comment 
> seems not correct.
> How about "...(only for SUBSTREAM_PARALLEL case)"?

The previous comments seemed a bit confusing. I tried to improve this comments to this:
```
The number of changes sent to apply background workers during one streaming block.
```

> v23-0005
> 
> 08. monitoring.sgml
> 
> I cannot decide which option proposed in [1] is better, but followings 
> descriptions are needed in both cases.
> (In [2] I had intended to propose something like option 2)
> 
> 08.a
> 
> You can add a description that the field 'relid' will be NULL even for 
> apply background worker.
> 
> 08.b
> 
> You can add a description that fields 'received_lsn', 
> 'last_msg_send_time', 'last_msg_receipt_time', 'latest_end_lsn', 
> 'latest_end_time' will be NULL for apply background worker.

Improved

Regards,
Wang wei


pgsql-hackers by date:

Previous
From: "houzj.fnst@fujitsu.com"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Next
From: vignesh C
Date:
Subject: Re: Handle infinite recursion in logical replication setup