RE: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From wangw.fnst@fujitsu.com
Subject RE: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id OS3PR01MB627594136B65FD029FBA59429E839@OS3PR01MB6275.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Perform streaming logical transactions by background workers and parallel apply  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Mon, Jul 4, 2022 at 14:47 AM Peter Smith <smithpb2250@gmail.com> wrote:
> Below are some review comments for patch v14-0004:

Thanks for your comments.

> 4.0 General.
> 
> This comment is an after-thought but as I write this mail I am
> wondering if most of this 0004 patch is even necessary at all? Instead
> of introducing a new column and all the baggage that goes with it,
> can't the same functionality be achieved just by toggling the
> streaming mode 'substream' value from 'p' (parallel) to 't' (on)
> whenever an error occurs causing a retry? Anyway, if you do change it
> this way then most of the following comments can be disregarded.

In the approach that you mentioned, after retrying, the transaction will always
be applied in "on" mode. This will change the user's setting. 
That is to say, in most cases, user needs to manually reset option "streaming"
to "parallel". I think it might not be very friendly.

> 4.6 src/backend/replication/logical/worker.c
> 
> 4.6.a - apply_handle_commit
> 
> + /* Set the flag that we will not retry later. */
> + set_subscription_retry(false);
> 
> But the comment is wrong, isn't it? Shouldn’t it just say that we are
> not *currently* retrying? And can’t this just anyway be redundant if
> only the catalog column has a DEFAULT value of false?
> 
> 4.6.b - apply_handle_prepare
> Ditto
> 
> 4.6.c - apply_handle_commit_prepared
> Ditto
> 
> 4.6.d - apply_handle_rollback_prepared
> Ditto
> 
> 4.6.e - apply_handle_stream_prepare
> Ditto
> 
> 4.6.f - apply_handle_stream_abort
> Ditto
> 
> 4.6.g - apply_handle_stream_commit
> Ditto

Set default value of the field "subretry" to "false" as you suggested.
We need to reset this field to false after retrying to apply a streaming
transaction in main apply worker ("on" mode).
I think this comment is not clear. So I change it to
```
Reset the retry flag.
```

> 4.7 src/backend/replication/logical/worker.c
> 
> 4.7.a - start_table_sync
> 
> @@ -3894,6 +3917,9 @@ start_table_sync(XLogRecPtr *origin_startpos,
> char **myslotname)
>   }
>   PG_CATCH();
>   {
> + /* Set the flag that we will retry later. */
> + set_subscription_retry(true);
> 
> Maybe this should say more like "Flag that the next apply will be the
> result of a retry"
> 
> 4.7.b - start_apply
> Ditto

Similar to the reply in #4.6, I changed it to `Set the retry flag.`.

> 4.8 src/backend/replication/logical/worker.c - set_subscription_retry
> 
> +
> +/*
> + * Set subretry of pg_subscription catalog.
> + *
> + * If retry is true, subscriber is about to exit with an error. Otherwise, it
> + * means that the changes was applied successfully.
> + */
> +static void
> +set_subscription_retry(bool retry)
> 
> "changes" -> "change" ?

I did not make it clear before.
I modified "changes" to "transaction".

> 4.8 src/backend/replication/logical/worker.c - set_subscription_retry
> 
> Isn't this flag only every used when streaming=parallel? But it does
> not seem ot be checking that anywhere before potentiall executing all
> this code when maybe will never be used.

Yes, currently this field is only checked by apply background worker.

> 4.9 src/include/catalog/pg_subscription.h
> 
> @@ -76,6 +76,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
> BKI_SHARED_RELATION BKI_ROW
>   bool subdisableonerr; /* True if a worker error should cause the
>   * subscription to be disabled */
> 
> + bool subretry; /* True if the previous apply change failed. */
> 
> I was wondering if you can give this column a DEFAULT value of false,
> because then perhaps most of the patch code from worker.c may be able
> to be eliminated.

Please refer to the reply to #4.6.

The rest of the comments are improved as suggested.
The new patches were attached in [1].

[1] -
https://www.postgresql.org/message-id/OS3PR01MB62755C6C9A75EB09F7218B589E839%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Regards,
Wang wei

pgsql-hackers by date:

Previous
From: "wangw.fnst@fujitsu.com"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Next
From: Michael Paquier
Date:
Subject: Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~