RE: Optionally automatically disable logical replication subscriptions on error - Mailing list pgsql-hackers

From osumi.takamichi@fujitsu.com
Subject RE: Optionally automatically disable logical replication subscriptions on error
Date
Msg-id TYCPR01MB83736F1DE9A86E5FE64DC819ED349@TYCPR01MB8373.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Optionally automatically disable logical replication subscriptions on error  (Amit Kapila <amit.kapila16@gmail.com>)
Responses RE: Optionally automatically disable logical replication subscriptions on error
List pgsql-hackers
On Monday, February 14, 2022 8:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Jan 6, 2022 at 11:23 AM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > Kindly have a look at the attached v16.
> >
> 
> Few comments:
Hi, thank you for checking the patch !

> =============
> 1.
> @@ -3594,13 +3698,29 @@ ApplyWorkerMain(Datum main_arg)
>     apply_error_callback_arg.command,
>     apply_error_callback_arg.remote_xid,
>     errdata->message);
> - MemoryContextSwitchTo(ecxt);
> +
> + if (!MySubscription->disableonerr)
> + {
> + /*
> + * Some work in error recovery work is done. Switch to the old
> + * memory context and rethrow.
> + */
> + MemoryContextSwitchTo(ecxt);
> + PG_RE_THROW();
> + }
>   }
> + else if (!MySubscription->disableonerr) PG_RE_THROW();
> 
> - PG_RE_THROW();
> 
> Can't we combine these two different checks for
> 'MySubscription->disableonerr' if you do it as a separate if check after sending
> the stats message?
No, we can't. The second check of MySubscription->disableonerr is for the case
apply_error_callback_arg.command equals 0. We disable the subscription
on any errors. In other words, we need to rethrow the error in the case,
if the flag disableonerr is not set to true.

So, moving it to after sending
the stats message can't be done. At the same time, if we move
the disableonerr flag check outside of the apply_error_callback_arg.command condition
branch, we need to write another call of pgstat_report_subworker_error, with the
same arguments that we have now. This wouldn't be preferrable as well.

> 
> 2. Can we move the code related to tablesync worker and its error handing (the
> code insider if (am_tablesync_worker())) to a separate function say
> LogicalRepHandleTableSync() or something like that.
> 
> 3. Similarly, we can move apply-loop related code ("Run the main
> loop.") to a separate function say LogicalRepHandleApplyMessages().
> 
> If we do (2) and (3), I think the code in ApplyWorkerMain will look better. What
> do you think?
I agree with (2) and (3), since those contribute to better readability.

Attached a new patch v17 that addresses those refactorings.


Best Regards,
    Takamichi Osumi


Attachment

pgsql-hackers by date:

Previous
From: "wangw.fnst@fujitsu.com"
Date:
Subject: RE: Logical replication timeout problem
Next
From: John Naylor
Date:
Subject: Re: do only critical work during single-user vacuum?