RE: Optionally automatically disable logical replication subscriptions on error - Mailing list pgsql-hackers

From osumi.takamichi@fujitsu.com
Subject RE: Optionally automatically disable logical replication subscriptions on error
Date
Msg-id TYCPR01MB8373CAAEFBB5CC29AB3AE4AFED089@TYCPR01MB8373.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Optionally automatically disable logical replication subscriptions on error  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Friday, March 4, 2022 3:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Thank you for updating the patch.
> 
> Here are some comments on v26 patch:
Thank you for your review !



> +/*
> + * Disable the current subscription.
> + */
> +static void
> +DisableSubscriptionOnError(void)
> 
> This function now just updates the pg_subscription catalog so can we move it
> to pg_subscritpion.c while having this function accept the subscription OID to
> disable? If you agree, the function comment will also need to be updated.
Agreed. Fixed.


> ---
> +                /*
> +                 * First, ensure that we log the error message so
> that it won't be
> +                 * lost if some other internal error occurs in the
> following code.
> +                 * Then, abort the current transaction and send the
> stats message of
> +                 * the table synchronization failure in an idle state.
> +                 */
> +                HOLD_INTERRUPTS();
> +                EmitErrorReport();
> +                FlushErrorState();
> +                AbortOutOfAnyTransaction();
> +                RESUME_INTERRUPTS();
> +                pgstat_report_subscription_error(MySubscription->oid,
> + false);
> +
> +                if (MySubscription->disableonerr)
> +                {
> +                        DisableSubscriptionOnError();
> +                        proc_exit(0);
> +                }
> +
> +                PG_RE_THROW();
> 
> If the disableonerr is false, the same error is reported twice. Also, the code in
> PG_CATCH() in both start_apply() and start_table_sync() are almost the same.
> Can we create a common function to do post-error processing?
Yes. Also, calling PG_RE_THROW wasn't appropriate,
because in the previous v26, for the second error you mentioned,
the patch didn't call errstart when disable_on_error = false.
This was introduced by recent patch refactoring around this code and the rebase
of this patch, but has been fixed by your suggestion.


> The worker should exit with return code 1.
> I've attached a fixup patch for changes to worker.c for your reference. Feel free
> to adopt the changes.
Yes. I adopted almost all of your suggestion.
One thing I fixed was a comment that mentioned table sync
in worker_post_error_processing(), because start_apply()
also uses the function.


> 
> ---
> +
> +# Confirm that we have finished the table sync.
> +is( $node_subscriber->safe_psql(
> +                'postgres', qq(SELECT MAX(i), COUNT(*) FROM tbl)),
> +        "1|3",
> +        "subscription sub replicated data");
> +
> 
> Can we store the result to a local variable to check? I think it's more consistent
> with other tap tests.
Agreed. Fixed.


Attached the v27. Kindly review the patch.


Best Regards,
    Takamichi Osumi


Attachment

pgsql-hackers by date:

Previous
From: "shiy.fnst@fujitsu.com"
Date:
Subject: RE: Optionally automatically disable logical replication subscriptions on error
Next
From: Zhihong Yu
Date:
Subject: Re: timestamp for query in pg_stat_statements