Re: Slow catchup of 2PC (twophase) transactions on replica in LR - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Date
Msg-id CAA4eK1LnTBBrk1OH20y=0aJSOiE+ziDiLFsqD2yUYOW5FubScA@mail.gmail.com
In response to Re: Slow catchup of 2PC (twophase) transactions on replica in LR  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
On Tue, Jul 30, 2024 at 4:02 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 25 Jul 2024 at 08:39, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jul 24, 2024 at 9:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > > I merged these changes, made a few other cosmetic changes, and pushed the patch.
> > >
> > > There is a CF entry pointing at this thread [1].  Should it be closed?
> > >
> >
> > Yes, closed now. Thanks for the reminder.
>
> I noticed a random failure of the 021_twophase test in my environment:
> [10:37:01.131](0.053s) ok 24 - should be no prepared transactions on subscriber
> error running SQL: 'psql:<stdin>:2: ERROR:  cannot alter two_phase when logical replication worker is still running
> HINT:  Try again after some time.'
>
> We can reproduce the issue by adding a delay in apply_worker_exit, as
> in the attached Reproduce_random_021_twophase_test_failure.patch.
>
> This is happening because the check here is wrong:
> +$node_subscriber->poll_query_until('postgres',
> +   "SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type = 'logical replication worker'"
>
> Here "logical replication worker" should be "logical replication apply worker".
>
> The attached patch makes this change.
>

LGTM.
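
To illustrate why the original check could let the test proceed too early, here is a minimal standalone sketch in Python. It is not part of the test suite or of the attached patch. It assumes, as the correction quoted above implies, that the apply worker appears in pg_stat_activity with backend_type 'logical replication apply worker'; the sample rows and the helper name no_worker_running are hypothetical.

```python
# Hypothetical snapshot of pg_stat_activity.backend_type values on the
# subscriber while the apply worker has not yet exited (illustrative data only).
backend_types = [
    "client backend",
    "logical replication launcher",
    "logical replication apply worker",
]


def no_worker_running(rows, worker_type):
    """Mimic: SELECT count(*) = 0 FROM pg_stat_activity
              WHERE backend_type = <worker_type>"""
    return sum(1 for t in rows if t == worker_type) == 0


# Old check: the string matches no row, so the condition is already true
# even though the apply worker is still present.
old_check = no_worker_running(backend_types, "logical replication worker")

# Corrected check: the string matches the apply worker, so polling would
# keep waiting until that row disappears.
new_check = no_worker_running(backend_types, "logical replication apply worker")

print(old_check)  # True
print(new_check)  # False
```

Under that assumption, the old condition is satisfied immediately, so poll_query_until returns before the worker has exited and the following attempt to alter two_phase can hit the error shown above.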

--
With Regards,
Amit Kapila.


