Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id CAD21AoBDLiFHThzfzvrnViTKnsm-pM5YvfTys_96-jBSXpWYqw@mail.gmail.com
Whole thread Raw
In response to Re: Perform streaming logical transactions by background workers and parallel apply  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Perform streaming logical transactions by background workers and parallel apply  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Wed, Oct 12, 2022 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 11, 2022 at 5:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Oct 7, 2022 at 2:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > About your point that having different partition structures for
> > > publisher and subscriber, I don't know how common it will be once we
> > > have DDL replication. Also, the default value of
> > > publish_via_partition_root is false which doesn't seem to indicate
> > > that this is a quite common case.
> >
> > So how can we consider these concurrent issues that could happen only
> > when streaming = 'parallel'? Can we restrict some use cases to avoid
> > the problem or can we have a safeguard against these conflicts?
> >
>
> Yeah, right now the strategy is to disallow parallel apply for such
> cases as you can see in *0003* patch.

Tightening the restrictions could work in some cases but there might
still be coner cases and it could reduce the usability. I'm not really
sure that we can ensure such a deadlock won't happen with the current
restrictions. I think we need something safeguard just in case. For
example, if the leader apply worker is waiting for a lock acquired by
its parallel worker, it cancels the parallel worker's transaction,
commits its transaction, and restarts logical replication. Or the
leader can log the deadlock to let the user know.

>
> > We
> > could find a new problematic scenario in the future and if it happens,
> > logical replication gets stuck, it cannot be resolved only by apply
> > workers themselves.
> >
>
> I think users can change streaming option to on/off and internally the
> parallel apply worker can detect and restart to allow replication to
> proceed. Having said that, I think that would be a bug in the code and
> we should try to fix it. We may need to disable parallel apply in the
> problematic case.
>
> The other ideas that occurred to me in this regard are (a) provide a
> reloption (say parallel_apply) at table level and we can use that to
> bypass various checks like different Unique Key between
> publisher/subscriber, constraints/expressions having mutable
> functions, Foreign Key (when enabled on subscriber), operations on
> Partitioned Table. We can't detect whether those are safe or not
> (primarily because of a different structure in publisher and
> subscriber) so we prohibit parallel apply but if users use this
> option, we can allow it even in those cases.

The parallel apply worker is assigned per transaction, right? If so,
how can we know which tables are modified in the transaction in
advance? and what if two tables whose reloptions are true and false
are modified in the same transaction?

> (b) While enabling the
> parallel option in the subscription, we can try to match all the
> table(s) information of the publisher/subscriber. It will be tricky to
> make this work because say even if match some trigger function name,
> we won't be able to match the function body. The other thing is when
> at a later point the table definition is changed on the subscriber, we
> need to again validate the information between publisher and
> subscriber which I think would be difficult as we would be already in
> between processing some message and getting information from the
> publisher at that stage won't be possible.

Indeed.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Next
From: Aleksander Alekseev
Date:
Subject: Re: Pluggable toaster