Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id CAFiTN-s-mOXbzvOnQOV3KU_=+m3bPb8K3k22SkeDNKbozTaEbQ@mail.gmail.com
In response to RE: Perform streaming logical transactions by background workers and parallel apply  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses Re: Perform streaming logical transactions by background workers and parallel apply
List pgsql-hackers
On Tue, Aug 2, 2022 at 5:16 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Wednesday, July 27, 2022 4:22 PM houzj.fnst@fujitsu.com wrote:
> >
> > On Tuesday, July 26, 2022 5:34 PM Dilip Kumar <dilipbalaut@gmail.com>
> > wrote:
> >
> > > 3.
> > > Why are we restricting parallel apply workers only for the streamed
> > > transactions, because streaming depends upon the size of the logical
> > > decoding work mem so making streaming and parallel apply tightly
> > > coupled seems too restrictive to me.  Do we see some obvious problems
> > > in applying other transactions in parallel?
> >
> > We thought there could be some conflict failure and deadlock if we parallel
> > apply normal transaction which need transaction dependency check[1]. But I
> > will do some more research for this and share the result soon.
>
> After thinking about this, I confirmed that it would be easy to cause deadlock
> error if we don't have additional dependency analysis and COMMIT order preserve
> handling for parallel apply normal transaction.
>
> Because the basic idea to parallel apply normal transaction in the first
> version is that: the main apply worker will receive data from pub and pass them
> to apply bgworker without applying by itself. And only before the apply
> bgworker apply the final COMMIT command, it need to wait for any previous
> transaction to finish to preserve the commit order. It means we could pass the
> next transaction's data to another apply bgworker before the previous
> transaction is committed in the first apply bgworker.
>
> In this approach, we have to do the dependency analysis because it's easy to
> cause a deadlock error when applying DMLs in parallel (see the attachment for
> examples where the deadlock could happen). So, it's a bit different from
> streaming transactions.
>
> We could apply the next transaction only after the first transaction is
> committed, in which case we don't need the dependency analysis, but it would
> not bring a noticeable performance improvement even if we start several apply
> workers to do that, because the actual DMLs are not performed in parallel.
>
> Based on above, we plan to first introduce the patch to perform streaming
> logical transactions by background workers, and then introduce parallel apply
> normal transaction which design is different and need some additional handling.

Yeah, I think that makes sense.  Since streamed transactions are sent
to the standby interleaved, we can take advantage of parallelism, and
along with that we can also avoid the I/O, which will speed things up
as well.
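
To make the commit-order handling described in the quoted text a bit
more concrete, here is a toy sketch in Python.  It is not PostgreSQL
code and not what the patch does internally; the names CommitOrderGate,
apply_transaction and run_parallel_apply are made up for illustration.
Each worker prepares its transaction concurrently, but waits at COMMIT
until every earlier transaction has committed.  It deliberately models
no row locks, so it cannot show the deadlock case that the dependency
analysis is meant to prevent.

```python
import threading


class CommitOrderGate:
    """Hypothetical helper: lets workers commit only in receive order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_to_commit = 0  # sequence number allowed to commit next

    def wait_for_turn(self, seq):
        # Block until all transactions with a smaller sequence number are done.
        with self._cond:
            self._cond.wait_for(lambda: self._next_to_commit == seq)

    def done(self, seq):
        # Mark this transaction committed and wake the waiting workers.
        with self._cond:
            self._next_to_commit = seq + 1
            self._cond.notify_all()


def apply_transaction(seq, changes, gate, table, commit_log):
    # "Apply" phase: may overlap with other workers (trivial in this toy).
    pending = list(changes)
    # "Commit" phase: wait for every previous transaction to finish first.
    gate.wait_for_turn(seq)
    for key, value in pending:
        table[key] = value
    commit_log.append(seq)
    gate.done(seq)


def run_parallel_apply(transactions):
    """Dispatch each transaction to its own worker thread (assumed model)."""
    gate = CommitOrderGate()
    table, commit_log = {}, []
    workers = [
        threading.Thread(
            target=apply_transaction,
            args=(seq, changes, gate, table, commit_log),
        )
        for seq, changes in enumerate(transactions)
    ]
    # Start in reverse to show that start order does not decide commit order.
    for worker in reversed(workers):
        worker.start()
    for worker in workers:
        worker.join()
    return table, commit_log
```

Calling run_parallel_apply() with a list of transactions, each a list
of (key, value) pairs, returns the final table and the order in which
the transactions committed.  The commit order follows the list order
regardless of which worker thread started first.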

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


