Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From: Amit Kapila
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date:
Msg-id: CAA4eK1LbKORmo3n1iFV+qKmeiuHvvn4U2i9KGfg11b-QE5AUHQ@mail.gmail.com
In response to: Re: Perform streaming logical transactions by background workers and parallel apply  (Dilip Kumar <dilipbalaut@gmail.com>)
List: pgsql-hackers
On Tue, Dec 27, 2022 at 10:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 27, 2022 at 9:15 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 26, 2022 at 7:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > In the commit message, there is a statement like this:
> > >
> > > "However, if the leader apply worker times out while attempting to
> > > send a message to the
> > > parallel apply worker, it will switch to "partial serialize" mode -  in this
> > > mode the leader serializes all remaining changes to a file and notifies the
> > > parallel apply workers to read and apply them at the end of the transaction."
> > >
> > > I think it is a good idea to serialize the changes to the file in this
> > > case to avoid deadlocks, but why does the parallel worker need to wait
> > > till the transaction commits to read the file?  I mean we can switch
> > > to the serialize state and make the parallel worker pull changes from
> > > the file, and if the parallel worker has caught up with the changes
> > > then it can change the state back to "shared memory" and the apply
> > > worker can again start sending through shared memory.
> > >
> > > I think streaming transactions are generally large, and it is possible
> > > that the shared memory queue gets full because of a lot of changes for
> > > a particular transaction.  But later, when the load switches to other
> > > transactions, it would be quite common for the worker to catch up with
> > > the changes, and then it is better to again take advantage of shared
> > > memory.  Otherwise, in this case, we are just wasting resources
> > > (worker/shared memory queue) while still writing to the file.
> > >
> >
> > Note that there is a certain threshold timeout for which we wait
> > before switching to serialize mode, and normally it happens only when
> > PA starts waiting on some lock acquired by a backend. Now, apart
> > from that, even if we decide to switch modes, the current BufFile
> > mechanism doesn't have a good way to support that. It doesn't allow
> > two processes to open the same buffile at the same time, which means
> > we would need to maintain multiple files to support switching back
> > from serialize mode. We cannot let LA wait for PA to close the file,
> > as that could introduce another kind of deadlock. For details, see
> > the discussion in the email [1]. The other problem is that we have no
> > way to deal with partially sent data via a shared memory queue. Say,
> > if we time out while sending the data, we have to resend the same
> > message until it succeeds, which will be tricky because we can't keep
> > retrying as that can lead to a deadlock. I think if we try to build
> > this new mode, it will be a lot of effort without equivalent returns.
> > In common cases, we haven't seen the leader time out and switch to
> > serialize mode. It mostly happens when PA starts to wait for a lock
> > acquired by another backend, or the machine is too slow to keep up
> > with the number of parallel apply workers. So, it doesn't seem worth
> > adding more complexity to the first version, but we don't rule out
> > adding it in the future if such cases turn out to be common.
> >
> > [1] -
https://www.postgresql.org/message-id/CAD21AoDScLvLT8JBfu5WaGCPQs_qhxsybMT%2BsMXJ%3DQrDMTyr9w%40mail.gmail.com
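
To make the fallback concrete, here is a minimal, self-contained C sketch of
the shape of that logic (all names, the stdio spool file, and the simulated
timeout are made up for illustration; the real patch works with the shared
memory queue and the changes file, not stdio): the leader pushes each change
into the parallel worker's queue, and as soon as one send times out it stops
retrying, flips to partial serialize mode, and spools the failed change plus
everything that follows for the worker to read at the end of the transaction.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef enum
{
    MODE_SHM_MQ,            /* changes flow through the shared memory queue */
    MODE_PARTIAL_SERIALIZE  /* remaining changes are spooled to a file */
} ApplyMode;

/*
 * Stand-in for sending one change to the parallel apply worker with a
 * timeout.  Here it simply "times out" after a few sends, to mimic the
 * worker getting stuck behind a lock held by some backend.
 */
static bool
queue_send_with_timeout(const char *change)
{
    static int  sent = 0;

    (void) change;
    return ++sent <= 3;
}

int
main(void)
{
    ApplyMode   mode = MODE_SHM_MQ;
    FILE       *spool = NULL;
    const char *changes[] = {"INSERT 1", "INSERT 2", "UPDATE 3",
                             "DELETE 4", "INSERT 5", "COMMIT"};
    int         nchanges = sizeof(changes) / sizeof(changes[0]);

    for (int i = 0; i < nchanges; i++)
    {
        if (mode == MODE_SHM_MQ)
        {
            if (queue_send_with_timeout(changes[i]))
                continue;       /* delivered via shared memory */

            /*
             * The send timed out.  Instead of retrying (which could
             * deadlock if PA is waiting on a lock), switch to partial
             * serialize mode for the rest of the transaction and spool
             * the failed change and everything after it.
             */
            mode = MODE_PARTIAL_SERIALIZE;
            spool = fopen("txn.spool", "w");
            if (spool == NULL)
                return EXIT_FAILURE;
        }

        fprintf(spool, "%s\n", changes[i]);
    }

    if (spool != NULL)
    {
        fclose(spool);
        /* At end of transaction, tell PA to read and apply txn.spool. */
        printf("notified parallel apply worker to apply spooled changes\n");
    }

    return EXIT_SUCCESS;
}

The point of the one-way flip is that the leader never has to resend a
partially transferred message or wait for the worker to release the file,
which is what keeps the retry/deadlock problems described above out of the
picture.
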
>
> Okay, I see.  And once we switch to serialize mode we can't release
> the worker either, because we have already applied partial changes
> for the transaction through the PA, so we cannot apply the remaining
> changes from the LA.  I understand it might require a lot of complex
> design to switch back to parallel apply mode, but my only worry is
> that in such cases we will be holding on to the parallel worker just
> to wait until commit to read from the spool file.  But as you said,
> it should not be a very common case, so maybe this is fine.
>

Right, and as said previously, if required (which is not clear at this
stage) we can develop it in a later version as well.
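
For completeness, the parallel worker's side of that fallback is roughly the
following, again only an illustrative, self-contained sketch with made-up
names (the real worker applies the spooled changes through the normal apply
path rather than printing them): once the leader signals the end of the
transaction, the worker drains the spool file and applies each serialized
change before finishing.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(void)
{
    FILE   *spool = fopen("txn.spool", "r");
    char    line[256];

    /* Nothing was spooled: the whole transaction arrived via the queue. */
    if (spool == NULL)
        return EXIT_SUCCESS;

    /* Apply every change the leader serialized, then finish the txn. */
    while (fgets(line, sizeof(line), spool) != NULL)
    {
        line[strcspn(line, "\n")] = '\0';
        printf("applying: %s\n", line);     /* stand-in for the real apply */
    }

    fclose(spool);
    return EXIT_SUCCESS;
}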

-- 
With Regards,
Amit Kapila.


