Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers
| From | Amit Kapila |
|---|---|
| Subject | Re: Perform streaming logical transactions by background workers and parallel apply |
| Msg-id | CAA4eK1LbKORmo3n1iFV+qKmeiuHvvn4U2i9KGfg11b-QE5AUHQ@mail.gmail.com |
| In response to | Re: Perform streaming logical transactions by background workers and parallel apply (Dilip Kumar <dilipbalaut@gmail.com>) |
| List | pgsql-hackers |
On Tue, Dec 27, 2022 at 10:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 27, 2022 at 9:15 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 26, 2022 at 7:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > In the commit message, there is a statement like this:
> > >
> > > "However, if the leader apply worker times out while attempting to
> > > send a message to the parallel apply worker, it will switch to
> > > "partial serialize" mode - in this mode the leader serializes all
> > > remaining changes to a file and notifies the parallel apply workers
> > > to read and apply them at the end of the transaction."
> > >
> > > I think it is a good idea to serialize the changes to a file in this
> > > case to avoid deadlocks, but why does the parallel worker need to
> > > wait until the transaction commits to read the file? I mean, we
> > > could switch to the serialize state and make the parallel worker
> > > pull changes from the file, and once the parallel worker has caught
> > > up with the changes, it could change the state back to "shared
> > > memory", so the apply worker can again start sending through shared
> > > memory.
> > >
> > > Streaming transactions are generally large, and it is possible that
> > > the shared memory queue gets full because of a lot of changes for a
> > > particular transaction; but later, when the load shifts to other
> > > transactions, it would be quite common for the worker to catch up
> > > with the changes, and then it would be better to again take
> > > advantage of using memory. Otherwise, in this case, we are just
> > > wasting resources (the worker and the shared memory queue) while
> > > still writing to the file.
> > >
> >
> > Note that there is a certain threshold timeout for which we wait
> > before switching to serialize mode, and normally it happens only
> > when the PA starts waiting on some lock acquired by a backend.
> > Now, apart from that, even if we decide to switch modes, the current
> > BufFile mechanism doesn't have a good way to support it. It doesn't
> > allow two processes to open the same BufFile at the same time, which
> > means we would need to maintain multiple files to achieve a mode
> > where we can switch back from serialize mode. We cannot let the LA
> > wait for the PA to close the file, as that could introduce another
> > kind of deadlock. For details, see the discussion in the email [1].
> > The other problem is that we have no way to deal with partially sent
> > data via a shared memory queue. Say, if we time out while sending
> > the data, we would have to resend the same message until it
> > succeeds, which will be tricky because we can't keep retrying as
> > that can lead to a deadlock. I think if we try to build this new
> > mode, it will be a lot of effort without equivalent returns. In
> > common cases, we didn't see a timeout and switch to serialize mode.
> > It happens mostly when the PA starts to wait for a lock acquired by
> > another backend, or when the machine is too slow to keep up with the
> > number of parallel apply workers. So, it doesn't seem worth adding
> > more complexity to the first version, but we don't rule out the
> > possibility of doing it in the future if we really see such cases
> > are common.
> >
> > [1] - https://www.postgresql.org/message-id/CAD21AoDScLvLT8JBfu5WaGCPQs_qhxsybMT%2BsMXJ%3DQrDMTyr9w%40mail.gmail.com
>
> Okay, I see. And once we change to serialize mode, we can't release
> the worker either, because we have already applied partial changes
> under some transaction from a PA, so we cannot apply the remainder
> from the LA. I understand it might require a lot of complex design
> to change it back to parallel apply mode, but my only worry is that
> in such cases we will be holding on to the parallel worker just to
> wait until commit to read from the spool file. But as you said, it
> should not be a very common case, so maybe this is fine.
Right, and as said previously, if required (which is not clear at this
stage), we can develop it in a later version as well.

--
With Regards,
Amit Kapila.