Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Perform streaming logical transactions by background workers and parallel apply
Msg-id CAA4eK1KXB4-YnUNzduv3hOo3QhsdcE8P8Bg7tSbOvKzjkeoEHg@mail.gmail.com
In response to Re: Perform streaming logical transactions by background workers and parallel apply  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Perform streaming logical transactions by background workers and parallel apply  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Mon, Dec 26, 2022 at 7:35 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> In the commit message, there is a statement like this
>
> "However, if the leader apply worker times out while attempting to
> send a message to the parallel apply worker, it will switch to
> "partial serialize" mode - in this mode the leader serializes all
> remaining changes to a file and notifies the parallel apply workers
> to read and apply them at the end of the transaction."
>
> I think it is a good idea to serialize the changes to the file in this
> case to avoid deadlocks, but why does the parallel worker need to wait
> until the transaction commits to read the file?  I mean we can
> switch to the serialize state and make the parallel worker pull changes
> from the file, and if the parallel worker has caught up with the
> changes then it can again change the state to "shared memory" and now
> the apply worker can again start sending through shared memory.
>
> I think generally streaming transactions are large and it is possible
> that the shared memory queue gets full because of a lot of changes for
> a particular transaction, but later, when the load switches to other
> transactions, it would be quite common for the worker to catch up
> with the changes, and then it is better to again take advantage of
> using memory.  Otherwise, in this case, we are just wasting resources
> (worker/shared memory queue) while still writing to the file.
>

Note that we wait for a certain timeout threshold before switching to
serialize mode, and normally that happens only when the parallel apply
worker (PA) starts waiting on a lock acquired by another backend.
Apart from that, even if we decided to switch modes back and forth,
the current BufFile mechanism doesn't offer a good way to do it. It
doesn't allow two processes to open the same BufFile at the same time,
which means we would need to maintain multiple files to be able to
switch back from serialize mode. We cannot let the leader apply worker
(LA) wait for PA to close the file, as that could introduce another
kind of deadlock. For details, see the discussion in the email [1].

The other problem is that we have no way to deal with data that was
only partially sent via the shared memory queue. Say we time out while
sending the data: we have to resend the same message until it
succeeds, which is tricky because we can't keep retrying, as that can
lead to deadlock.

I think that building this new mode would take a lot of effort without
equivalent returns. In common cases, we didn't see the leader time out
and switch to serialize mode. It happens mostly when PA starts to wait
for a lock acquired by another backend, or when the machine is too
slow to cope with the number of parallel apply workers. So it doesn't
seem worth adding more complexity to the first version, but we don't
rule out doing this in the future if we see that such cases are
common.
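
To make the one-way switch concrete, here is a toy Python model of the
behaviour described above. It is only a sketch and not PostgreSQL
code: the class, the function, and all parameter names are made up for
illustration, a queue.Queue stands in for the shared memory queue, and
a plain list stands in for the BufFile. It also does not model a
partially sent message, which is one of the problems mentioned above.

```python
import queue


class LeaderApplySketch:
    """Toy leader apply worker (LA) feeding one parallel apply worker (PA).

    All names here are hypothetical; this is not the PostgreSQL API.
    """

    def __init__(self, queue_size, send_timeout):
        # Bounded queue standing in for the shared memory queue.
        self.shm_queue = queue.Queue(maxsize=queue_size)
        # List standing in for the serialized changes file (BufFile).
        self.spill_file = []
        self.send_timeout = send_timeout
        self.partial_serialize = False

    def send_change(self, change):
        """Send one change; return "queue" or "file" to show where it went."""
        if not self.partial_serialize:
            try:
                # Wait up to send_timeout seconds for free space in the queue.
                self.shm_queue.put(change, timeout=self.send_timeout)
                return "queue"
            except queue.Full:
                # Timed out: switch modes for the rest of the transaction.
                # There is no switch back in this model.
                self.partial_serialize = True
        # In partial serialize mode every remaining change goes to the file.
        self.spill_file.append(change)
        return "file"


def apply_transaction(leader):
    """Toy PA: apply what is in the queue, then the file at transaction end."""
    applied = []
    while not leader.shm_queue.empty():
        applied.append(leader.shm_queue.get_nowait())
    # The file is read only once the leader signals the end of the transaction.
    applied.extend(leader.spill_file)
    return applied
```

With a queue of size 2 and a PA that is not consuming (as if it were
blocked on a lock), the first two changes go through the queue and all
later ones go to the file, yet the PA still applies them in their
original order at the end of the transaction. Switching back to the
queue would additionally require the PA and LA to share the file while
it is still being written, which is the part BufFile does not support.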

[1] - https://www.postgresql.org/message-id/CAD21AoDScLvLT8JBfu5WaGCPQs_qhxsybMT%2BsMXJ%3DQrDMTyr9w%40mail.gmail.com

-- 
With Regards,
Amit Kapila.


