Re: Parallel Apply - Mailing list pgsql-hackers

From: Amit Kapila
Subject: Re: Parallel Apply
Msg-id: CAA4eK1KbSOcU2FER=F_nd0ghSeHdGeT=4U4n=dJTRPyCM7ezBA@mail.gmail.com
In response to: Re: Parallel Apply  (Dilip Kumar <dilipbalaut@gmail.com>)
List: pgsql-hackers
On Mon, Nov 24, 2025 at 9:56 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Sep 16, 2025 at 3:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Sep 6, 2025 at 10:33 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> > > I suspect this might not be the most performant default strategy and
> > > could frequently cause a performance dip. In general, we use
> > > parallel apply workers on the assumption that applying changes is
> > > much costlier than reading messages and sending them to the workers.
> > >
> > > The current strategy involves the leader picking one transaction for
> > > itself after distributing transactions to all apply workers, assuming
> > > the apply task will take some time to complete. When the leader takes
> > > on an apply task, it becomes a bottleneck for complete parallelism.
> > > This is because it needs to finish applying previous messages before
> > > accepting any new ones. Consequently, even as workers slowly become
> > > free, they won't receive new tasks because the leader is busy applying
> > > its own transaction.
> > >
> > > This type of strategy might be suitable in scenarios where users
> > > cannot supply more workers due to resource limitations. However, on
> > > high-end machines, it is more efficient to let the leader act solely
> > > as a message transmitter and allow the apply workers to handle all
> > > apply tasks. This could be a configurable parameter, determining
> > > whether the leader also participates in applying changes. I believe
> > > this should not be the default strategy; in fact, the default should
> > > be for the leader to act purely as a transmitter.
> > >
> >
> > I see your point, but consider a scenario where we have two pa
> > workers: pa-1 is waiting for some backend on a unique-key insertion,
> > and pa-2 is waiting for pa-1 to complete its transaction, as pa-2 has
> > to perform some change that depends on pa-1's transaction. The leader
> > can then either simply wait until a worker is free to take a third
> > transaction, or apply it itself and move on to the next change. If we
> > follow the former, it is quite possible that the sender fills the
> > network queue and simply times out.
>
> Sorry, I took a while to come back to this. I understand your point
> and agree that it's a valid concern. However, I question whether
> limiting this to a single choice is the optimal solution. The core
> issue involves two distinct roles: work distribution and applying
> changes. Work distribution is handled exclusively by the leader, while
> any worker can apply the changes. This is essentially a
> single-producer, multiple-consumer problem.
>
> While it might seem efficient for the producer (leader) to assist
> consumers (workers) when there's a limited number of consumers, I
> believe this isn't the best design. In such scenarios, it's generally
> better to allow the producer to focus solely on its primary task,
> unless there's a severe shortage of processing power.
>
> If computing resources are constrained, allowing the producer to join
> the consumers in applying changes is acceptable. However, if
> sufficient processing power is available, the producer should ideally
> be left to its own duties. The question then becomes: how do we make
> this decision?
>
> My suggestion is to make this a configurable parameter. Users could
> then decide whether the leader participates in applying changes.
>
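
To make the single-producer/multiple-consumer shape of the problem
concrete, here is a minimal pthreads sketch. It is not the actual
apply-worker code; everything in it, including the leader_also_applies
knob standing in for the proposed parameter, is invented for
illustration. The leader distributes transactions into a shared queue,
the workers drain it, and the knob decides what the leader does when
the queue is full:

    #include <pthread.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <unistd.h>

    #define NWORKERS  2
    #define NTXNS     10
    #define QUEUE_LEN 4

    static int queue[QUEUE_LEN];
    static int q_head, q_count;
    static bool done;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;
    static pthread_cond_t nonfull = PTHREAD_COND_INITIALIZER;

    /* hypothetical knob: does the leader also apply changes? */
    static bool leader_also_applies = true;

    static void apply_txn(int xid, const char *who)
    {
        printf("%s applied txn %d\n", who, xid);
        usleep(1000);           /* pretend applying takes a while */
    }

    static void *worker_main(void *arg)
    {
        char name[16];

        snprintf(name, sizeof(name), "worker-%ld", (long) arg);
        for (;;)
        {
            int xid;

            pthread_mutex_lock(&lock);
            while (q_count == 0 && !done)
                pthread_cond_wait(&nonempty, &lock);
            if (q_count == 0)   /* done and queue fully drained */
            {
                pthread_mutex_unlock(&lock);
                return NULL;
            }
            xid = queue[q_head];
            q_head = (q_head + 1) % QUEUE_LEN;
            q_count--;
            pthread_cond_signal(&nonfull);
            pthread_mutex_unlock(&lock);
            apply_txn(xid, name);
        }
    }

    int main(void)
    {
        pthread_t workers[NWORKERS];

        for (long i = 0; i < NWORKERS; i++)
            pthread_create(&workers[i], NULL, worker_main, (void *) i);

        for (int xid = 1; xid <= NTXNS; xid++)
        {
            pthread_mutex_lock(&lock);
            if (q_count == QUEUE_LEN && leader_also_applies)
            {
                /* all workers busy and the queue is full: the
                 * leader applies this transaction itself */
                pthread_mutex_unlock(&lock);
                apply_txn(xid, "leader");
                continue;
            }
            while (q_count == QUEUE_LEN)  /* pure transmitter: wait */
                pthread_cond_wait(&nonfull, &lock);
            queue[(q_head + q_count) % QUEUE_LEN] = xid;
            q_count++;
            pthread_cond_signal(&nonempty);
            pthread_mutex_unlock(&lock);
        }

        pthread_mutex_lock(&lock);
        done = true;
        pthread_cond_broadcast(&nonempty);
        pthread_mutex_unlock(&lock);
        for (int i = 0; i < NWORKERS; i++)
            pthread_join(workers[i], NULL);
        return 0;
    }

With leader_also_applies set to false, the leader only ever blocks on
the queue, which is the pure-transmitter default being argued for
above; with it set to true, you get the hybrid behaviour.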

We could do this, but another possibility is that the leader
distributes up to some threshold of pending transactions (say 5 or 10)
to each of the workers, and only if no worker is still available does
it perform the task by itself. I think this would avoid the system
performing poorly when the existing workers are waiting on each other
and/or on a backend to finish the current transaction. Having said
that, I think this can be done as a separate optimization patch as
well.
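
A rough sketch of that threshold idea, with the caveat that none of
this is actual code (pa_worker, MAX_PENDING_PER_WORKER, and
choose_worker are all invented), just the selection logic the leader
would run per transaction:

    #include <stdio.h>

    #define NWORKERS               2
    #define MAX_PENDING_PER_WORKER 5   /* the "say 5 or 10" threshold */

    typedef struct pa_worker
    {
        int pending;    /* transactions queued but not yet applied */
    } pa_worker;

    static pa_worker workers[NWORKERS];

    /*
     * Pick the least-loaded worker that is still under the threshold,
     * or return -1 to tell the leader to apply the transaction itself.
     */
    static int choose_worker(void)
    {
        int best = -1;

        for (int i = 0; i < NWORKERS; i++)
        {
            if (workers[i].pending >= MAX_PENDING_PER_WORKER)
                continue;
            if (best < 0 || workers[i].pending < workers[best].pending)
                best = i;
        }
        return best;
    }

    int main(void)
    {
        /*
         * Simulate two stuck workers: nothing drains their queues, so
         * once both reach the threshold the leader starts applying
         * transactions itself instead of stalling the stream.
         */
        for (int xid = 1; xid <= 12; xid++)
        {
            int w = choose_worker();

            if (w >= 0)
            {
                workers[w].pending++;
                printf("txn %d -> worker %d (pending %d)\n",
                       xid, w, workers[w].pending);
            }
            else
                printf("txn %d -> applied by leader\n", xid);
        }
        return 0;
    }

The fallback case (-1) only triggers once every worker already has the
threshold number of transactions pending, so in the common case the
leader stays a pure transmitter.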

--
With Regards,
Amit Kapila.


