Re: Parallel Apply - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel Apply
Date
Msg-id CAA4eK1JO=APS8qNrqVh5FAEWx97rZhgPZQSVtDKNZ1O-tjeazw@mail.gmail.com
In response to Re: Parallel Apply  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Wed, Aug 13, 2025 at 8:57 PM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Wed, Aug 13, 2025 at 09:50:27AM +0530, Amit Kapila wrote:
> > On Tue, Aug 12, 2025 at 10:40 PM Bruce Momjian <bruce@momjian.us> wrote:
> > > > Currently, PostgreSQL supports parallel apply only for large streaming
> > > > transactions (streaming=parallel). This proposal aims to extend
> > > > parallelism to non-streaming transactions, thereby improving
> > > > replication performance in workloads dominated by smaller, frequent
> > > > transactions.
> > >
> > > I thought the approach for improving WAL apply speed, for both binary
> > > and logical, was pipelining:
> > >
> > >         https://en.wikipedia.org/wiki/Instruction_pipelining
> > >
> > > rather than trying to do all the steps in parallel.
> > >
> >
> > It is not clear to me how the speed for a mix of dependent and
> > independent transactions can be improved using the technique you
> > shared as we still need to follow the commit order for dependent
> > transactions. Can you please elaborate more on the high-level idea of
> > how this technique can be used to improve speed for applying logical
> > WAL records?
>
> This blog post from February I think has some good ideas for binary
> replication pipelining:
>
>         https://www.cybertec-postgresql.com/en/end-of-the-road-for-postgresql-streaming-replication/
>
>         Surprisingly, what could be considered the actual replay work
>         seems to be a minority of the total workload.
>

This is the biggest difference between physical and logical WAL apply.
In the case of logical WAL, the actual replay is the majority of the
work. We don't need to read WAL or decode it or find/pin the
appropriate pages to apply. You can consider logical apply almost
equivalent to how the primary receives insert/update/delete from the
user. So, the idea shared in the blog is not applicable to logical
replication, and even if we try to somehow map it to logical apply, I
don't see how or why it would be able to match the speed of applying
with multiple workers. Also, note that
dependency calculation is not as tricky for logical replication as we
can easily retrieve such information from logical WAL records in most
cases.
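
To illustrate the point about dependency calculation, here is a minimal
sketch in Python. It is not the patch's implementation. It assumes each
change carries the relation name and the replica identity key of the
affected row, and that two transactions are dependent when they touch the
same row. The names Change, Txn and compute_dependencies are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    relation: str   # table the change applies to
    key: tuple      # replica identity key values of the affected row (assumed available)


@dataclass
class Txn:
    xid: int
    changes: list


def compute_dependencies(txns):
    """Given transactions in commit order, return {xid: set of earlier xids
    that must be applied first}. Hypothetical rule: a transaction depends on
    the most recent earlier transaction that touched any of the same rows."""
    last_writer = {}  # (relation, key) -> xid of the latest txn that touched the row
    deps = {}
    for txn in txns:
        waits_for = set()
        for ch in txn.changes:
            prev = last_writer.get((ch.relation, ch.key))
            if prev is not None and prev != txn.xid:
                waits_for.add(prev)
        # Record this transaction as the latest writer of every row it touched.
        for ch in txn.changes:
            last_writer[(ch.relation, ch.key)] = txn.xid
        deps[txn.xid] = waits_for
    return deps


txns = [
    Txn(100, [Change("t", (1,))]),
    Txn(101, [Change("t", (2,))]),
    Txn(102, [Change("t", (1,)), Change("t", (3,))]),
]
deps = compute_dependencies(txns)
```

In this example, transactions 100 and 101 touch different rows and could
be applied by different workers, while 102 has to wait for 100 because
both modify the row with key (1,) in "t". Cases such as foreign keys or
changes without a usable key are not covered by this sketch.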

--
With Regards,
Amit Kapila.


