
Re: Parallel Apply - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Parallel Apply
Date	August 15 09:44:51
Msg-id	CAA4eK1JO=APS8qNrqVh5FAEWx97rZhgPZQSVtDKNZ1O-tjeazw@mail.gmail.com
In response to	Re: Parallel Apply (Bruce Momjian <bruce@momjian.us>)
List	pgsql-hackers


On Wed, Aug 13, 2025 at 8:57 PM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Wed, Aug 13, 2025 at 09:50:27AM +0530, Amit Kapila wrote:
> > On Tue, Aug 12, 2025 at 10:40 PM Bruce Momjian <bruce@momjian.us> wrote:
> > > > Currently, PostgreSQL supports parallel apply only for large streaming
> > > > transactions (streaming=parallel). This proposal aims to extend
> > > > parallelism to non-streaming transactions, thereby improving
> > > > replication performance in workloads dominated by smaller, frequent
> > > > transactions.
> > >
> > > I thought the approach for improving WAL apply speed, for both binary
> > > and logical, was pipelining:
> > >
> > >         https://en.wikipedia.org/wiki/Instruction_pipelining
> > >
> > > rather than trying to do all the steps in parallel.
> > >
> >
> > It is not clear to me how the speed for a mix of dependent and
> > independent transactions can be improved using the technique you
> > shared as we still need to follow the commit order for dependent
> > transactions. Can you please elaborate more on the high-level idea of
> > how this technique can be used to improve speed for applying logical
> > WAL records?
>
> This blog post from February I think has some good ideas for binary
> replication pipelining:
>
>         https://www.cybertec-postgresql.com/en/end-of-the-road-for-postgresql-streaming-replication/
>
>         Surprisingly, what could be considered the actual replay work
>         seems to be a minority of the total workload.
>

This is the biggest difference between physical and logical WAL apply.
In the case of logical WAL, the actual replay is the majority of the
work. We don't need to read WAL, decode it, or find/pin the
appropriate pages to apply changes. You can consider it almost
equivalent to how the primary receives insert/update/delete from the
user. So, firstly, the idea shared in the blog is not applicable to
logical replication, and even if we try to somehow map it to logical
apply, I don't see how or why it would match the speed of applying
with multiple workers. Also, note that dependency calculation is not
as tricky for logical replication, as we can easily retrieve such
information from logical WAL records in most cases.
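
To illustrate the last point, here is a minimal sketch of how dependencies
between transactions could be derived from the changes they carry. It is
only an illustration, not the algorithm from the proposed patch: the
function name find_dependencies, the (table, key) representation of a
change, and the transaction ids are all hypothetical, and the key is
assumed to be something like the replica identity of the changed row.

```python
from typing import Dict, Hashable, List, Set, Tuple

# Hypothetical representation of one row change: (table name, row key).
# The key is assumed to identify the row, e.g. its replica identity.
Change = Tuple[str, Hashable]


def find_dependencies(
    txns: List[Tuple[int, List[Change]]],
) -> Dict[int, Set[int]]:
    """Map each transaction id to the earlier transactions it depends on.

    `txns` must be given in commit order. A transaction depends on the
    most recent earlier transaction that touched any of the same rows.
    """
    last_writer: Dict[Change, int] = {}  # row -> last xid that changed it
    deps: Dict[int, Set[int]] = {}
    for xid, changes in txns:
        deps[xid] = set()
        for change in changes:
            prev = last_writer.get(change)
            if prev is not None and prev != xid:
                deps[xid].add(prev)
            last_writer[change] = xid
    return deps


if __name__ == "__main__":
    txns = [
        (100, [("accounts", 1)]),
        (101, [("accounts", 2)]),
        (102, [("accounts", 1), ("orders", 7)]),
    ]
    for xid, waits_for in find_dependencies(txns).items():
        print(xid, sorted(waits_for))
```

In this sketch, transactions 100 and 101 touch different rows and could be
applied by different workers, while 102 would have to wait for 100 to
commit first. A real implementation would also have to consider things
this sketch ignores, such as foreign keys, unique indexes, and tables
without a usable key.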

--
With Regards,
Amit Kapila.
