Re: Parallel Apply - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel Apply
Msg-id CAA4eK1+Qpvs65PsFNbewGW+uEcF5VUFo5jqGgQOA1YV317VLFw@mail.gmail.com
In response to Re: Parallel Apply  (Andrei Lepikhov <lepihov@gmail.com>)
List pgsql-hackers
On Thu, Dec 18, 2025 at 2:14 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>
> On 18/12/25 07:44, Hayato Kuroda (Fujitsu) wrote:
> > Dear Andrei,
> >
> >>> I have been spending time benchmarking the patch set. Here is an
> >>> updated report.
> >>>
> >> I apologise if my question is off the mark, but what about asynchronous
> >> replication? Does this method help to reduce lag?
> >>
> >> My case is a replica located far from the main instance, so there is an
> >> inevitable lag. Do your benchmarks provide any insight into the lag
> >> reduction?
> >
> > Yes, ideally parallel apply can reduce the lag, but note that it only takes
> > effect after changes have reached the subscriber. It may not be so effective
> > if the lag is caused by the network. If your transactions are large and you
> > did not enable the streaming option, changing it to 'on' or 'parallel' can
> > improve the lag, because it allows changes to be replicated before huge
> > transactions are committed.
>
> Sorry if I was inaccurate. I want to understand the scope of this
> feature: what benefit does the code provide over current master in the
> case of async LR? Of course, enabling streaming and parallel apply is a
> prerequisite - without these settings, your code does not kick in, does
> it?
>
> Let's put transaction sizes aside - they are usually hard to predict. We
> could think about a mix, but it would be enough to benchmark two corner
> cases - very short (single row) and long (let's say 10% of a table)
> transactions - to be sure we have no degradation.
>
> I just wonder if the main use case for this approach is synchronous
> commit and a good-enough network. Is that correct?
>

It should help async workloads as well; the key criterion is that the
apply worker is not able to keep up with the load from the publisher.
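
One way to see whether that is the case is to watch the apply worker's
position on the subscriber, roughly like below (based on the documented
pg_stat_subscription columns; the backlog expression is only illustrative):

  SELECT subname,
         received_lsn,
         latest_end_lsn,
         pg_wal_lsn_diff(received_lsn, latest_end_lsn) AS apply_backlog_bytes,
         last_msg_receipt_time
  FROM pg_stat_subscription
  WHERE relid IS NULL;  -- leader apply worker(s), not tablesync workers

If received_lsn keeps running ahead of latest_end_lsn under load, the apply
side is not keeping up with what the walsender delivers.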

> >
> >> Or is the WALsender process, which decodes WAL records from a hundred
> >> actively committing backends, the bottleneck here?
> >
> > Can you clarify your use case a bit more? E.g., how many instances
> > subscribe to changes from the same publisher? The cheat sheet [1] may be
> > helpful for identifying the bottleneck.
>
> I have two cases in mind (for simplicity, let's imagine we have only one
> publisher and one subscriber):
>
> 1. We have a low-latency network. If we add more and more load to the
> main instance, which process will become the bottleneck first: the
> walsender or the subscriber?
>

Ideally, it should be the subscriber, because it has to do more work
w.r.t. applying the changes. So, the proposed feature should help in
these cases.
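
Just to illustrate how one could tell the two apart during such a test, a
query like the following on the publisher should show where the backlog
accumulates (only a sketch using the documented pg_stat_replication
columns):

  SELECT application_name,
         pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn) AS decode_send_backlog_bytes,
         pg_wal_lsn_diff(sent_lsn, replay_lsn)           AS subscriber_apply_backlog_bytes,
         replay_lag
  FROM pg_stat_replication;

A growing gap between pg_current_wal_lsn() and sent_lsn would point at
decoding/sending, whereas a growing gap between sent_lsn and replay_lsn
would point at the apply side.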

> 2. We have a stable load, and the walsender copes with the WAL decoding
> and fills the output socket with transactions. If the latency gets worse
> (a geographically distributed configuration), can we profit from the new
> parallel apply changes, provided the network bandwidth is wide enough?
>

I think so. However, it would be helpful if you could measure
performance in such cases, either now or once the patch is in a bit more
stabilized shape after some cycles of review.
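
For reference, the knobs that already exist for parallel apply of streamed
transactions can be set like this on the subscriber (the subscription name
and worker count below are just placeholders; the GUC should take effect
on a reload):

  ALTER SUBSCRIPTION mysub SET (streaming = parallel);
  ALTER SYSTEM SET max_parallel_apply_workers_per_subscription = 4;
  SELECT pg_reload_conf();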

--
With Regards,
Amit Kapila.


