Re: Parallel Apply - Mailing list pgsql-hackers

From wenhui qiu
Subject Re: Parallel Apply
Msg-id CAGjGUAJS_qR0O9bgg0xJkoUntxRKJ1W5LoujoGXqYMoAvi4btQ@mail.gmail.com
In response to Re: Parallel Apply  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel Apply
List pgsql-hackers
Hi 
> 1) The way the patch determines dependencies seems to be the "writeset"
> approach from other replication systems (e.g. MySQL does that). Maybe we
> should stick to the same naming?

> OK, I did not research the design in MySQL in detail but will try to analyze it.
I have some documents on MySQL's parallel apply of binlog events. Note that from MySQL 8.4 onward, only the writeset mode is available. In scenarios with a primary key or unique key, the replica does not replay transactions in order, but the data is eventually consistent.
https://dev.mysql.com/worklog/task/?id=9556
https://dev.mysql.com/blog-archive/improving-the-parallel-applier-with-writeset-based-dependency-tracking/
https://medium.com/airtable-eng/optimizing-mysql-replication-lag-with-parallel-replication-and-writeset-based-dependency-tracking-1fc405cf023c
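
To make the "writeset" idea concrete, here is a minimal sketch of how such dependency tracking can work. It is only an illustration under stated assumptions, not the patch's code and not MySQL's implementation: the names `Txn` and `WritesetTracker` are hypothetical, integer sequence numbers stand in for commit positions, and each key is a plain (table, key value) tuple rather than a hash.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    seq: int          # commit sequence number on the source (assumed, stands in for an LSN)
    keys: frozenset   # (table, primary/unique key value) pairs written by the txn


class WritesetTracker:
    """Toy dependency tracker: remembers the last writer of each key."""

    def __init__(self):
        self.last_writer = {}  # key -> seq of the last txn that wrote it

    def add(self, txn):
        """Return the seq of the newest earlier txn that touched any of the
        same keys, or 0 if the txn conflicts with nothing seen so far."""
        depends_on = max(
            (self.last_writer.get(k, 0) for k in txn.keys), default=0
        )
        for k in txn.keys:
            self.last_writer[k] = txn.seq
        return depends_on


tracker = WritesetTracker()
txns = [
    Txn(1, frozenset({("accounts", 10)})),
    Txn(2, frozenset({("accounts", 20)})),
    Txn(3, frozenset({("accounts", 10), ("accounts", 30)})),
]
# Map each txn to the txn it must wait for (0 means "no dependency").
deps = {t.seq: tracker.add(t) for t in txns}
```

In this toy run, transactions 1 and 2 touch different keys and could be applied in parallel, while transaction 3 writes the same key as transaction 1 and has to wait for it.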


Thanks 

On Thu, Nov 20, 2025 at 5:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Nov 20, 2025 at 3:00 AM Tomas Vondra <tomas@vondra.me> wrote:
>
> Hello Kuroda-san,
>
> On 11/18/25 12:00, Hayato Kuroda (Fujitsu) wrote:
> > Dear Amit,
> >
> >> It seems you haven't sent the patch that preserves commit order or the
> >> commit message of the attached patch is wrong. I think the first patch
> >> in series should be the one that preserves commit order and then we
> >> can build a patch that tracks dependencies and allows parallelization
> >> without preserving commit order.
> >
> > I think I attached the correct file. Since we are trying to preserve
> > the commit order by default, everything was merged into one patch.
>
> I agree the goal should be preserving the commit order, unless someone
> can demonstrate (a) clear performance benefits and (b) correctness. It's
> not clear to me how that would deal e.g. with crashes, where some of the
> "future" replicated transactions committed.
>

Yeah, the key challenge in not preserving commit order is that future
transactions may already have been applied while some earlier
transactions were still in the apply phase when the crash happens. With
the current replication progress tracking scheme, we won't be able to
apply the transactions that were still in-progress when the crash
happened. However, I came up with a scheme to change the replication
progress tracking mechanism to allow out-of-order commits during
apply. See [1] (Replication Progress Tracking). Anyway, as discussed
in this thread, it is better to keep that as optional non-default
behavior, so we want to focus first on preserving the commit-order
part.
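
The crash problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions and does not reproduce the scheme proposed in [1]: the class `ProgressTracker` and its methods are hypothetical, and small integers stand in for commit positions. It shows why a single "last committed" position is not enough once commits can happen out of order, and one possible way to record the extra information.

```python
class ProgressTracker:
    """Toy progress record for out-of-order commits (hypothetical design)."""

    def __init__(self):
        self.low_water = 0   # every txn with seq <= low_water is committed
        self.ahead = set()   # committed txns with seq > low_water

    def commit(self, seq):
        self.ahead.add(seq)
        # Advance the low-water mark across any now-contiguous prefix.
        while self.low_water + 1 in self.ahead:
            self.low_water += 1
            self.ahead.remove(self.low_water)

    def must_reapply(self, seq):
        """After a crash, replay only txns not recorded as committed."""
        return seq > self.low_water and seq not in self.ahead


progress = ProgressTracker()
for seq in (1, 3, 4):        # txn 2 is still being applied when the crash hits
    progress.commit(seq)

# A single progress position would be 4 here, so a restart from it would
# never re-send txn 2. Tracking the gap explicitly keeps txn 2 replayable.
naive_restart_point = max({progress.low_water} | progress.ahead)
to_replay = [s for s in range(1, 6) if progress.must_reapply(s)]
```

Here the naive restart point is 4, whereas the gap-aware record asks for transactions 2 and 5 and skips 3 and 4, which were already committed before the crash.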

Thanks for paying attention, your comments/suggestions are helpful.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BSEus_6vQay9TF_r4ow%2BE-Q7LYNLfsD78HaOsLSgppxQ%40mail.gmail.com

--
With Regards,
Amit Kapila

