Re: Parallel Apply - Mailing list pgsql-hackers
| From | wenhui qiu |
|---|---|
| Subject | Re: Parallel Apply |
| Date | |
| Msg-id | CAGjGUAJS_qR0O9bgg0xJkoUntxRKJ1W5LoujoGXqYMoAvi4btQ@mail.gmail.com |
| In response to | Re: Parallel Apply (Amit Kapila <amit.kapila16@gmail.com>) |
| Responses | Re: Parallel Apply |
| List | pgsql-hackers |
Hi
> 1) The way the patch determines dependencies seems to be the "writeset"
> approach from other replication systems (e.g. MySQL does that). Maybe we
> should stick to the same naming?
> OK, I did not research the design in MySQL in detail but will try to analyze it.
I have some documents on how MySQL applies binlog events in parallel. As of MySQL 8.4, only the writeset mode is available. When a primary key or unique key is present, the replica does not replay transactions in order, but the data is eventually consistent.
https://dev.mysql.com/worklog/task/?id=9556
https://dev.mysql.com/blog-archive/improving-the-parallel-applier-with-writeset-based-dependency-tracking/
https://medium.com/airtable-eng/optimizing-mysql-replication-lag-with-parallel-replication-and-writeset-based-dependency-tracking-1fc405cf023c
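To make the writeset idea concrete, here is a minimal sketch, not MySQL's implementation and not the code in the patch under discussion. It assumes each transaction carries a set of (table, key) pairs it modified; the names `Txn` and `assign_dependencies` are hypothetical. A transaction depends on the most recent earlier transaction that wrote any of the same keys, and transactions without such a conflict can be applied in parallel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    seq: int          # commit sequence number on the source (assumed)
    keys: frozenset   # hypothetical writeset: (table, key) pairs modified


def assign_dependencies(txns):
    """Map each txn seq to the latest earlier seq it conflicts with, or None.

    txns must be given in commit order.
    """
    last_writer = {}  # (table, key) -> seq of the most recent txn that wrote it
    deps = {}
    for t in txns:
        conflicts = [last_writer[k] for k in t.keys if k in last_writer]
        deps[t.seq] = max(conflicts) if conflicts else None
        for k in t.keys:
            last_writer[k] = t.seq
    return deps


txns = [
    Txn(1, frozenset({("t", 1)})),
    Txn(2, frozenset({("t", 2)})),            # disjoint from txn 1
    Txn(3, frozenset({("t", 1), ("t", 3)})),  # touches the same row as txn 1
]
deps = assign_dependencies(txns)
```

In this example transactions 1 and 2 have no dependency and could run concurrently, while transaction 3 has to wait for transaction 1.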
Thanks
On Thu, Nov 20, 2025 at 5:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Nov 20, 2025 at 3:00 AM Tomas Vondra <tomas@vondra.me> wrote:
>
> Hello Kuroda-san,
>
> On 11/18/25 12:00, Hayato Kuroda (Fujitsu) wrote:
> > Dear Amit,
> >
> >> It seems you haven't sent the patch that preserves commit order or the
> >> commit message of the attached patch is wrong. I think the first patch
> >> in series should be the one that preserves commit order and then we
> >> can build a patch that tracks dependencies and allows parallelization
> >> without preserving commit order.
> >
> > I think I attached the correct file. Since we are trying to preserve
> > the commit order by default, everything was merged into one patch.
>
> I agree the goal should be preserving the commit order, unless someone
> can demonstrate (a) clear performance benefits and (b) correctness. It's
> not clear to me how that would deal e.g. with crashes, where some of the
> "future" replicated transactions committed.
>
Yeah, the key challenge in not-preserving commit order is that the
future transactions can be applied when some of the previous
transactions were still in the apply phase and the crash happens. With
the current replication progress tracking scheme, we won't be able to
apply the transactions that were still in-progress when the crash
happened. However, I came up with a scheme to change the replication
progress tracking mechanism to allow out-of-order commits during
apply. See [1] (Replication Progress Tracking). Anyway, as discussed
in this thread, it is better to keep that as optional non-default
behavior, so we want to focus first on preserving the commit-order
part.
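The crash problem described above can be illustrated with a small sketch. It assumes, purely for illustration, that progress is recorded as a single "highest applied commit LSN" and that the apply restarts after that point; the function name `skipped_after_crash` is hypothetical, and this does not reproduce the scheme proposed in [1].

```python
def skipped_after_crash(commit_lsns, applied):
    """Return the commit LSNs that a single high-water-mark restart would skip.

    commit_lsns: commit LSNs in source commit order.
    applied: set of commit LSNs already committed on the subscriber at the crash.
    """
    if not applied:
        return []
    progress = max(applied)  # assumed: progress is one "last applied" LSN
    return [lsn for lsn in commit_lsns
            if lsn <= progress and lsn not in applied]


# Out-of-order apply: 300 committed while 200 was still in progress.
out_of_order = skipped_after_crash([100, 200, 300], {100, 300})
# Commit order preserved: nothing earlier than the high-water mark is missing.
in_order = skipped_after_crash([100, 200, 300], {100, 200})
```

With out-of-order commits, the transaction at LSN 200 falls below the recorded progress and would never be re-applied, which is why preserving commit order avoids the issue under this simple tracking model.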
Thanks for paying attention, your comments/suggestions are helpful.
[1] - https://www.postgresql.org/message-id/CAA4eK1%2BSEus_6vQay9TF_r4ow%2BE-Q7LYNLfsD78HaOsLSgppxQ%40mail.gmail.com
--
With Regards,
Amit Kapila