Re: Parallel Apply - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Parallel Apply
Msg-id CAFiTN-twfExcQzVs3jhMBpP=VC1jv7J4+OqX8A9LCuJrTCoNcg@mail.gmail.com
In response to RE: Parallel Apply  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
List pgsql-hackers
On Thu, Apr 16, 2026 at 10:29 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Friday, April 17, 2026 12:05 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tuesday, April 14, 2026 9:00 PM Kuroda, Hayato/黒田 隼人
> > <kuroda.hayato@fujitsu.com> wrote:
> > >
> > > Other comments were addressed accordingly, please see attached patch set.
> >
> > I started reviewing patches 0001-0004 myself, aiming to add comments where
> > the design is not straightforward and to identify and fix any clearly incorrect
> > behavior.
> >
> > Here is the updated patch set with the following improvements:
> >
> > * Cosmetic changes in 0001-0004
> > * Additional comments in 0001-0004
> > * Code simplification by merging unnecessary static functions
> > * Removal of function exports left over from the POC version that are no
> >   longer needed
> > * Got rid of XLogRecPtrIsInvalid()
> > * Fixed buggy behavior in partial serialization mode, including:
> >   1) The leader did not serialize the dependency on the last committed
> >      transaction
> >   2) The parallel apply worker could not identify internal messages in
> >      spooled changes
> >   3) An assertion failure in maybe_start_skipping_changes()
> > * Added one test for serialization and restore non-streaming transactions in
> >   0004.
> >
> > Thanks to Kuroda-San for discussing these changes internally with me.

I have started reviewing the design and the patches. A couple of questions/suggestions:

0001:
1. Looking at the commit message and the patch, it's clear what
WORKER_INTERNAL_MSG_RELATION does, but the motivation for it isn't
very clear to me.

2. +/*
+ * Wait for the given transaction to finish.
+ */
+void
+pa_wait_for_depended_transaction(TransactionId xid)
+{
+ elog(DEBUG1, "wait for depended xid %u", xid);
+
+ for (;;)
+ {
+ /* XXX wait until given transaction is finished */
+ }
+
+ elog(DEBUG1, "finish waiting for depended xid %u", xid);
+}

Does that mean the waiting logic isn't implemented yet?

3.
+ if (c == PqReplMsg_WALData)
+ {
+ /*
+ * Ignore statistics fields that have been updated by the
+ * leader apply worker.
+ *
+ * XXX We can avoid sending the statistics fields from the
+ * leader apply worker but for that, it needs to rebuild the
+ * entire message by removing these fields which could be more
+ * work than simply ignoring these fields in the parallel apply
+ * worker.
+ */
+ s.cursor += SIZE_STATS_MESSAGE;

- apply_dispatch(&s);
+ apply_dispatch(&s);
+ }

I could not understand how this change is relevant to patch 0001. This
patch implements two internal messages; why is ignoring the statistics
fields for non-internal messages relevant here?

--
Regards,
Dilip Kumar
Google


