Re: Asynchronous MergeAppend - Mailing list pgsql-hackers

From Matheus Alcantara
Subject Re: Asynchronous MergeAppend
Date
Msg-id DHMH23M7UOFS.12W6PUDI1I3NH@gmail.com
Whole thread Raw
In response to Re: Asynchronous MergeAppend  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: Asynchronous MergeAppend
List pgsql-hackers
On Sat Apr 4, 2026 at 11:24 PM -03, Alexander Korotkov wrote:
> I made more work on the patchset.
>
> Patch #1 now considers IncrementalSort as exclusion alongside with
> Sort.  Exclusion check is now on the top of the switch().
> Patch #2 is split into 3 patches: common structures, common sync
> append logic, and common async append logic.
> New structs are now named AppendBase/AppendBaseState, corresponding
> fields are "ab" and "as".
>
> Most importantly I noted that this patchset actually only makes
> initial heap filling asynchronous.  The steady work after that is
> still syncnronous.  Even that it used async infrastructure, it fetched
> tuples from children subplans one-by-one: effectively synchronous but
> paying for asynchronous infrastructure.  I think even with this
> limitation, this patchset is valuable: the startup cost for children
> foreignscans can be high.  But this understanding allowed me to
> significantly simplify the main patch including:
> 1) After initial heap filling, use ExecProcNode() to fetch from children plans.
> 2) Remove ms_has_asyncresults entirely. Async responses store directly
> into ms_slots[] (the existing heap slot array), which serves as both
> the merge state and the "result arrived" indicator via TupIsNull().
> 3) Removed needrequest usage from MergeAppend. Since MergeAppend only
> fires initial requests (via ExecAppendBaseAsyncBegin()) and never
> sends follow-up requests, needrequest tracking is unnecessary.
> ExecMergeAppendAsyncRequest() was eliminated entirely.
> 4)  ExecMergeAppendAsyncGetNext() reduced to a simple wait loop:
> 5)  asyncresults allocation reduced back to nasyncplans.  MergeAppend
> doesn't use it (stores in ms_slots), and Append only needs nasyncplans
> entries for its stack.
>
> Additionally, I made the following changes.
> 1) WAIT_EVENT_MERGE_APPEND_READY wait event instead of extending
> WAIT_EVENT_APPEND_READY.  That should be less confusing for monitoring
> purposes.
> 2) More tests: error handling with broken partition, plan-time
> partition pruning, and run-time partition pruning tests for async
> MergeAppend.
>

Thanks for v17, the split of 0002 into multiple patches seems much
better. Overall I agree with the changes on v17 compared with v16, the
removal of ms_has_asyncresults makes the code better to read and follow.
The separate WAIT_EVENT_MERGE_APPEND_READY for better monitoring is also
good, I've tested some long running queries and I've find the event on
pg_stat_activity. The steady work changes also looks good.

One minor issue on 0002:

+    bool        valid_subplans_identified;    /* is valid_asyncplans valid? */
+    Bitmapset  *valid_subplans;
+    Bitmapset  *valid_asyncplans;    /* valid asynchronous plans indexes */

Should be /* is valid_subplans valid? */

-----

Minor comment on 0005:

+static void
+ExecMergeAppendAsyncGetNext(MergeAppendState *node, int mplan)
+{
+    /*
+     * All initial async requests were fired by ExecAppendBaseAsyncBegin.

Wondering if we should reference ExecMergeAppendAsyncBegin() instead of
ExecAppendBaseAsyncBegin() since this is on nodeMergeAppend, what do you
think?

--
Matheus Alcantara
EDB: https://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Zsolt Parragi
Date:
Subject: Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?
Next
From: Mihail Nikalayeu
Date:
Subject: Re: Adding REPACK [concurrently]