Thread: Question Regarding Merge Append and Parallel Execution of Index Scans on Partitioned Table
Question Regarding Merge Append and Parallel Execution of Index Scans on Partitioned Table
I had a question regarding the execution of the following query plan on a partitioned table with vector similarity search:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=398.95..420.97 rows=100 width=12) (actual time=1.750..1.912 rows=100 loops=1)
Output: vector_items.id, ((vector_items.embedding <-> '[0.5,0.5,0.5,0.5,0.5]'::vector(5)))
Buffers: shared hit=2130
-> Merge Append (cost=398.95..22420.04 rows=100000 width=12) (actual time=1.749..1.899 rows=100 loops=1)
Sort Key: ((vector_items.embedding <-> '[0.5,0.5,0.5,0.5,0.5]'::vector(5)))
Buffers: shared hit=2130
-> Index Scan using vector_items_p0_embedding_idx on public.vector_items_p0 vector_items_1 (cost=99.65..5250.52 rows=25126 width=12) (actual time=0.461..0.495 rows=26 loops=1)
Output: vector_items_1.id, (vector_items_1.embedding <-> '[0.5,0.5,0.5,0.5,0.5]'::vector(5))
Order By: (vector_items_1.embedding <-> '[0.5,0.5,0.5,0.5,0.5]'::vector(5))
Buffers: shared hit=525
-> Index Scan using vector_items_p1_embedding_idx on public.vector_items_p1 vector_items_2 (cost=99.68..5223.56 rows=24978 width=12) (actual time=0.420..0.460 rows=29 loops=1)
Output: vector_items_2.id, (vector_items_2.embedding <-> '[0.5,0.5,0.5,0.5,0.5]'::vector(5))
Order By: (vector_items_2.embedding <-> '[0.5,0.5,0.5,0.5,0.5]'::vector(5))
Buffers: shared hit=494
-> Index Scan using vector_items_p2_embedding_idx on public.vector_items_p2 vector_items_3 (cost=99.80..5227.42 rows=24971 width=12) (actual time=0.422..0.454 rows=27 loops=1)
Output: vector_items_3.id, (vector_items_3.embedding <-> '[0.5,0.5,0.5,0.5,0.5]'::vector(5))
Order By: (vector_items_3.embedding <-> '[0.5,0.5,0.5,0.5,0.5]'::vector(5))
Buffers: shared hit=550
-> Index Scan using vector_items_p3_embedding_idx on public.vector_items_p3 vector_items_4 (cost=99.77..5218.50 rows=24925 width=12) (actual time=0.442..0.470 rows=21 loops=1)
Output: vector_items_4.id, (vector_items_4.embedding <-> '[0.5,0.5,0.5,0.5,0.5]'::vector(5))
Order By: (vector_items_4.embedding <-> '[0.5,0.5,0.5,0.5,0.5]'::vector(5))
Buffers: shared hit=561
Planning:
Buffers: shared hit=38
Planning Time: 0.384 ms
Execution Time: 1.975 ms
My main question is:
Are these Index Scans executed sequentially (one after the other as the
Merge Append
requests tuples)?Or are they possibly executed in parallel, in advance, or concurrently in some way?
I understand that Merge Append
requires the first tuple from all child nodes before producing its own first tuple, which can make parallelism difficult. But given that:
Each child index scan already returns sorted output
There are more workers available than the number of partitions
... wouldn't it theoretically be possible to execute these index scans in parallel (one per worker), with the Merge Append
node just merging pre-sorted streams?
Would appreciate any insight into how execution and planning behave in such a case, and whether there are ways to influence or improve this behavior.
Re: Question Regarding Merge Append and Parallel Execution of Index Scans on Partitioned Table
On Thu, 5 Jun 2025 at 07:31, Ayush Vatsa <ayushvatsa1810@gmail.com> wrote: > Are these Index Scans executed sequentially (one after the other as the Merge Append requests tuples)? It's a fairly simple answer: Merge Append does not support parallelism. > Or are they possibly executed in parallel, in advance, or concurrently in some way? It's probably not impossible to do something in a future version of PostgreSQL. We have Gather Merge, which handles executing some sub-nodes and making sure the results get output in the correct order, so it doesn't seem impossible that something could be done for Merge Append. It just does not exist yet. David
Re: Question Regarding Merge Append and Parallel Execution of Index Scans on Partitioned Table
Thanks for the confirmation.
Re: Question Regarding Merge Append and Parallel Execution of Index Scans on Partitioned Table
On Fri, 6 Jun 2025 at 01:47, Ayush Vatsa <ayushvatsa1810@gmail.com> wrote: > A small follow-up question - Gather merge won't gather and merge the > output from child in sorted order, but will always need an explicit Sort > node beneath it to do so. Correct? Incorrect. The input node to the Gather Merge needs to be sorted by something, and the output of the Gather Merge will be sorted by the same thing. Gather Merge handles putting the tuples that were gathered by parallel workers back into the order they're meant to be in. The Gather Merge's input node could be anything that provides sorted output. Index Scan, Merge Join, Nested Loop, Group Aggregate, etc. Have a look at [1] David [1] https://github.com/postgres/postgres/blob/master/src/test/regress/expected/select_parallel.out#L713
Re: Question Regarding Merge Append and Parallel Execution of Index Scans on Partitioned Table
> something, and the output of the Gather Merge will be sorted by the
> same thing.
Re: Question Regarding Merge Append and Parallel Execution of Index Scans on Partitioned Table
On Fri, 6 Jun 2025 at 07:49, Ayush Vatsa <ayushvatsa1810@gmail.com> wrote: > That said, I’m wondering if this might not be necessary, given that > Gather Merge already accomplishes similar functionality. Would > love to hear your thoughts on whether there’s a distinct advantage to > adding parallelism to Merge Append or if Gather Merge sufficiently > covers all the use cases. Gather Merge supports only a single subnode. Merge Append supports any number. David