Re: MergeAppend could consider sorting cheapest child path - Mailing list pgsql-hackers

From David Rowley
Subject Re: MergeAppend could consider sorting cheapest child path
Date
Msg-id CAApHDvqx2VsjUr-ho+szxjM-=7DBp9HnR3+_SzSTT2Fr+xYKcA@mail.gmail.com
Whole thread Raw
In response to Re: MergeAppend could consider sorting cheapest child path  (Andrei Lepikhov <lepihov@gmail.com>)
List pgsql-hackers
On Wed, 15 Oct 2025 at 22:26, Andrei Lepikhov <lepihov@gmail.com> wrote:
> This patch originated from the practice of how table partitioning can
> severely impact query execution. It is a rare case in our experience
> when all partitions have symmetrical indexes, especially those located
> remotely. People adopt a set of indexes according to the current load
> profile on hot partitions. In fact, it is a typical case when a
> timestamp orders partitions, and most of them are rarely updated.
>
> So, the goal is to use MergeAppend when only a few partitions lack a
> proper index.

hmm... doesn't that already work?

create table hp (a int ) partition by hash(a);
create table hp0 partition of hp for values with(modulus 2, remainder 0);
create table hp1 partition of hp for values with(modulus 2, remainder 1);
create index on hp0(a);

explain (costs off) select * from hp order by a;

 Merge Append
   Sort Key: hp.a
   ->  Index Only Scan using hp0_a_idx on hp0 hp_1
   ->  Sort
         Sort Key: hp_2.a
         ->  Seq Scan on hp1 hp_2

Or is this a case of that you want to also consider Seq Scan on hp0 ->
Sort if it's cheaper than Index Scan on hp0_a_idx just in case that's
enough to make Merge Append cheap enough to beat Append -> Sort?


> The concern about memory consumption makes sense, of course. However, we
> choose to sort based on cost estimations that usually work when the
> optimiser decides between fetching many tuples during an Index Scan,
> compared to only a few tuples to fetch with subsequent sorting.
> Additionally, scan estimation typically yields good predictions
> (compared to JOIN), and I personally estimate the OOM risk to be low.
>
> Additionally, this patch revealed an issue with the cost model: there is
> no significant difference between a single massive Sort and multiple
> sorts followed by MergeAppend. Our experiments show that it is incorrect
> (one Sort operator demonstrates more efficacy) and may be corrected.

Do you mean "no significant difference [in the costings] between"?

Not sure if I follow you here. You've said "one Sort operator
demonstrates more efficacy", do you mean Sort atop of Append is
better? If so, why does the patch try to encourage plans with Merge
Append with many Sorts?

David



pgsql-hackers by date:

Previous
From: jian he
Date:
Subject: Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Next
From: Sadhuprasad Patro
Date:
Subject: Re: Improved TAP tests by replacing sub-optimal uses of ok() with better Test::More functions