Re: Support loser tree for k-way merge - Mailing list pgsql-hackers

From	John Naylor
Subject	Re: Support loser tree for k-way merge
Date	December 4 08:00:10
Msg-id	CANWCAZYKgc4xFbBXtePw_XY+a+Ao_-CxZ+Z8YWSrK2B1HqtdWg@mail.gmail.com
In response to	Re: Support loser tree for k-way merge (Sami Imseih <samimseih@gmail.com>)
Responses	Re: Support loser tree for k-way merge
List	pgsql-hackers

On Thu, Dec 4, 2025 at 1:14 AM Sami Imseih <samimseih@gmail.com> wrote:
> Can we drive the decision for what to do based on optimizer
> stats, i.e. n_distinct and row counts? Not sure what the calculation would
> be specifically, but something else to consider.

It has happened multiple times before that someone proposes a change
that makes sorting faster on some inputs, but it turns out to regress
on low-cardinality inputs (I've done it myself). It seems to be pretty
hard not to regress that case. Occasionally the author has proposed
taking optimizer stats into account, and that has been rejected because
cardinality stats are often wildly wrong.

Further, underestimation is far more common than overestimation, in
which case IIUC the planner would just continue to choose the existing
heap method.

> We can still provide the GUC to override the optimizer decisions,
> but at least the optimizer, given up-to-date stats, may get it right most
> of the time.

I don't have much faith that people will properly set a GUC whose
effects depend on the input characteristics and memory settings.

The new method might be a better overall trade-off, but we'd need some
more comprehensive measurements to know what we're dealing with.

--
John Naylor
Amazon Web Services
