Re: Allow ordered partition scans in more cases - Mailing list pgsql-hackers

From David Rowley
Subject Re: Allow ordered partition scans in more cases
Date
Msg-id CAApHDvohAZLQSW4AiHUKmLGNuHYbi0pves+9_9ik3cAYevc2GQ@mail.gmail.com
Whole thread Raw
In response to Re: Allow ordered partition scans in more cases  (Ronan Dunklau <ronan.dunklau@aiven.io>)
Responses Re: Allow ordered partition scans in more cases  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Thu, 23 Feb 2023 at 02:10, Ronan Dunklau <ronan.dunklau@aiven.io> wrote:
> I haven't looked too deeply into it, but it seems reasonable that the whole
> sort would cost cheaper than individual sorts on partitions + incremental
> sorts, except when the the whole sort would spill to disk much more than the
> incremental ones. I find it quite difficult to reason about what that threshold
> should be, but I managed to find a case which could fit in a test:

Thanks for coming up with that test case.  It's a little disappointing
to see that so many rows had to be added to get the plan to change. I
wonder if it's really worth testing this particular case. ~1800 rows
is a little more significant than I'd have hoped. The buildfarm has a
few dinosaurs that would likely see a noticeable slowdown from that.

What's on my mind now is if turning 1 Sort into N Sorts is a
particularly good idea from a work_mem standpoint. I see that we don't
do tuplesort_end() until executor shutdown, so that would mean that we
could end up using 1 x work_mem per Sort node.  I idly wondered if we
couldn't do tuplesort_end() after spitting out the final tuple when
EXEC_FLAG_REWIND is not set, but that would still mean we could use N
work_mems when EXEC_FLAG_REWIND *is* set. We only really have
visibility of that during execution too, so can't really make a
decision at plan time based on that.

I'm not quite sure if I'm being overly concerned here or not. All it
would take to get a sort per partition today would be to put a
suitable index on just 1 of the partitions. So this isn't exactly a
new problem, it's just making an old problem perhaps a little more
likely.  The problem does also exist for things like partition-wise
joins too for Hash and Merge joins.  Partition-wise joins are disabled
by default, however.

David



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Testing autovacuum wraparound (including failsafe)
Next
From: Tomas Vondra
Date:
Subject: Re: Missing update of all_hasnulls in BRIN opclasses