Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers

From James Coleman
Subject Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date
Msg-id CAAaqYe--jV9pHrX5NRw9R2QGG+nRuPPKguzHhKZwhp6XPOiyuQ@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (James Coleman <jtc331@gmail.com>)
Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
On Thu, Apr 2, 2020 at 8:20 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> Hi,
>
> Thanks, the v52 looks almost ready. I've been looking at the two or
> three things I mentioned, and I have a couple of comments.
>
>
> 1) /* XXX comparison_cost shouldn't be 0? */
>
> I'm not worried about this, because this is not really intriduced by
> this patch - create_sort_path has the same comment/issue, so I think
> it's acceptable to do the same thing for incremental sort.

Sounds good.

> 2) INITIAL_MEMTUPSIZE
>
> tuplesort.c does this:
>
>    #define INITIAL_MEMTUPSIZE Max(1024, \
>        ALLOCSET_SEPARATE_THRESHOLD / sizeof(SortTuple) + 1)
>
> supposedly to keep the array size within ALLOCSET_SEPARATE_THRESHOLD.
> But I think it fails to do that, for a couple of reasons.
>
> Firstly, ALLOCSET_SEPARATE_THRESHOLD is 8192, and SortTuple is 21B at
> minimum (without padding), so with 1024 elements it's guaranteed to be
> at least 21kB - so exceeding the threshold. The maximum value is
> something like 256.
>
> Secondly, the second part of the formula is guaranteed to get us over
> the threshold anyway, thanks to the +1. Let's say SortTuple is 30B. Then
>
>    ALLOCSET_SEPARATE_THRESHOLD / 30 = 273
>
> but we end up using 274, resulting in 8220B array. :-(
>
> So I guess the formula should be more like
>
>    Min(128, ALLOCSET_SEPARATE_THRESHOLD / sizeof(SortTuple))
>
> or something like that.
>
> FWIW I think the whole hypothesis that selecting the array size below
> ALLOCSET_SEPARATE_THRESHOLD reduces overhead is dubious. AFAIC we
> allocate this only once (or very few times), and if we need to grow the
> array we'll hit the threshold anyway.
>
> I'd just pick a reasonable constant - 128 or 256 seems reasonable, 1024
> may be OK too.

That was a part of the patch I haven't touched since I inherited it,
and I didn't feel like I knew enough about the Postgres memory
management to make a determination on whether the reasoning made
sense.

So I' happy to use a constant as suggested.

> 3) add_partial_path
>
> I'm a bit torn about using enable_incrementalsort in add_partial_path. I
> know we've done that to eliminate the overhead, but it just seems weird.
> I wonder if that really saves us anything or if it was just noise. I'll
> do more testing before making a decision.
>
> While looking at add_path, I see the comparisons there are made in the
> opposite order - we first compare costs, and only then pathkeys. That
> seems like a more efficient way, I guess (cheaper float comparison
> first, pathkey comparison with a loop second). I wonder why it's not
> done like that in add_partial_path too? Perhaps this will make it cheap
> enough to remove the enable_incrementalsort check.

I would love to avoid checking enable_incrementalsort there. I think
it's pretty gross to do so. But if it's the only way to keep the patch
acceptable, then obviously I'd leave it in. So I'm hopeful we can
avoid needing it.

> 4) add_partial_path_precheck
>
> While looking at add_partial_path, I realized add_partial_path_precheck
> probably needs the same change - to consider startup cost too. I think
> the expectation is that add_partial_path_precheck only rejects paths
> that we know would then get rejected by add_partial_path.
>
> But now the behavior is inconsistent, which might lead to surprising
> behavior (I haven't seen such cases, but I haven't looked for them).
> At the moment the add_partial_path_precheck is called only from
> joinpath.c, maybe it's not an issue there.
>
> If it's OK to keep it like this, it'd probably deserve a comment why
> the difference is OK. And the comments also contain the same claim that
> parallel plans only need to look at total cost.

I remember some discussion about that much earlier in this thread (or
in the spin-off thread [1] re: that specific change that didn't get a
lot of action), and I'm pretty sure the conclusion was that we should
change both. But I guess we never got around to doing it.

> 5) Overall, I think the costing is OK. I'm sure we'll find cases that
> will need improvements, but that's fine. However, we now have
>
> - cost_tuplesort (used to be cost_sort)
> - cost_full_sort
> - cost_incremental_sort
> - cost_sort
>
> I find it a bit confusing that we have cost_sort and cost_full_sort. Why
> don't we just keep using the dummy path in label_sort_with_costsize?
> That seems to be the only external caller outside costsize.c. Then we
> could either make cost_full_sort static or get rid of it entirely.

This another area of the patch I haven't really modified.

James

[1]: https://www.postgresql.org/message-id/flat/CAAaqYe-5HmM4ih6FWp2RNV9rruunfrFrLhqFXF_nrrNCPy1Zhg%40mail.gmail.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: src/test/regress/standby_schedule
Next
From: James Coleman
Date:
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)