Home > mailing lists

Re: [HACKERS] [PATCH] Incremental sort - Mailing list pgsql-hackers

From	Alexander Korotkov
Subject	Re: [HACKERS] [PATCH] Incremental sort
Date	April 9, 2018 18:56:10
Msg-id	CAPpHfdt14t5fsLQ2Pd5m2WQcRZHaHHPNyx9Fko6x_4Uk-Buq4g@mail.gmail.com Whole thread Raw
In response to	Re: [HACKERS] [PATCH] Incremental sort (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses	Re: [HACKERS] [PATCH] Incremental sort
List	pgsql-hackers

Tree view

On Sat, Apr 7, 2018 at 11:57 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

On 04/07/2018 06:23 PM, Tom Lane wrote:
> Teodor Sigaev <teodor@sigaev.ru> writes:
>>> I dunno, how would you estimate whether this is actually a win or not?
>>> I don't think our model of sort costs is anywhere near refined enough
>>> or accurate enough to reliably predict whether this is better than
>>> just doing it in one step. Even if the cost model is good, it's not
>>> going to be better than our statistics about the number/size of the
>>> groups in the first column(s), and that's a notoriously unreliable stat.
>
>> I think that improvement in cost calculation of sort should be a
>> separate patch, not directly connected to this one. Postpone patches
>> till other part will be ready to get max improvement for postponed ones
>> doesn't seem to me very good, especially if it suggests some improvement
>> right now.
>
> No, you misunderstand the point of my argument. Without a reasonably
> reliable cost model, this patch could easily make performance *worse*
> not better for many people, due to choosing incremental-sort plans
> where they were really a loss.
>

Yeah. Essentially the patch could push the planner to pick a path that
has low startup cost (and very high total cost), assuming it'll only
need to read small part of the input. But if the number of groups in the
input is low (perhaps just one huge group), that would be a regression.

Yes, I think the biggest risk here is too small number of groups. More

precisely the risk is too large groups while total number of groups might

be large enough.

> If we were at the start of a development cycle and work were being
> promised to be done later in the cycle to improve the planning aspect,
> I'd be more charitable about it. But this isn't merely the end of a
> cycle, it's the *last day*. Now is not the time to commit stuff that
> needs, or even just might need, follow-on work.
>

+1 to that

FWIW I'm willing to spend some time on the patch for PG12, particularly
on the planner / costing part. The potential gains are too interesting

Thank you very much for your efforts on reviewing this patch.

I'm looking forward work with you on this feature for PG12.

FWIW, I think that we're moving this patch in the right direction by

providing separate paths for incremental sort. It's much better than

deciding between full or incremental sort in-place. For sure, considereing

extra paths might cause planning time regression. But I think the

same statement is true about many other planning optimizations.

One thing be can do is to make enable_incrementalsort = off by

default. Then only users who understand improtance of incremental

sort will get extra planning time.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pgsql-hackers by date:

From: Greg Stark
Date: 09 April 2018, 18:29:36
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From: Robert Haas
Date: 09 April 2018, 19:06:43
Subject: Re: [HACKERS] Runtime Partition Pruning

Re: [HACKERS] [PATCH] Incremental sort - Mailing list pgsql-hackers

Previous

Next