On 04/07/2018 06:23 PM, Tom Lane wrote: > Teodor Sigaev <teodor@sigaev.ru> writes: >>> I dunno, how would you estimate whether this is actually a win or not? >>> I don't think our model of sort costs is anywhere near refined enough >>> or accurate enough to reliably predict whether this is better than >>> just doing it in one step. Even if the cost model is good, it's not >>> going to be better than our statistics about the number/size of the >>> groups in the first column(s), and that's a notoriously unreliable stat. > >> I think that improvement in cost calculation of sort should be a >> separate patch, not directly connected to this one. Postpone patches >> till other part will be ready to get max improvement for postponed ones >> doesn't seem to me very good, especially if it suggests some improvement >> right now. > > No, you misunderstand the point of my argument. Without a reasonably > reliable cost model, this patch could easily make performance *worse* > not better for many people, due to choosing incremental-sort plans > where they were really a loss. >
Yeah. Essentially the patch could push the planner to pick a path that has low startup cost (and very high total cost), assuming it'll only need to read small part of the input. But if the number of groups in the input is low (perhaps just one huge group), that would be a regression.
Yes, I think the biggest risk here is too small number of groups. More
precisely the risk is too large groups while total number of groups might
be large enough.
> If we were at the start of a development cycle and work were being > promised to be done later in the cycle to improve the planning aspect, > I'd be more charitable about it. But this isn't merely the end of a > cycle, it's the *last day*. Now is not the time to commit stuff that > needs, or even just might need, follow-on work. >
+1 to that
FWIW I'm willing to spend some time on the patch for PG12, particularly on the planner / costing part. The potential gains are too interesting
Thank you very much for your efforts on reviewing this patch.
I'm looking forward work with you on this feature for PG12.
FWIW, I think that we're moving this patch in the right direction by
providing separate paths for incremental sort. It's much better than
deciding between full or incremental sort in-place. For sure, considereing
extra paths might cause planning time regression. But I think the
same statement is true about many other planning optimizations.
One thing be can do is to make enable_incrementalsort = off by
default. Then only users who understand improtance of incremental