"Jim C. Nasby" <decibel@decibel.org> writes:
> Since the planner knows how many rows will
> be going into the sort and how wide they are, ISTM it should be able to
> estimate how much memory will be needed.
... which is different from how much will be available. See cost_sort():
* If the total volume of data to sort is less than work_mem, we will do
* an in-memory sort, which requires no I/O and about t*log2(t) tuple
* comparisons for t tuples.
*
* If the total volume exceeds work_mem, we switch to a tape-style merge
* algorithm. There will still be about t*log2(t) tuple comparisons in
* total, but we will also need to write and read each tuple once per
* merge pass. We expect about ceil(log6(r)) merge passes where r is the
* number of initial runs formed (log6 because tuplesort.c uses six-tape
* merging). Since the average initial run should be about twice work_mem,
* we have
* disk traffic = 2 * relsize * ceil(log6(p / (2*work_mem)))
* cpu = comparison_cost * t * log2(t)
The actual cost of a sort is therefore *highly* sensitive to how much
memory it is allowed to use. Blowing off any ability to estimate this
in advance is not going to improve matters...
regards, tom lane