While helping a customer improve recovery performance from their
backups, I noticed that PostgreSQL never makes use of large
maintenance_work_mem settings. Even giving 10GB of RAM to
maintenance_work_mem results in only a fraction of that memory being used
(it switches to an external sort after using around 2 GB). I think the
culprit is the following code in tuplesort.c, grow_memtuples(), as the
comment there already suggests:
/*
 * On a 64-bit machine, allowedMem could be high enough to get us into
 * trouble with MaxAllocSize, too.
 */
if ((Size) (state->memtupsize * 2) >= MaxAllocSize / sizeof(SortTuple))
    return false;
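
For reference, MaxAllocSize is defined in src/include/utils/memutils.h as
one byte less than 1 GB. Assuming sizeof(SortTuple) is 24 bytes on a
64-bit build (my own back-of-the-envelope figure, not from the attached
diff), that caps the memtuples array at roughly 44 million entries, with
the rest of the ~2 GB I observed presumably being the palloc'd tuples the
array points to:

/* src/include/utils/memutils.h */
#define MaxAllocSize    ((Size) 0x3fffffff)     /* 1 gigabyte - 1 */

/*
 * Rough arithmetic, assuming sizeof(SortTuple) == 24 on a 64-bit build:
 *
 *   MaxAllocSize / sizeof(SortTuple) = 0x3fffffff / 24 ~= 44.7 million
 *
 * so grow_memtuples() stops doubling once the array would pass ~1 GB,
 * no matter how large maintenance_work_mem is.
 */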
While I understand that doubling the memtuples array is more efficient
than growing it in smaller steps, I think we give away usable memory,
because we never consider growing the array up to the upper limit imposed
by MaxAllocSize. Modifying the code in that way results in slightly better
memory usage, but still far from what the system is able to use on such a
machine (see the attached diff, a very crude experimental patch).
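
What I have in mind looks roughly like this (illustration only, not the
attached diff; the USEMEM/FREEMEM bookkeeping done by the real
grow_memtuples() is omitted for brevity): instead of giving up as soon as
doubling would cross MaxAllocSize, grow one last time to the largest array
that still fits under the allocation limit:

int     newmemtupsize;

if ((Size) (state->memtupsize * 2) >= MaxAllocSize / sizeof(SortTuple))
{
    /* Doubling would overshoot; clamp to the largest size that still fits. */
    newmemtupsize = (int) (MaxAllocSize / sizeof(SortTuple));
    if (newmemtupsize <= state->memtupsize)
        return false;           /* already at the hard limit */
}
else
    newmemtupsize = state->memtupsize * 2;

state->memtupsize = newmemtupsize;
state->memtuples = (SortTuple *)
    repalloc(state->memtuples,
             state->memtupsize * sizeof(SortTuple));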
I've also played around with increasing MaxAllocSize itself and got the
backend to use up to 6GB of maintenance_work_mem while creating an index
on 80,000,000 integer tuples. That way the backend was able to sort the
tuples entirely in memory, speeding up the creation of the index from 200s
to 80s.
I understand that we have to handle MaxAllocSize very carefully, since it
is used in many places in the code. But isn't it worth special-casing the
code in grow_memtuples() (and maybe other places where sorting is likely
to use more RAM), so that we can lift this constraint on 64-bit systems
with plenty of RAM? Or am I missing something very important?
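
To make the question concrete, here is a purely hypothetical (untested)
sketch of what such a special case could look like. MaxSortArrayAllocSize
is a name I invented for illustration, and of course the memory-context
code that enforces MaxAllocSize in palloc()/repalloc() would also have to
accept the larger request, which is exactly why I had to bump MaxAllocSize
in my experiment:

/*
 * Hypothetical only: a dedicated, larger cap for the memtuples array on
 * 64-bit builds, leaving MaxAllocSize untouched for all other callers.
 */
#if SIZEOF_VOID_P >= 8
#define MaxSortArrayAllocSize   ((Size) INT64CONST(0x3ffffffff))    /* ~16 GB */
#else
#define MaxSortArrayAllocSize   MaxAllocSize
#endif

/* ... and in grow_memtuples(): */
if ((Size) (state->memtupsize * 2) >= MaxSortArrayAllocSize / sizeof(SortTuple))
    return false;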
--
Thanks
Bernd