Re: sort_mem sizing (Non-linear Performance) - Mailing list pgsql-general
From | Tom Lane |
---|---|
Subject | Re: sort_mem sizing (Non-linear Performance) |
Date | |
Msg-id | 7213.1023654940@sss.pgh.pa.us Whole thread Raw |
In response to | Re: sort_mem sizing (Non-linear Performance) (Curt Sampson <cjs@cynic.net>) |
Responses |
Re: sort_mem sizing (Non-linear Performance)
|
List | pgsql-general |
Curt Sampson <cjs@cynic.net> writes: > On Fri, 31 May 2002, Tom Lane wrote: >> Curt Sampson <cjs@cynic.net> writes: > Next I tried to bump the sortmem up to 256 MB, half the physical RAM > of the machine. I found out the hard way that the back-end doing the > indexing will grow to something over three times the size of sortmem, > and proceeded (slowly) to extricate myself from swap hell. >> >> [ scratches head ... ] The sort code tries very hard to keep track >> of how much it's storing. I wonder why it's so far off? Perhaps >> failing to allow for palloc overhead? Remind me again of the exact >> datatype you were indexing, please. > It's just an int. Hm. The actual size of an IndexTuple with an integer datum would be 12 bytes, which is how the tuplesort.c code would account for it. But the palloc allocation to hold it would be 16 data bytes plus the AllocChunkData header overhead, which could be 8, 12, or 16 bytes depending on whether you have MEMORY_CONTEXT_CHECKING enabled (did you configure --enable-cassert?) and on whether MAXALIGN is 4 or 8 bytes on your hardware. So the palloc overhead could indeed amount to a factor of nearly 3x. While looking into this, I came across the following commentary in tuplesort.c: * NOTES about memory consumption calculations: * * We count space requested for tuples against the SortMem limit. * Fixed-size space (primarily the LogicalTapeSet I/O buffers) is not * counted, nor do we count the variable-size memtuples and memtupindex * arrays. (Even though those could grow pretty large, they should be * small compared to the tuples proper, so this is not unreasonable.) * * The major deficiency in this approach is that it ignores palloc overhead. * The memory space actually allocated for a palloc chunk is always more * than the request size, and could be considerably more (as much as 2X * larger, in the current aset.c implementation). So the space used could * be considerably more than SortMem says. * * One way to fix this is to add a memory management function that, given * a pointer to a palloc'd chunk, returns the actual space consumed by the * chunk. This would be very easy in the current aset.c module, but I'm * hesitant to do it because it might be unpleasant to support in future * implementations of memory management. (For example, a direct * implementation of palloc as malloc could not support such a function * portably.) * * A cruder answer is just to apply a fudge factor, say by initializing * availMem to only three-quarters of what SortMem indicates. This is * probably the right answer if anyone complains that SortMem is not being * obeyed very faithfully. This note, and the code too, was written some years ago by yours truly; apparently you're the first to complain about the inaccuracy. I have changed my mind since then: I now think that adding an actual-space inquiry function to the memory management API would be a reasonable thing to do. We're depending on enough other stuff that isn't provided by plain malloc() that one more addition won't make any difference. Also, the two variable-size arrays mentioned above would add another 8 bytes per tuple, which really isn't negligible compared to 12-byte tuples. So that might account for the rest of your 3x growth observation. Probably tuplesort needs to include those in its space calculation. If we did make these changes then I'd be inclined to tweak the default SortMem up from 512K to say 1024K; otherwise we'd effectively be reducing the default sort size because of the extra space being charged for. Comments? BTW, I'm not committing to get this done for 7.3... regards, tom lane
pgsql-general by date: