I wrote:
> But in tuplestore.c, the array elements are just "void *". On a 32-bit
> machine that's small enough that the very first allocation is a standard
> palloc chunk, not a separate chunk. So then the allocation overhead *does*
> increase at the first enlargement, and if you're close to exhausting
> work_mem then LACKMEM can happen due to that. The new code that made
> sorting run closer to full memory utilization may have increased the odds
> of this, but it did not create the problem.
> So you need both a 32-bit machine and fairly small work_mem to trigger
> this failure; that explains why I could not reproduce it here (I was
> using a 64-bit machine).
Actually that's not true; the problem can happen on 64-bit machines as
well. I was thinking that the behavior change occurred at 8K allocation
requests, but actually it happens only for requests larger than 8K. That
means the initial 1024-pointer array (exactly 8K with 8-byte pointers) is
still a standard chunk on 64-bit, so such machines hit the same overhead
increase at the first enlargement.
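To make the arithmetic concrete, here is a small standalone sketch (plain
C, not PostgreSQL source) of the two allocation paths as described above,
under the assumptions that standard chunks are rounded up to a power of 2
and that only requests strictly larger than 8K become exactly-sized
separate chunks:

    #include <stdio.h>

    #define LARGE_CHUNK_THRESHOLD 8192      /* assumed 8K limit */

    /* Space charged for a request, ignoring per-chunk header overhead */
    static size_t
    charged_space(size_t request)
    {
        size_t      rounded = 8;

        if (request > LARGE_CHUNK_THRESHOLD)
            return request;         /* separate chunk: exact size */
        while (rounded < request)
            rounded <<= 1;          /* standard chunk: power-of-2 round-up */
        return rounded;
    }

    int
    main(void)
    {
        size_t      ptr_sizes[] = {4, 8};   /* 32-bit vs. 64-bit pointers */
        int         i;

        for (i = 0; i < 2; i++)
        {
            size_t      nelems = 1024;  /* tuplestore's initial array size */
            int         step;

            printf("sizeof(void *) == %zu:\n", ptr_sizes[i]);
            for (step = 0; step < 3; step++, nelems *= 2)
            {
                size_t      request = nelems * ptr_sizes[i];

                printf("  %5zu elems: request %6zu, charged %6zu (%s)\n",
                       nelems, request, charged_space(request),
                       request > LARGE_CHUNK_THRESHOLD ?
                       "separate chunk" : "standard chunk");
            }
        }
        return 0;
    }

With 8-byte pointers the initial request is exactly 8192 bytes, which
still takes the standard-chunk path; and a non-power-of-2 request under
the threshold (say 5000 bytes) gets charged 8192, the kind of rounding
overhead that can unexpectedly push the accounting past work_mem.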
> I think the best fix really is to increase the initial array size
> so that it's at least large enough to force palloc to do what we want.
> As a quick hack you could just s/1024/2048/ in tuplestore_begin_common,
> but what would be more maintainable in the long run is for the memory
> allocation code to export a #define for its "large chunk" threshold,
> and make the initial allocation here depend on that.
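In sketch form, that suggestion comes out to something like the following
(illustrative only; the #define name, its placement, and the details are
per the actual patch linked below):

    /* Exported from the allocator's header: requests larger than this
     * are certain to be allocated as separate, exactly-sized chunks
     * with constant overhead. */
    #define ALLOCSET_SEPARATE_THRESHOLD  8192

    /* In tuplestore_begin_common: size the initial pointer array so the
     * palloc request is just over the threshold, forcing a separate
     * chunk from the very first allocation on any pointer width. */
    state->memtupsize = ALLOCSET_SEPARATE_THRESHOLD / sizeof(void *) + 1;
    state->memtuples = (void **) palloc(state->memtupsize * sizeof(void *));

That comes to 2049 elements (8196 bytes) with 4-byte pointers and 1025
elements (8200 bytes) with 8-byte ones, both just past the 8K boundary.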
If you want to use the "official" patch for this, see
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8bd45a394958c3fd7400654439ef2a113043f8f5
regards, tom lane