Re: BUG #13530: sort receives "unexpected out-of-memory situation during sort" - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #13530: sort receives "unexpected out-of-memory situation during sort"
Msg-id 15011.1438717384@sss.pgh.pa.us
In response to Re: BUG #13530: sort receives "unexpected out-of-memory situation during sort"  (brent_despain@selinc.com)
Responses Re: BUG #13530: sort receives "unexpected out-of-memory situation during sort"  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs

brent_despain@selinc.com writes:
> The error is from tuplestore.c in the grow_memtuples() function.  It
> probably applies to tuplesort.c as well.

> WARNING: orig_memtupsize: 1024
> WARNING: orig_memtuples_size: 4104
> WARNING: orig_availMem: 2544
> WARNING: newmemtupsize: 1025
> WARNING: new_memtuples_size: 8200
> WARNING: new_availMem: -1552

Ohhh ... I was supposing that this was happening in tuplesort.c.
The issue cannot arise there because sizeof(SortTuple) is at least
16 bytes, so even with the minimum memtupsize of 1024, we are
requesting a chunk of at least 16K; that is large enough to push
palloc into its separate-chunk allocation path, which has constant
chunk allocation overhead as this code expects.

But in tuplestore.c, the array elements are just "void *".  On a 32-bit
machine that's small enough that the very first allocation is a standard
palloc chunk not a separate chunk.  So then the allocation overhead *does*
increase at the first enlargement, and if you're close to exhausting
work_mem then LACKMEM can happen due to that.  The new code that made
sorting run closer to full memory utilization may have increased the odds
of this, but it did not create the problem.

So you need both a 32-bit machine and fairly small work_mem to trigger
this failure; that explains why I could not reproduce it here (I was
using a 64-bit machine).

> I see a few options.
> - Don't check LACKMEM here.  It hardly gets checked anywhere else.

Not workable.  If we ignore LACKMEM here, we may have enlarged the array
so much that even dumping all the tuples to disk would not restore
positive availMem; this is the "space management algorithm will go nuts"
hazard referred to in the comment.  (The new array-sizing logic might've
reduced the odds of this, but I doubt it would be safe to assume it can't
happen.)

> - Don't use growth_ratio if the new repalloc request will be <= 8kB.
> - When the new repalloc is <= 8kB, over-allocate allowedMem and increase
> newmemtupsize to fully use the new chunk size.

AFAICS neither of these will fix the problem, because the issue is that
the chunk overhead will increase as soon as we enlarge the initial array
request.

I think the best fix really is to increase the initial array size
so that it's at least large enough to force palloc to do what we want.
As a quick hack you could just s/1024/2048/ in tuplestore_begin_common,
but what would be more maintainable in the long run is for the memory
allocation code to export a #define for its "large chunk" threshold,
and make the initial allocation here depend on that.

            regards, tom lane
