Re: MemoryContextAllocHuge(): selectively bypassing MaxAllocSize - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: MemoryContextAllocHuge(): selectively bypassing MaxAllocSize
Date
Msg-id CAFj8pRAk5GgsSer+ZNkRz9PU1xiDD2_nnOMPuESJAUpJyjkp-Q@mail.gmail.com
In response to MemoryContextAllocHuge(): selectively bypassing MaxAllocSize  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
+1

Pavel

On 13.5.2013 16:29, "Noah Misch" <noah@leadboat.com> wrote:

> A memory chunk allocated through the existing palloc.h interfaces is limited
> to MaxAllocSize (~1 GiB).  This is best for most callers; SET_VARSIZE() need
> not check its own 1 GiB limit, and algorithms that grow a buffer by doubling
> need not check for overflow.  However, a handful of callers are quite happy to
> navigate those hazards in exchange for the ability to allocate a larger chunk.
>
> This patch introduces MemoryContextAllocHuge() and repalloc_huge() that check
> a higher MaxAllocHugeSize limit of SIZE_MAX/2.  Chunks don't bother recording
> whether they were allocated as huge; one can start with palloc() and then
> repalloc_huge() to grow the value.  To demonstrate, I put this to use in
> tuplesort.c; the patch also updates tuplestore.c to keep them similar.  Here's
> the trace_sort from building the pgbench_accounts primary key at scale factor
> 7500, maintenance_work_mem = '56GB'; memtuples itself consumed 17.2 GiB:
>
> LOG:  internal sort ended, 48603324 KB used: CPU 75.65s/305.46u sec elapsed 391.21 sec
>
> Compare:
>
> LOG:  external sort ended, 1832846 disk blocks used: CPU 77.45s/988.11u sec elapsed 1146.05 sec
>
> This was made easier by tuplesort growth algorithm improvements in commit
> 8ae35e91807508872cabd3b0e8db35fc78e194ac.  The problem has come up before
> (TODO item "Allow sorts to use more available memory"), and Tom floated the
> idea[1] behind the approach I've used.  The next limit faced by sorts is
> INT_MAX concurrent tuples in memory, which limits helpful work_mem to about
> 150 GiB when sorting int4.
>
> I have not added variants like palloc_huge() and palloc0_huge(), and I have
> not added to the frontend palloc.h interface.  There's no particular barrier
> to doing any of that.  I don't expect more than a dozen or so callers, so most
> of the variations might go unused.
>
> The comment at MaxAllocSize said that aset.c expects doubling the size of an
> arbitrary allocation to never overflow, but I couldn't find the code in
> question.  AllocSetAlloc() does double sizes of blocks used to aggregate small
> allocations, so maxBlockSize had better stay under SIZE_MAX/2.  Nonetheless,
> that expectation does apply to dozens of repalloc() users outside aset.c, and
> I preserved it for repalloc_huge().  64-bit builds will never notice, and I
> won't cry for the resulting 2 GiB limit on 32-bit.
>
> Thanks,
> nm
>
> [1] http://www.postgresql.org/message-id/19908.1297696263@sss.pgh.pa.us
>
> --
> Noah Misch
> EnterpriseDB                                 http://www.enterprisedb.com
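
For anyone skimming the thread, here is a rough standalone sketch of the growth pattern the quoted message describes: a buffer that doubles until it reaches a ceiling, where the ceiling is either a ~1 GiB cap or SIZE_MAX/2. It is not code from the patch. The names SKETCH_MAX_ALLOC_SIZE, SKETCH_MAX_ALLOC_HUGE_SIZE, GrowBuf, next_capacity() and grow_buf() are made up for illustration, the 0x3fffffff value is only a stand-in for "~1 GiB", and plain realloc() is used in place of repalloc()/repalloc_huge(), whose signatures are not given in the message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two limits named in the message. */
#define SKETCH_MAX_ALLOC_SIZE      ((size_t) 0x3fffffff)   /* ~1 GiB */
#define SKETCH_MAX_ALLOC_HUGE_SIZE (SIZE_MAX / 2)

/* A growable array of fixed-size elements (illustrative only). */
typedef struct GrowBuf
{
    void   *data;       /* current allocation, or NULL */
    size_t  cap;        /* capacity in elements */
    size_t  elem_size;  /* bytes per element, must be > 0 */
} GrowBuf;

/*
 * Return the next capacity, in elements, for a doubling array whose total
 * size must stay at or below "limit" bytes.  Returns 0 if it cannot grow.
 * Because the result is clamped to limit / elem_size, neither the doubling
 * nor the later multiplication by elem_size can overflow size_t.
 */
static size_t
next_capacity(size_t cur, size_t elem_size, size_t limit)
{
    size_t  max_elems;

    assert(elem_size > 0);
    max_elems = limit / elem_size;

    if (cur >= max_elems)
        return 0;               /* already at the ceiling */
    if (cur == 0)
        return 1;               /* first allocation */
    if (cur > max_elems / 2)
        return max_elems;       /* doubling would pass the limit: clamp */
    return cur * 2;
}

/*
 * Grow the buffer once under the given byte limit.  Returns 1 on success,
 * 0 if the limit was reached or realloc() failed (buffer left untouched).
 */
static int
grow_buf(GrowBuf *buf, size_t limit)
{
    size_t  newcap = next_capacity(buf->cap, buf->elem_size, limit);
    void   *p;

    if (newcap == 0)
        return 0;
    p = realloc(buf->data, newcap * buf->elem_size);
    if (p == NULL)
        return 0;
    buf->data = p;
    buf->cap = newcap;
    return 1;
}
```

A caller that wants the larger ceiling passes SKETCH_MAX_ALLOC_HUGE_SIZE as the limit and otherwise uses the same buffer, which loosely mirrors the point that chunks do not record whether they were allocated as huge.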
