On Tue, Sep 6, 2016 at 11:09 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> The big picture here is that you can't only USEMEM() for tapes as the
>> need arises for new tapes as new runs are created. You'll just run a
>> massive availMem deficit, that you have no way of paying back, because
>> you can't "liquidate assets to pay off your creditors" (e.g., release
>> a bit of the memtuples memory). The fact is that memtuples growth
>> doesn't work that way. The memtuples array never shrinks.
>
>
> Hmm. But memtuples is empty, just after we have built the initial runs. Why
> couldn't we shrink, i.e. free and reallocate, it?

After we've built the initial runs, we do in fact give a FREEMEM()
refund to those tapes that were not used within beginmerge(), as I
mentioned just now (with a high workMem, this is often the great
majority of many thousands of logical tapes -- that's how you get to
wasting 8% of 5GB of maintenance_work_mem).
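
To make the 8% figure concrete, here is a rough sketch of the
arithmetic. It is not the tuplesort.c code. The constants are
assumptions that should be checked against the source: an 8kB block
size, a per-tape overhead of 3 blocks, a per-tape merge buffer of 32
blocks, and a minimum merge order of 6. All sketch_* and SKETCH_*
names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; verify against tuplesort.c before relying on them. */
#define SKETCH_BLCKSZ            INT64_C(8192)
#define SKETCH_TAPE_OVERHEAD     (SKETCH_BLCKSZ * 3)
#define SKETCH_MERGE_BUFFER_SIZE (SKETCH_BLCKSZ * 32)
#define SKETCH_MINORDER          INT64_C(6)

/*
 * Number of input tapes the memory budget could support during a merge,
 * in the spirit of tuplesort_merge_order(): one overhead unit is set
 * aside for the output tape, and each input tape needs its overhead
 * plus a merge buffer.
 */
static int64_t
sketch_merge_order(int64_t allowed_mem)
{
    int64_t m_order;

    assert(allowed_mem > 0);
    m_order = (allowed_mem - SKETCH_TAPE_OVERHEAD) /
        (SKETCH_MERGE_BUFFER_SIZE + SKETCH_TAPE_OVERHEAD);
    if (m_order < SKETCH_MINORDER)
        m_order = SKETCH_MINORDER;
    return m_order;
}

/*
 * Memory charged for tape overhead when every tape (the input tapes
 * plus one output tape) is accounted for up front.
 */
static int64_t
sketch_tape_space(int64_t m_order)
{
    return (m_order + 1) * SKETCH_TAPE_OVERHEAD;
}
```

Under these assumptions, a 5GB budget yields a merge order of 18724,
and charging for all 18725 tapes costs 460185600 bytes (about 439MB),
which is between 8% and 9% of the budget.
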
What's at issue with this 500-tape cap patch is what happens after
tuples are first dumped (after we decide that this is going to be an
external sort, and call tuplesort_merge_order() to get the number of
logical tapes in the tapeset), but before the final merge happens.
For the final merge, we're already doing the right thing by giving
that refund. I want to stop the up-front logical allocation (USEMEM())
of an enormous number of tapes, so that run generation itself is able
to use more memory.
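
A toy model of the accounting may help here. It is only a sketch: the
struct, the sketch_* functions, and the 3-block (24576-byte) per-tape
overhead are assumptions for illustration, not the real USEMEM() and
FREEMEM() macros or the real Tuplesortstate.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-tape overhead (3 blocks of 8kB); check tuplesort.c. */
#define SKETCH_TAPE_OVERHEAD (INT64_C(3) * 8192)

/* Hypothetical stand-in for the availMem field of the sort state. */
typedef struct
{
    int64_t avail_mem;
} SketchSortState;

/* Analogue of USEMEM(): charge memory against the budget. */
static void
sketch_usemem(SketchSortState *state, int64_t amount)
{
    state->avail_mem -= amount;
}

/* Analogue of FREEMEM(): return memory to the budget. */
static void
sketch_freemem(SketchSortState *state, int64_t amount)
{
    state->avail_mem += amount;
}

/*
 * When the sort goes external, every logical tape is charged up front.
 * Returns what is left over for run generation.
 */
static int64_t
sketch_init_tapes(SketchSortState *state, int64_t max_tapes)
{
    assert(max_tapes > 0);
    sketch_usemem(state, max_tapes * SKETCH_TAPE_OVERHEAD);
    return state->avail_mem;
}

/*
 * At merge time, tapes that never received a run are refunded.
 * Returns what is available for the merge.
 */
static int64_t
sketch_begin_merge(SketchSortState *state, int64_t max_tapes,
                   int64_t active_tapes)
{
    assert(active_tapes >= 0 && active_tapes <= max_tapes);
    sketch_freemem(state, (max_tapes - active_tapes) * SKETCH_TAPE_OVERHEAD);
    return state->avail_mem;
}
```

In this model the refund arrives only in sketch_begin_merge(), so run
generation works with whatever sketch_init_tapes() leaves behind. With
a 5GB budget, charging 18725 tapes leaves 4908523520 bytes for run
generation, while charging 501 tapes leaves 5356396544 bytes.
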
It's surprisingly difficult to do something cleverer than just impose a cap.
--
Peter Geoghegan