Re: Out of Memory errors are frustrating as heck! - Mailing list pgsql-performance

From: Tomas Vondra
Subject: Re: Out of Memory errors are frustrating as heck!
Date:
Msg-id: 20190421161434.4hedytsadpbnglgk@development
In response to: Re: Out of Memory errors are frustrating as heck!  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-performance
On Sun, Apr 21, 2019 at 10:36:43AM -0400, Tom Lane wrote:
>Jeff Janes <jeff.janes@gmail.com> writes:
>> The growEnabled stuff only prevents infinite loops.  It doesn't prevent
>> extreme silliness.
>
>> If a single 32 bit hash value has enough tuples by itself to not fit in
>> work_mem, then it will keep splitting until that value is in a batch by
>> itself before shutting off
>
>Right, that's the code's intention.  If that's not good enough for this
>case, we'll need to understand the details a bit better before we can
>design a better(?) heuristic.
>

I think we only disable growing when there are no other hash values in the
batch, but that seems rather easy to defeat - all it takes is a single
tuple with a different hash value that still falls into the same batch,
and the check never fires. Maybe we should make this a bit less strict -
say, if splitting the batch frees less than 5% of its memory, don't add
more batches.
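
Roughly, what I have in mind is something like the check below (just a
sketch with made-up names - the real growth logic is in
ExecHashIncreaseNumBatches in nodeHash.c):

    /*
     * Sketch of the "freed less than 5%" idea.  space_before/space_after
     * are the bytes used by the batch before and after the split.
     */
    #include <stdbool.h>
    #include <stddef.h>

    #define MIN_FREED_FRACTION  0.05    /* require at least 5% of memory freed */

    static bool
    split_freed_enough(size_t space_before, size_t space_after)
    {
        size_t  freed = space_before - space_after;

        return (double) freed >= space_before * MIN_FREED_FRACTION;
    }

    /* and in the grow path, something like:
     *     if (!split_freed_enough(before, after))
     *         hashtable->growEnabled = false;
     */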

>I suspect, however, that we might be better off just taking the existence
>of the I/O buffers into account somehow while deciding whether it's worth
>growing further.  That is, I'm imagining adding a second independent
>reason for shutting off growEnabled, along the lines of "increasing
>nbatch any further will require an unreasonable amount of buffer memory".
>The question then becomes how to define "unreasonable".
>

I think the question the code needs to be asking is "If we double the
number of batches, does the amount of memory we need drop?" And the
memory estimate needs to account for both the BufFile buffers and the
per-batch data.

I don't think we can just stop increasing the number of batches when the
memory for the BufFiles exceeds work_mem, because that entirely ignores
the fact that by doing so we force the system to keep the per-batch data
in memory (and that can be an almost arbitrarily large amount).
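
To illustrate what I mean by accounting for both parts, here's a sketch
with invented names, assuming each batch needs two BufFiles (inner and
outer side), each with a BLCKSZ-sized buffer:

    /*
     * Sketch only, names invented.  Total memory is the per-batch (hash
     * table) data plus the BufFile buffers, assuming two BufFiles per
     * batch, each with a BLCKSZ-sized buffer.
     */
    #include <stdbool.h>
    #include <stddef.h>

    #define BLCKSZ  8192                /* default block size */

    static size_t
    total_hashjoin_memory(size_t batch_data, int nbatch)
    {
        size_t  buffile_memory = (size_t) 2 * nbatch * BLCKSZ;

        return batch_data + buffile_memory;
    }

    /* Doubling nbatch halves batch_data but doubles the buffer memory. */
    static bool
    doubling_helps(size_t batch_data, int nbatch)
    {
        return total_hashjoin_memory(batch_data / 2, 2 * nbatch)
               < total_hashjoin_memory(batch_data, nbatch);
    }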

What I think we should do instead is make the threshold dynamic - instead
of just checking against work_mem, we should only increase the number of
batches when the total amount of memory used exceeds

    Max(work_mem, 3 * memory_for_buffiles)

This is based on the observation that doubling the number of batches
doubles memory_for_buffiles and cuts the per-batch data in half, so it
only reduces the total while the per-batch data is larger than
2 * memory_for_buffiles - that is, while the buffers use less than a
third of the total. Past that point, adding more batches would actually
increase the amount of memory used.

Of course, this just stops enforcing work_mem at some point, but it at
least attempts to minimize the amount of memory used.
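
In code, the dynamic threshold might look like this (again just a sketch,
with the same invented names as before):

    /*
     * Sketch of the dynamic threshold - keep adding batches only while
     * the total memory use exceeds Max(work_mem, 3 * memory_for_buffiles),
     * i.e. while doubling the number of batches still reduces the total.
     */
    #include <stdbool.h>
    #include <stddef.h>

    #define Max(a, b)   ((a) > (b) ? (a) : (b))     /* like the macro in c.h */

    static bool
    should_grow_batches(size_t batch_data, size_t buffile_memory,
                        size_t work_mem_bytes)
    {
        size_t  total = batch_data + buffile_memory;
        size_t  threshold = Max(work_mem_bytes, 3 * buffile_memory);

        return total > threshold;
    }

Once work_mem is no longer the larger term, this degenerates to "grow
while the buffers use less than a third of the total", which is exactly
the break-even point from the observation above.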

An alternative would be spilling the extra tuples into a special
overflow file, as I explained earlier. That would actually enforce
work_mem, I think.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 


