Re: Out of Memory errors are frustrating as heck! - Mailing list pgsql-performance
From: Tomas Vondra
Subject: Re: Out of Memory errors are frustrating as heck!
Msg-id: 20190421161434.4hedytsadpbnglgk@development
In response to: Re: Out of Memory errors are frustrating as heck! (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Out of Memory errors are frustrating as heck!
           Re: Out of Memory errors are frustrating as heck!
List: pgsql-performance
On Sun, Apr 21, 2019 at 10:36:43AM -0400, Tom Lane wrote:
>Jeff Janes <jeff.janes@gmail.com> writes:
>> The growEnabled stuff only prevents infinite loops. It doesn't prevent
>> extreme silliness.
>
>> If a single 32 bit hash value has enough tuples by itself to not fit in
>> work_mem, then it will keep splitting until that value is in a batch by
>> itself before shutting off
>
>Right, that's the code's intention. If that's not good enough for this
>case, we'll need to understand the details a bit better before we can
>design a better(?) heuristic.
>

I think we only disable growing when there are no other values in the
batch, but that seems rather easy to defeat - all you need is a single
tuple with a hash that falls into the same batch, and it's over. Maybe
we should make this a bit less accurate - say, if less than 5% of the
memory gets freed, don't add more batches.

>I suspect, however, that we might be better off just taking the existence
>of the I/O buffers into account somehow while deciding whether it's worth
>growing further. That is, I'm imagining adding a second independent
>reason for shutting off growEnabled, along the lines of "increasing
>nbatch any further will require an unreasonable amount of buffer memory".
>The question then becomes how to define "unreasonable".
>

I think the question the code needs to be asking is "If we double the
number of batches, does the amount of memory we need drop?" And the
memory needs to account both for the buffers and for the per-batch data.

I don't think we can just stop increasing the number of batches when the
memory used for BufFiles exceeds work_mem, because that entirely ignores
the fact that by doing so we force the system to keep the per-batch data
in memory (and that can be an almost arbitrary amount).

What I think we should be doing instead is to make the threshold dynamic
- instead of just checking against work_mem, only increase the number of
batches when the total amount of memory exceeds

   Max(work_mem, 3 * memory_for_buffiles)

This is based on the observation that increasing the number of batches
doubles memory_for_buffiles and splits the per-batch data in half; below
that threshold, adding more batches would actually increase the total
amount of memory used.

Of course, this just stops enforcing work_mem at some point, but it at
least attempts to minimize the amount of memory used.

An alternative would be spilling the extra tuples into a special
overflow file, as I explained earlier. That would actually enforce
work_mem, I think.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
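
A minimal, self-contained sketch of the "less than 5% freed" cutoff
suggested above. The names (batch_t, split_batch) and the structure are
invented for illustration - this is not the actual nodeHash.c code, just
the shape of the check under the assumption that we can measure how much
memory a split actually released.

    /* Toy illustration of "stop growing when a split frees too little". */
    #include <stdbool.h>
    #include <stdio.h>

    typedef struct
    {
        double space_used;   /* bytes held by the in-memory batch */
        bool   grow_enabled; /* may we keep doubling nbatch? */
    } batch_t;

    /* Pretend we split the batch and some fraction of tuples moved away. */
    static void
    split_batch(batch_t *b, double fraction_moved)
    {
        double old_used = b->space_used;

        b->space_used = old_used * (1.0 - fraction_moved);

        /*
         * If less than 5% of the memory was freed, another split is
         * unlikely to help either (e.g. one hash value dominates the
         * batch), so stop adding batches.
         */
        if (old_used - b->space_used < 0.05 * old_used)
            b->grow_enabled = false;
    }

    int
    main(void)
    {
        batch_t b = { .space_used = 64.0 * 1024 * 1024, .grow_enabled = true };

        split_batch(&b, 0.02);  /* only 2% of tuples moved to the new batch */
        printf("grow_enabled = %s\n", b.grow_enabled ? "true" : "false");
        return 0;
    }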
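
And a back-of-the-envelope check of the Max(work_mem, 3 * memory_for_buffiles)
threshold. The idea: doubling nbatch adds roughly another memory_for_buffiles
worth of buffers while halving the per-batch data, so it only pays off while
per_batch / 2 > buffiles, i.e. total > 3 * buffiles. The function and its
parameters are made up for the example, not proposed executor code.

    #include <stdbool.h>
    #include <stdio.h>

    static bool
    should_add_batches(size_t work_mem_bytes,
                       size_t buffile_bytes,   /* roughly 2 * nbatch * BLCKSZ */
                       size_t per_batch_bytes) /* hash table for current batch */
    {
        size_t total = buffile_bytes + per_batch_bytes;
        size_t threshold = (work_mem_bytes > 3 * buffile_bytes)
                           ? work_mem_bytes
                           : 3 * buffile_bytes;

        return total > threshold;
    }

    int
    main(void)
    {
        size_t MB = 1024 * 1024;

        /*
         * 4MB work_mem, 16MB of BufFile buffers, 20MB of per-batch data:
         * total 36MB < 3 * 16MB = 48MB, so another doubling would cost more
         * (+16MB of buffers) than it saves (-10MB of per-batch data).
         */
        printf("%s\n", should_add_batches(4 * MB, 16 * MB, 20 * MB)
                       ? "grow" : "don't grow");

        /*
         * 4MB work_mem, 16MB of buffers, 40MB of per-batch data:
         * total 56MB > 48MB, so doubling saves ~20MB for ~16MB of new buffers.
         */
        printf("%s\n", should_add_batches(4 * MB, 16 * MB, 40 * MB)
                       ? "grow" : "don't grow");
        return 0;
    }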