Re: Add the ability to limit the amount of memory that can be allocated to backends. - Mailing list pgsql-hackers

From James Hunter
Subject Re: Add the ability to limit the amount of memory that can be allocated to backends.
Msg-id CAJVSvF52zqHaDE7y7HANUqHm7SkN+MVSM9XBGkXm_v9KuOMZZQ@mail.gmail.com
In response to Re: Add the ability to limit the amount of memory that can be allocated to backends.  (Tomas Vondra <tomas@vondra.me>)
List pgsql-hackers
On Sat, Dec 28, 2024 at 6:57 AM Tomas Vondra <tomas@vondra.me> wrote:
>
> On 12/28/24 13:36, Anton A. Melnikov wrote:
> >
> > ... In more detail, let me suggest
> > the following steps or parts:
> > 1) implement a memory limit for a separate backend, independent of the
> > work_mem GUC;
> > 2) add a work_mem-like limit that would "adjust" the work_mem values used by
> > the optimizer, as Thomas suggested;
> > 3) add a global limit for all backends.
> >
> > As for p.1, there is a patch that was originally suggested by my colleague
> > Maxim Orlov <orlovmg@gmail.com> and which I modified for the current
> > master.
> > This patch introduces a single max_backend_memory GUC that specifies
> > the maximum amount of memory that can be allocated to a backend.
> > ...
> > If the allocated memory size is exceeded, a standard "out of memory"
> > error will be issued.

Yes, this seems like a reasonable plan. I would add that (2) might be
good enough to solve (1), without needing to emit an "out of memory"
error.
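
For concreteness, the simple accounting-plus-OOM approach Anton describes
might look roughly like the sketch below. This is only an illustration, not
the actual patch: backend_account_alloc() and backend_allocated_bytes are
placeholder names, the limit is treated as plain bytes for simplicity, and
the GUC registration and the hook into the allocator paths are omitted.

#include "postgres.h"

/*
 * Illustrative sketch only -- not the actual patch.  A per-backend byte
 * counter is charged on every allocation; once an allocation would push it
 * past the limit, raise a standard "out of memory" error.
 */
static int64 backend_allocated_bytes = 0;   /* bytes charged to this backend */
static int64 max_backend_memory = 0;        /* hypothetical GUC, in bytes; 0 = off */

static void
backend_account_alloc(Size size)
{
    if (max_backend_memory > 0 &&
        backend_allocated_bytes + (int64) size > max_backend_memory)
        ereport(ERROR,
                (errcode(ERRCODE_OUT_OF_MEMORY),
                 errmsg("out of memory"),
                 errdetail("Backend memory limit of " INT64_FORMAT " bytes exceeded.",
                           max_backend_memory)));

    backend_allocated_bytes += size;
}

The interesting design questions are where a check like this gets called
from, and how allocations get credited back on free -- which I assume the
actual patch deals with.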

> Not sure a simple memory limit like in the patch (which just adds memory
> accounting + OOM after hitting the limit) can work as anything but the
> last safety measure.

I agree -- since PG already has the ability to spill operators, I
think we can avoid using too much "working memory" without too much
coding effort.
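
(By "spill" I mean the pattern that tuplesort, hash join batching, etc.
already follow: once a node's in-memory buffer reaches its budget, further
data goes to temp files instead of growing memory use. A toy, self-contained
sketch of that shape, with made-up numbers and none of the real executor
code:)

#include <stdio.h>
#include <stdlib.h>

#define WORK_MEM_BYTES (64 * 1024)      /* pretend per-operator budget */
#define ROW_BYTES      128              /* pretend per-row footprint */

int
main(void)
{
    size_t  mem_used = 0;
    long    rows_spilled = 0;
    FILE   *spill_file = NULL;

    for (long i = 0; i < 10000; i++)
    {
        if (mem_used + ROW_BYTES <= WORK_MEM_BYTES)
        {
            /* stand-in for adding the row to the in-memory buffer */
            mem_used += ROW_BYTES;
        }
        else
        {
            /* budget reached: write further rows to a temp file instead of
             * exceeding the limit (stand-in for going external) */
            if (spill_file == NULL)
            {
                spill_file = tmpfile();
                if (spill_file == NULL)
                {
                    perror("tmpfile");
                    return 1;
                }
            }
            fprintf(spill_file, "row %ld\n", i);
            rows_spilled++;
        }
    }

    printf("kept %zu bytes in memory, spilled %ld rows\n",
           mem_used, rows_spilled);

    if (spill_file)
        fclose(spill_file);
    return 0;
}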

Now, PG uses memory for things other than what's captured in "working
memory," so there is probably still some value in a (1) that kills the
query or backend.

> The first challenge I envision is that without any feedback (either to
> the planner or executor), it may break queries quite easily. It just
> takes the planner to add one more sort / hash join / materialize (which
> it can do arbitrarily, as it has no concept of the memory limit), and
> now the query can't run.

Yes -- the planner figures it can allocate another work_mem [*
hash_mem_multiplier] for any new sort / hash join / materialize. It
doesn't know about per-backend memory limits.

That's why (2) is so nice: it tells the executor (but, to keep
things simple, not the planner) how much working memory it *actually*
has available. And then the executor will (try to) spill rather than
exceed that actual limit.
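
To make the arithmetic concrete: suppose the plan has five memory-hungry
nodes, the planner assumed work_mem = 64 MB for each, and the backend's
working-memory budget is 200 MB. One obvious (purely illustrative) policy is
to shrink every node's allowance proportionally at executor startup -- the
standalone toy below is just that arithmetic, not a patch, and the budget
value and scaling policy are placeholders:

#include <stdio.h>

int
main(void)
{
    /* planner assumed work_mem = 64 MB for each memory-hungry node */
    double  planned_mb[] = {64, 64, 64, 64, 64};    /* e.g. 2 sorts + 3 hashes */
    int     nnodes = 5;
    double  budget_mb = 200;    /* hypothetical per-backend working-memory budget */

    double  planned_total = 0;
    for (int i = 0; i < nnodes; i++)
        planned_total += planned_mb[i];

    /* if the plan doesn't fit, shrink every node's allowance proportionally;
     * each node then spills when it reaches its (smaller) allowance */
    double  scale = (planned_total > budget_mb) ? budget_mb / planned_total : 1.0;

    for (int i = 0; i < nnodes; i++)
        printf("node %d: planner assumed %.0f MB, executor allows %.1f MB\n",
               i, planned_mb[i], planned_mb[i] * scale);

    return 0;
}

Whether to scale proportionally, or to give bigger shares to the nodes the
planner expects to need the most memory, is exactly the kind of policy
question (2) leaves open.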

> And secondly, there are allocations that we don't restrict by work_mem,
> but this memory limit would include them, ofc.

I'm fairly neutral on this. The example you gave, of BufFile arrays,
should be included in the working-memory calculation, but there will
always be other sources of memory that go unaccounted for, and I don't
have a strong opinion on how to deal with those cases.

> That's a preexisting issue, of course. But wouldn't this simple limit
> make the situation worse? The query would likely complete OK (otherwise
> we'd get many more reports about OOM), but with the new limit it would
> probably fail with OOM.

If I understand your point, you're saying that killing a query due to
OOM is OK if the "OOM" is based on the amount of available *global*
memory going to 0, but you're not comfortable doing this when
*per-backend* memory goes to 0.

I think you're right. A per-backend limit is nice because it can be
distributed to individual operators as per-operator working memory,
and the operators will then just spill. But killing a query based on an
approximation of how much memory we have available seems like an
overreaction.

Thanks,
James


