Re: Getting out ahead of OOM - Mailing list pgsql-admin
From | Joe Conway |
---|---|
Subject | Re: Getting out ahead of OOM |
Date | |
Msg-id | efcca885-a954-43e2-99b8-8b993678c72c@joeconway.com |
In response to | Re: Getting out ahead of OOM (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Getting out ahead of OOM |
List | pgsql-admin |
On 3/7/25 14:26, Tom Lane wrote:
> Joseph Hammerman <joe.hammerman@datadoghq.com> writes:
>> We run Postgres in a Kubernetes environment, and we have not to date been
>> able to convince our Compute team to create a class of Kubernetes hosts
>> that have memory overcommit disabled.
>
> :-(
>
>> Has anyone had success tracking all the Postgres memory allocation
>> configurables and using that to administratively prevent OOMing?
>
> I doubt anyone has tried that. I would look into whether running
> the postmaster under a suitable ulimit helps. I seem to recall
> discussions that in Linux, "ulimit -v" works better than the other
> likely-looking options. But that might be stale information.

The problem with ulimit is that it is per process, but within a Kubernetes pod the memory accounting covers all of the pod's processes.

>> Alternatively, has anyone had success implementing an extension or periodic
>> process to monitor the memory consumption of the Postgres children and
>> killing them before the OOM event occurs?
>
> That's not going to be noticeably nicer than the kernel-induced
> OOM, I think. The one thing it might do for you is ensure that
> the kill happens to a child process and not the postmaster; but
> you can already use PG_OOM_ADJUST_VALUE and PG_OOM_ADJUST_FILE
> to manage that if it's a problem. (Recent kernels are alleged
> to usually do the right thing without that, though.)

Actually the problem here is likely that the Kubernetes Postgres pod was started with a memory limit. Disabling memory overcommit at the host level will not help you if there is a memory limit set for the pod, because that in turn sets memory.limit for the cgroup related to the pod, and the oom killer will strike when memory.usage_in_bytes exceeds that value, irrespective of the free memory at the host level. In these cases the oom_score_adj values don't end up mattering much.
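To illustrate the cgroup accounting described above, here is a minimal Python sketch that reads a cgroup's memory usage and limit and reports how close the cgroup is to its limit. It assumes the standard kernel file names (memory.current and memory.max under cgroup v2, memory.usage_in_bytes and memory.limit_in_bytes under cgroup v1). The function names and the directory argument are hypothetical, chosen for illustration only.

```python
from pathlib import Path
from typing import Optional, Tuple

# (usage file, limit file) pairs: cgroup v2 names first, then cgroup v1.
_CANDIDATES = (
    ("memory.current", "memory.max"),
    ("memory.usage_in_bytes", "memory.limit_in_bytes"),
)


def read_cgroup_memory(cgroup_dir: str) -> Optional[Tuple[int, Optional[int]]]:
    """Return (usage_bytes, limit_bytes) for one cgroup directory.

    limit_bytes is None when cgroup v2 reports "max" (no limit set).
    Returns None if neither the v2 nor the v1 file pair is present.
    """
    base = Path(cgroup_dir)
    for usage_name, limit_name in _CANDIDATES:
        usage_path = base / usage_name
        limit_path = base / limit_name
        if usage_path.is_file() and limit_path.is_file():
            usage = int(usage_path.read_text().strip())
            raw_limit = limit_path.read_text().strip()
            limit = None if raw_limit == "max" else int(raw_limit)
            return usage, limit
    return None


def usage_fraction(cgroup_dir: str) -> Optional[float]:
    """Fraction of the cgroup memory limit in use, or None if unlimited.

    Under cgroup v1 an unset limit shows up as a very large number, so the
    fraction is simply close to zero in that case.
    """
    reading = read_cgroup_memory(cgroup_dir)
    if reading is None:
        return None
    usage, limit = reading
    if not limit:
        return None
    return usage / limit
```

Inside a container the pod's own cgroup is often visible at /sys/fs/cgroup (v2) or /sys/fs/cgroup/memory (v1), but that path is an assumption that depends on how the container runtime mounts it. The number covers every process in the cgroup, which is why a per-process ulimit does not map onto it.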
This is a fairly complex topic -- I wrote a blog a few years ago which may or may not be out of date at this point:

https://www.crunchydata.com/blog/deep-postgresql-thoughts-the-linux-assassin

Additionally Jeremy Schneider wrote a more recent one that you might find helpful:

https://ardentperf.com/2024/09/22/kubernetes-requests-and-limits-for-postgres/

My quick and dirty recommendations:

1. Use cgroup v2 on the host if at all possible
2. Do not under any circumstances disable swap on the host. This is an
   anti-pattern unfortunately followed widely the last time I looked.
3. If nothing else, avoid setting a memory.limit on the cgroup. That will
   at least get you back to not getting whacked unless there is host level
   memory pressure. The blogs discuss how to do that with Kube pod settings.

HTH,

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
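As a rough way to check recommendations 1 and 2 on a given host, the following Python sketch tests whether the cgroup mount looks like cgroup v2 and whether any swap is configured. It assumes the usual Linux conventions: a cgroup v2 root contains a cgroup.controllers file, and /proc/meminfo has a SwapTotal line in kB. The function names are hypothetical.

```python
from pathlib import Path
from typing import Optional


def is_cgroup_v2(cgroup_root: str = "/sys/fs/cgroup") -> bool:
    """True if cgroup_root looks like a cgroup v2 (unified) hierarchy."""
    # The v2 root exposes cgroup.controllers; v1 controller mounts do not.
    return (Path(cgroup_root) / "cgroup.controllers").is_file()


def swap_total_kib(meminfo_text: str) -> Optional[int]:
    """Parse the SwapTotal value (in kB) from /proc/meminfo-style text.

    Returns None if no SwapTotal line is present.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Line format: "SwapTotal:     2097148 kB"
            return int(line.split()[1])
    return None
```

On a Linux host, calling is_cgroup_v2() and swap_total_kib(Path("/proc/meminfo").read_text()) gives the two answers. A SwapTotal of 0 means swap is disabled.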