Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet - Mailing list pgsql-performance

From Joe Conway
Subject Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet
Date
Msg-id f35cdf25-411f-4cbc-a877-85569cf78061@joeconway.com
Whole thread Raw
In response to Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet  (Priya V <mailme0216@gmail.com>)
Responses Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet
List pgsql-performance
(Both: please trim and reply inline on these lists as I have done;
  Frits, please reply all not just to the list -- I never received your
  reply to me)

On 8/6/25 11:51, Priya V wrote:
> *cat /proc/sys/vm/overcommit_ratio*
> 50
> $ *cat /proc/sys/vm/swappiness*
> 60
> 
> *Workload*: Multi-tenant PostgreSQL
> 
> *uname -r*
> 4.18.0-477.83.1.el8_8.x86_64

IMHO you should strongly consider getting on a more recent distro with a 
newer kernel.

> *free -h*
> total used free shared buff/cache available
> Mem: 249Gi 4.3Gi 1.7Gi 22Gi 243Gi 221Gi
> Swap: 0B 0B 0B

As I said, do not disable swap. You don't need a huge amount, but maybe 
16 GB or so would do it.

> if we set overcommit_memory = 2, what should we set the 
> overcommit_ration value to ? Can you pls suggest ?
> Is there a rule of thumb to go with ?

There is no rule of thumb that I am aware of. Every workload is 
different. Start with something like 80 and do your own testing to 
refine that number.

> *Our goal is to not run into OOM issues, no memory wastage and also not 
> starve kernel ? *

With overcommit_memory = 2, swap on (and reasonably sized), and 
overcommit_ratio to something reasonable (certainly below 100), I think 
you will have a difficult time getting an OOM kill even if you try 
during testing. But you have to do your own testing for your workloads 
-- there is no magic button here.

That is, unless you run postgres in a cgroup with memory.limit (cgroup 
v1) or memory.max (cgroup v2) set. Note, running in containers with 
memory limits set e.g. via Kubernetes will do that under the covers. 
That is a completely different story.

> On Wed, Aug 6, 2025 at 3:47 AM Frits Hoogland <frits.hoogland@gmail.com 
> <mailto:frits.hoogland@gmail.com>> wrote:
>     Can you name any technical reason why not having swap for a database
>     is an actual bad idea?

Did you read the blog I linked? Do your own experiments.

* Swap is what is used when anonymous memory must be reclaimed to allow 
for an allocation of anonymous memory.

* The Linux kernel will aggressively use all available memory for file 
buffers, pushing usage against the limits.

* Especially in the older 4 series kernels, file buffers often cannot be 
reclaimed fast enough

* With no swap and a large-ish anonymous memory request, it is easy to 
push over the limit to cause the OOM killer to strike.

* On the other hand, with swap enabled anon memory can be reclaimed 
giving the kernel more time to deal with file buffer reclamation.

At least that is what I have observed.

HTH,

-- 
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com



pgsql-performance by date:

Previous
From: Priya V
Date:
Subject: Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet
Next
From: Frits Hoogland
Date:
Subject: Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet