Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet - Mailing list pgsql-performance
From: Priya V
Subject: Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet
Msg-id: CAFsZ43xF00OHNg6d8iD0zd0OCpWe8+r9p4A+HAupo0jxOdYgpQ@mail.gmail.com
In response to: Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet (Frits Hoogland <frits.hoogland@gmail.com>)
List: pgsql-performance
Hi Frits, Joe,
Thank you both for your insights.
Current situation:

$ cat /proc/sys/vm/overcommit_memory
0
$ cat /proc/sys/vm/overcommit_ratio
50
$ cat /proc/sys/vm/swappiness
60
Workload: Multi-tenant PostgreSQL
$ uname -r
4.18.0-477.83.1.el8_8.x86_64
$ free -h
              total        used        free      shared  buff/cache   available
Mem:          249Gi       4.3Gi       1.7Gi        22Gi       243Gi       221Gi
Swap:            0B          0B          0B
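For completeness, the kernel's own commit accounting can be read from /proc/meminfo (the CommitLimit shown is just what swap + 50% of this host's RAM works out to at the current defaults, and is only informational while overcommit_memory=0; Committed_AS varies moment to moment):

$ grep -E '^Commit' /proc/meminfo
CommitLimit:   130547712 kB
Committed_AS:        ... kB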
If we set overcommit_memory = 2, what should we set the overcommit_ratio value to? Can you please suggest?
Is there a rule of thumb to go with?
Our goal is to avoid OOM issues without wasting memory or starving the kernel.
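For reference, my understanding of how the limit is computed under overcommit_memory = 2 (ignoring huge pages, which are subtracted from MemTotal first):

CommitLimit = swap + (overcommit_ratio / 100) * MemTotal

So on a 249Gi host with no swap:

ratio 50 → ~124Gi committable (the current default)
ratio 80 → ~199Gi committable
ratio 90 → ~224Gi committable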
Thanks!
On Wed, Aug 6, 2025 at 3:47 AM Frits Hoogland <frits.hoogland@gmail.com> wrote:
Joe,

Can you name any technical reason why not having swap for a database is actually a bad idea?

Memory is always limited. Swap was invented for the situation where the (incidental) amount of paged-in memory could (regularly) grow beyond what physical memory allows, with swap as the (clear) workaround: it 'cushions' the memory shortage by providing a "second level" of memory storage on disk.

Still, this does not make memory unlimited. Swap extends the available physical memory by the amount of swap. There is still a situation where you can run out of memory with swap added, simply by paging in more memory than physical memory plus swap.

Today, most systems are not memory constrained anymore, or rather: it is possible to get a server with enough physical memory to hold your common total memory need. And given the latency-sensitive nature of databases in general, which includes postgres, for any serious deployment you should get a server with enough memory to host your workload, and configure postgres not to overload the memory.

If you do oversubscribe on (physical) memory, you will get pain somewhere; there is no way around that. The article in defence of swap is in essence saying that if you happen to oversubscribe on memory, sharing the pain between anonymous and file pages is better. I would say you are already in a bad place if that happens, which is especially bad for databases, and databases should allow you to make memory usage predictable.

However, what I found is that with 4+ kernels (4.18 to be precise; RHEL 8), the kernel can try to favour file pages in certain situations, making anonymous memory get paged out even if swappiness is set to 1 or 0 and there is a wealth of inactive file memory. It seems to have to do with workingset protection(?) mechanisms, but given the lack of clear statistics I can't be sure about that. What it leads to in my situations is a constant rate of swapping in and out, whilst there is no technical reason for linux to do so, because there is enough available memory.

My point of view has been that vm.overcommit_memory set to 2 was the way to go, because that allows linux to enforce a fixed limit at allocation time; it does guarantee that linux never runs out of memory, absolutely.

However, this limit is hard, and is applied to a process in both user mode and system mode (kernel level), and thus can refuse memory at times when it is not safe to do so, and thus corrupt execution. I have to be honest, I have not seen this myself, but trustworthy sources have reported it repeatedly, and I am inclined to believe them. This means postgres execution can corrupt/terminate in unlucky situations, which impacts availability.

Frits Hoogland

On 5 Aug 2025, at 20:52, Joe Conway <mail@joeconway.com> wrote:

On 8/5/25 13:01, Priya V wrote:

*Environment:*
*PostgreSQL Versions:* Mix of 13.13 and 15.12 (upgrades to 15.12 in progress; currently both are actively in use)
PostgreSQL 13 is end of life after November 13, 2025.

*OS / Kernel:* RHEL 7 & RHEL 8 variants, kernels in the 4.14–4.18 range
RHEL 7 has been EOL for quite a while now. Note that you have to watch out for collation issues/corrupted indexes after OS upgrades due to collations changing with newer glibc versions.

*Swap:* Currently none
bad idea

*Workload:* Highly mixed — OLTP-style internal apps with
unpredictable query patterns and connection counts
*Goal:* Uniform, safe memory settings across the fleet to avoid
kernel or database instability

We’re considering:
*|vm.overcommit_memory = 2|* for strict accounting
yes

Increasing |vm.overcommit_ratio| from 50 → 80 or 90 to better
reflect actual PostgreSQL usage (e.g., |work_mem| reservations that
aren’t fully used)
work_mem does not reserve memory -- it is a maximum that might be used in memory for a particular operation
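An easy way to see that behaviour (table and column names below are made up for illustration): each sort or hash node in a plan may use up to work_mem, so one query can consume several multiples of it, and nothing is held when it isn't needed:

$ psql -c "SET work_mem = '4MB'; EXPLAIN ANALYZE SELECT * FROM big_table ORDER BY some_col;"
...
  Sort Method: external merge  Disk: 812304kB
...

The "external merge" line means the sort spilled to disk once it hit work_mem; raise work_mem and the same plan reports an in-memory quicksort instead.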
*Our questions for those running large PostgreSQL fleets:*

1. What |overcommit_ratio| do you find safe for PostgreSQL without causing kernel memory crunches?
Read this:
https://www.cybertec-postgresql.com/en/what-you-should-know-about-linux-memory-overcommit-in-postgresql/

2. Do you prefer |overcommit_memory = 1| or |= 2| for production stability?
Use overcommit_memory = 2 for production stability.
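If you do go that route, one minimal way to make it persistent (the file name and the ratio value below are illustrative, not a recommendation for your exact number):

$ cat /etc/sysctl.d/99-postgres-overcommit.conf
vm.overcommit_memory = 2
vm.overcommit_ratio = 80
$ sudo sysctl --system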
3. How much swap (if any) do you keep in large-memory servers where PostgreSQL is the primary workload? Is having swap configured a good idea or not?
You don't necessarily need a large amount of swap, but you definitely should not disable it.
Some background on that:
https://chrisdown.name/2018/01/02/in-defence-of-swap.html
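Even a small swap file gives the kernel somewhere to park cold anonymous pages (size, path, and the swappiness value below are illustrative; a low swappiness is a common choice for database hosts):

$ sudo fallocate -l 8G /swapfile
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ sudo sysctl vm.swappiness=10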
4. Any real-world cases where kernel accounting was too strict or too loose for PostgreSQL?
In my experience the biggest issues are when postgres is running in a memory constrained cgroup. If you want to constrain memory with cgroups, use cgroup v2 (not 1) and use memory.high to constrain it, not memory.max.
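On a systemd-managed host that looks something like this (unit name illustrative; assumes cgroup v2 is mounted):

$ sudo systemctl set-property postgresql.service MemoryHigh=200G
# or written directly into the cgroup v2 tree:
$ echo 200G | sudo tee /sys/fs/cgroup/system.slice/postgresql.service/memory.high

memory.high throttles and reclaims when the service crosses the threshold instead of OOM-killing it outright, which is why it is preferable to memory.max here.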
5. What settings to go with if we are not planning on using swap?

IMHO do not disable swap on Linux, at least not on production, ever.

We’d like to avoid both extremes:
Too low a ratio → PostgreSQL backends failing allocations even with
free RAM
Have you actually seen this or are you theorizing?

Too high a ratio → OOM killer terminating PostgreSQL under load spikes
If overcommit_memory = 2, overcommit_ratio is reasonable (less than 100, maybe 80 or so as you suggested), swap is not disabled, and you are not running in a memory-constrained cgroup, I would be very surprised if you ever get hit by the OOM killer. And if you do, things are so bad the database was probably dying anyway.
HTH,
--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com