Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet - Mailing list pgsql-performance

From	Priya V
Subject	Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet
Date	August 5 20:01:19
Msg-id	CAFsZ43xFxjSiONwRccXBQXZrPRd+Lh7XAkSVEG1ai165xPcoDA@mail.gmail.com
Responses	Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet
List	pgsql-performance

Hello Postgres community,

We operate a large PostgreSQL fleet (~15,000 databases) on dedicated Linux hosts.
Each host runs multiple PostgreSQL instances (multi-instance setup, not just multiple DBs inside one instance).

Environment:

PostgreSQL Versions: Mix of 13.13 and 15.12 (upgrades to 15.12 are in progress; both versions are currently in active use)
OS / Kernel: RHEL 7 & RHEL 8 variants, kernels in the 4.14–4.18 range
RAM: 256 GiB (varies slightly)
Swap: Currently none
Workload: Highly mixed — OLTP-style internal apps with unpredictable query patterns and connection counts
Goal: Uniform, safe memory settings across the fleet to avoid kernel or database instability

We’re reviewing vm.overcommit_* settings because we’ve seen conflicting guidance:

vm.overcommit_memory = 2 gives predictability but can reject allocations early
vm.overcommit_memory = 1 is more flexible but risks OOM kills if many backends hit peak memory usage at once

We’re considering:

vm.overcommit_memory = 2 for strict accounting
Increasing vm.overcommit_ratio from 50 → 80 or 90 to better reflect actual PostgreSQL usage (e.g., work_mem reservations that aren’t fully used)
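
For reference, a minimal sketch of what these ratios would mean on our hosts. It assumes the commit limit formula from the kernel documentation for vm.overcommit_memory = 2, i.e. CommitLimit = (RAM - huge pages) * overcommit_ratio / 100 + swap. The function and parameter names are only illustrative, and the 256 GiB / no-swap figures come from the environment above.

```python
GIB = 1024 ** 3

def commit_limit_bytes(ram_bytes, overcommit_ratio, swap_bytes=0, hugetlb_bytes=0):
    """Approximate CommitLimit under vm.overcommit_memory = 2.

    Assumes the documented kernel formula:
    (RAM - huge pages) * overcommit_ratio / 100 + swap.
    """
    return (ram_bytes - hugetlb_bytes) * overcommit_ratio // 100 + swap_bytes

# 256 GiB of RAM, no swap, no huge pages (our current setup).
for ratio in (50, 80, 90):
    limit = commit_limit_bytes(256 * GIB, ratio)
    print(f"overcommit_ratio={ratio}: CommitLimit is about {limit / GIB:.1f} GiB")
```

Note that any memory set aside as huge pages is subtracted before the ratio is applied, so hosts that use huge pages for shared_buffers would end up with a lower limit than this simple calculation suggests.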

Our questions for those running large PostgreSQL fleets:

What overcommit_ratio do you find safe for PostgreSQL without causing kernel memory crunches?
Do you prefer overcommit_memory = 1 or = 2 for production stability?
How much swap (if any) do you keep on large-memory servers where PostgreSQL is the primary workload? Is configuring swap a good idea at all?
Any real-world cases where kernel accounting was too strict or too loose for PostgreSQL?
Which settings would you recommend if we do not plan to use swap?

We’d like to avoid both extremes:

Too low a ratio → PostgreSQL backends failing allocations even with free RAM
Too high a ratio → OOM killer terminating PostgreSQL under load spikes
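
To judge how close a host is to either extreme, one option is to compare Committed_AS against CommitLimit in /proc/meminfo. The sketch below is only an illustration: the helper names are made up, and the sample values are hypothetical, not measurements from our fleet. Both fields are reported in kB; CommitLimit is shown in every mode but is only enforced when vm.overcommit_memory = 2.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields

def commit_headroom_kib(fields):
    """Address space (in kB) still available before the commit limit is reached."""
    return fields["CommitLimit"] - fields["Committed_AS"]

# Hypothetical sample, NOT real data from our hosts.
SAMPLE = """\
MemTotal:       268435456 kB
SwapTotal:              0 kB
CommitLimit:    134217728 kB
Committed_AS:   100663296 kB
"""

sample_fields = parse_meminfo(SAMPLE)
headroom_gib = commit_headroom_kib(sample_fields) / (1024 * 1024)
print(f"commit headroom: {headroom_gib:.1f} GiB")
```

On a real host the same functions can be fed the contents of /proc/meminfo, for example parse_meminfo(open("/proc/meminfo").read()). A headroom that regularly approaches zero would point to the "too low a ratio" case.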

Any operational experiences, tuning recommendations, or kernel/PG interaction pitfalls would be very helpful.

TIA
