Re: Postgres Out Of Memory Crash - Mailing list pgsql-general

From Merlin Moncure
Subject Re: Postgres Out Of Memory Crash
Date
Msg-id CAHyXU0z0OaVkn=nKVQSo4V=0to=oxAPtG_N233Lf0Z9-ycW0Dg@mail.gmail.com
Whole thread Raw
In response to Postgres Out Of Memory Crash  (Avi Weinberg <AviW@gilat.com>)
List pgsql-general


On Thu, Nov 2, 2023 at 4:13 AM Avi Weinberg <AviW@gilat.com> wrote:

Hi experts,

 

I'm using Patroni Postgres installation and noticed that twice already postgres crashed due to out of memory.  I'm using logical replication with around 30-40 active subscribers on this machine.  The machine has 128GB but only 32GB is allocated to Postgres.  How can I know what is actually causing the out of memory issue?  Is it caused by not optimal postgres configuration or something else?

 

    /usr/lib/postgresql/13/bin/postgres -D /home/postgres/pgdata/pgroot/data --config-file=/home/postgres/pgdata/pgroot/data/postgresql.conf --port=5432 --cluster_name=postgres-cluster --wal_level=logical --hot_standby=on --max_connections=533 --max_wal_senders=90 --max_prepared_transactions=0 --max_locks_per_transaction=64 --track_commit_timestamp=on --max_replication_slots=90 --max_worker_processes=30 --wal_log_hints=on


Couple things here.   You don't really allocate memory to postgres, rather you set up memory reserved for certain kinds of buffering operations via shared buffers, and other less important settings.   This memory is carefully managed, and is likely not underneath your oom condition. Ironically, increasing shared buffers can make your problem more likely to occur as you are taking memory from other tasks.

>  --max_connections=533 

Probably your problem is at least partially here.   This number was configured specifically, implying you are running out of connections and had to crank this value. 

If you have a lot of idle connections sitting around (say, if you have several application servers managing connection pools),  a sudden spike in memory load and/or connection utilization can cause this to occur.  Suggest one or more of:
* lowering shared buffers
* implementing pgbouncer and lowering max_connections
* increasing physical memory

Also,
> I'm using logical replication with around 30-40 active subscribers on this machine. 

Hm.  Have you considered streaming replica setup, so that you can attach read only processes to the standby?

merlin

pgsql-general by date:

Previous
From: Matthias Leisi
Date:
Subject: Cluster for an appliance-type deployment
Next
From: jian he
Date:
Subject: procedure string constant is parsed at procedure create time.