Here is an updated patch for tracking Postgres memory usage.
In this new patch, Postgres “reserves” memory, first by updating process-private counters, and then eventually by updating global counters. If the new GUC variable “max_total_memory” is set, reservations exceeding the limit are turned down and treated as though the kernel had reported an out of memory error.
Postgres memory reservations come from multiple sources.
- Malloc calls made by the Postgres memory allocators.
- Static shared memory created by the postmaster at server startup,
- Dynamic shared memory created by the backends.
- A fixed amount (1Mb) of “initial” memory reserved whenever a process starts up.
Each process also maintains an accurate count of its actual memory allocations. The process-private variable “my_memory” holds the total allocations for that process. Since there can be no contention, each process updates its own counters very efficiently.
Pgstat now includes global memory counters. These shared memory counters represent the sum of all reservations made by all Postgres processes. For efficiency, these global counters are only updated when new reservations exceed a threshold, currently 1 Mb for each process. Consequently, the global reservation counters are approximate totals which may differ from the actual allocation totals by up to 1 Mb per process.
The max_total_memory limit is checked whenever the global counters are updated. There is no special error handling if a memory allocation exceeds the global limit. That allocation returns a NULL for malloc style allocations or an ENOMEM for shared memory allocations. Postgres has existing mechanisms for dealing with out of memory conditions.
For sanity checking, pgstat now includes the pg_backend_memory_allocation view showing memory allocations made by the backend process. This view includes a scan of the top memory context, so it compares memory allocations reported through pgstat with actual allocations. The two should match.
Two other views were created as well. pg_stat_global_memory_tracking shows how much server memory has been reserved overall and how much memory remains to be reserved. pg_stat_memory_reservation iterates through the memory reserved by each server process. Both of these views use pgstat’s “snapshot” mechanism to ensure consistent values within a transaction.
Performance-wise, there was no measurable impact with either pgbench or a simple “SELECT * from series” query.