On 9/30/24 04:32, DBA wrote:
>
> 1. A description of what you are trying to achieve and what results you
> expect :
>
>
> We set shared_buffer = 70GB, max_connections = 300, work_mem = 300MB.
>
>
> Our setup is as above. When monitored by TOP CPU, we confirmed that RES
> uses up to 500MB in one postgres session.
>
> The session memory was initially a low number and then gradually
> increased to 500MB.
>
> We wonder what the contents of session memory have, and wonder if there
> is a way to solve this.
>
> Also, considering our server specification, if things like
> shared_buffer or work_mem need to be adjusted.
>
> I'm inquiring because I often use swap due to lack of memory due to a
> large amount of session memory.
>
I think you may be confused about what RSS means. It is not "private"
session memory, dedicated to the single backend process. For example, as
the processes "touch" shared memory, that'll be counted in RSS too.
If you want to look closer, you can look at "smaps" for each process in
/proc/$PID/smaps, which has info about all the memory. For one of "my"
backends I see this entry:
7f4fecf42000-7f50f7000000 rw-s 00000000 00:01 3126
/dev/zero (deleted)
Size: 4358904 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 1622120 kB
Pss: 323476 kB
Pss_Dirty: 323476 kB
Shared_Clean: 0 kB
Shared_Dirty: 1622120 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 1622120 kB
...
which is clearly shared memory (shared buffers, actually), but it adds
1.6GB to RSS.
It is a bit weird / surprising, but that's what Linux does.
Anyway, I agree shared buffers 70GB is way too high - I'd reduce that to
maybe 8GB and only increase that if cache hit ratio is below 0.95 or so.
Chances are this alone will fix the OOM.
If not, try reducing work_mem. If you're doing OLTP queries, those
likely don't need wm=300MB (why did you set this value?). For OLAP
queries 300MB might make sense, but then maybe you shouldn't have 300 of
them.
If this doesn't help, you need to investigate if there's one query using
too much memory, or if it's simply a the total memory usage. You can
either monitor the system, but the OOM killer should have also logged
stas about the killed process (how much memory it used etc.).
Also, check the memory overcommit setting.
regards
--
Tomas Vondra