
From: Tomas Vondra
Subject: Re: what are the things that occupy the session memory.
Msg-id: 6962bb55-37fd-4b8d-83ca-822d2087b1a2@vondra.me
List: pgsql-bugs

On 9/30/24 04:32, DBA wrote:
> 
> 1. A description of what you are trying to achieve and what results you
> expect:
> 
> 
> We set shared_buffers = 70GB, max_connections = 300, work_mem = 300MB.
> 
> 
> Our setup is as above. When monitoring with top, we see that RES for a
> single postgres session grows to as much as 500MB.
> 
> The session memory starts out low and then gradually increases to 500MB.
> 
> We wonder what the session memory contains, and whether there is a way
> to reduce it.
> 
> Also, considering our server specification, whether settings like
> shared_buffers or work_mem need to be adjusted.
> 
> I'm asking because the server often swaps due to lack of memory, caused
> by the large amount of session memory.
> 

I think you may be confused about what RSS means. It is not "private"
session memory, dedicated to the single backend process. For example, as
a process "touches" pages of shared memory, those pages get counted in
its RSS too.
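
If you want to see what the main shared memory segment itself contains,
PostgreSQL 13 and later expose that through the pg_shmem_allocations
view. A minimal sketch (the LIMIT is arbitrary):

  -- largest allocations in the main shared memory segment
  SELECT name, pg_size_pretty(allocated_size) AS allocated
    FROM pg_shmem_allocations
   ORDER BY allocated_size DESC
   LIMIT 10;

With your settings, "Buffer Blocks" (the 70GB of shared buffers) should
dominate the list.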

If you want to look closer, check "smaps" for each process in
/proc/$PID/smaps, which has details about every memory mapping. For one
of "my" backends I see this entry:

  7f4fecf42000-7f50f7000000 rw-s 00000000 00:01 3126
                                                 /dev/zero (deleted)
  Size:            4358904 kB
  KernelPageSize:        4 kB
  MMUPageSize:           4 kB
  Rss:             1622120 kB
  Pss:              323476 kB
  Pss_Dirty:        323476 kB
  Shared_Clean:          0 kB
  Shared_Dirty:    1622120 kB
  Private_Clean:         0 kB
  Private_Dirty:         0 kB
  Referenced:      1622120 kB
  ...

which is clearly shared memory (the shared buffers segment, actually),
but it adds 1.6GB to RSS. Note the Pss (proportional set size) line,
which divides shared pages evenly among the processes mapping them -
summing Pss over all backends gives a much more realistic total.

It is a bit weird / surprising, but that's what Linux does.

Anyway, I agree shared_buffers = 70GB is way too high - I'd reduce that
to maybe 8GB and only increase it if the cache hit ratio drops below
0.95 or so. Chances are this alone will fix the OOM.
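
If you're not tracking the hit ratio already, it can be computed from
the pg_stat_database view; a simple sketch for the current database:

  -- buffer cache hit ratio for the current database
  SELECT blks_hit::float / nullif(blks_hit + blks_read, 0) AS hit_ratio
    FROM pg_stat_database
   WHERE datname = current_database();

Note that blks_read counts blocks not found in shared buffers, so reads
satisfied by the OS page cache show up as "misses" in this metric.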

If not, try reducing work_mem. If you're doing OLTP queries, those
likely don't need work_mem = 300MB (why did you set this value?). For
OLAP queries 300MB might make sense, but then maybe you shouldn't have
300 of them. Also, keep in mind a single query can use a multiple of
work_mem - each sort/hash node can allocate up to that amount.
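
If you have a mix of workloads, one option is to keep the instance-wide
value modest and raise it only where needed, e.g. per role (the role
name here is just a placeholder):

  -- keep the instance-wide default modest
  ALTER SYSTEM SET work_mem = '32MB';
  SELECT pg_reload_conf();

  -- give only the reporting role the large value
  ALTER ROLE reporting_user SET work_mem = '300MB';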

If this doesn't help, you need to investigate whether there's one query
using too much memory, or whether it's simply the total memory usage.
You can monitor the system yourself, but the OOM killer should also
have logged stats about the killed process (how much memory it used
etc.) in the kernel log.
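
On PostgreSQL 14 and later you can also inspect a backend's memory
contexts directly - for your own session via a view, or for another
backend by dumping them to the server log (the PID here is just an
example):

  -- largest memory contexts of the current session
  SELECT name, pg_size_pretty(total_bytes) AS total
    FROM pg_backend_memory_contexts
   ORDER BY total_bytes DESC
   LIMIT 10;

  -- ask another backend (e.g. PID 12345) to dump its contexts to the log
  SELECT pg_log_backend_memory_contexts(12345);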

Also, check the memory overcommit setting - with vm.overcommit_memory
set to 2, allocations fail cleanly instead of invoking the OOM killer.


regards

-- 
Tomas Vondra



