Re: need help on memory allocation - Mailing list pgsql-performance

From Jeff Janes
Subject Re: need help on memory allocation
Date
Msg-id CAMkU=1zG0EP36=rGWCwrCO6OLCGJdcCuUbv7yDj5AnqvpRZ=Ug@mail.gmail.com
Whole thread Raw
In response to Re: need help on memory allocation  (Rambabu V <ram.wissen@gmail.com>)
List pgsql-performance
On Tue, Jan 23, 2018 at 5:59 AM, Rambabu V <ram.wissen@gmail.com> wrote:

> > cat PostgreSQL-2018-01-23_060000.csv|grep FATAL

What about ERROR, not just FATAL?  Or grep for "out of memory"



$ free -mh
             total       used       free     shared    buffers     cached
Mem:           58G        58G       358M        16G       3.6M        41G
-/+ buffers/cache:        16G        42G
Swap:         9.5G       687M       8.9G

This does not seem like it should be a problem.  Is this data collected near the time of the failure?
 
work_mem = 256MB # min 64kB
max_connections = 600

These look pretty high, especially in combination.  Why do you need that number of connections?  Could you use a connection pooler instead?  Or do just have an application bug (leaked connection handles) that needs to be fixed?  Why do you need that amount of work_mem?
 
ps -ef|grep postgres|grep idle|wc -l
171

ps -ef|grep postgres|wc -l
206

How close to the time of the problem was this recorded?  How many of the idle are 'idle in transaction'?


    PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
 109063 postgres   20   0 16.7G 16.4G 16.3G S  0.0 27.8 39:55.61 postgres: test sss 10.20.2.228(55174) idle
  24910 postgres   20   0 16.7G 16.4G 16.3G S  0.0 27.8 27:45.35 postgres: testl sss 10.20.2.228(55236) idle
 115539 postgres   20   0 16.7G 16.4G 16.3G S  0.0 27.8 28:22.89 postgres: test sss 10.20.2.228(55184) idle
   9816 postgres   20   0 16.7G 16.4G 16.3G S  0.0 27.8 40:19.57 postgres: test sss   10.20.2.228(55216) idle
 
How close to the time of the problem was this recorded?  Nothing here seems to be a problem, because almost all the memory they have resident is shared memory.
 
It looks like all your clients decide to run a memory hungry query simultaneously, consume a lot of work_mem, and cause a problem.  Then by the time you notice the problem and start collecting information, they are done and things are back to normal.

Cheers,

Jeff

pgsql-performance by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: need help on memory allocation
Next
From: Claudio Freire
Date:
Subject: Re: 8.2 Autovacuum BUG ?