Thread: checkpoint process use too much memory

checkpoint process use too much memory

From
jian xu
Date:

Hello,

            My pg version is 10.3, last week my user got error :

OperationalError: (psycopg2.OperationalError) could not fork new process for connection: Cannot allocate memory
could not fork new process for connection: Cannot allocate memory
 
when I checked the database, I found checkpoint process used too much memory, htop showed
VIRT: 171GB
RES: 171GB
SHR:4176MB
MEM%: 70%
Command: Posgres: checkpointer process
 
The postgresq.conf is
Shared_buffers = 4GB
Work_mem = 128MB
Maintenance_work_mem = 2GB
Wal_level = replica
Wal_buffer = 16MB
Checkpoint_completion_target = 0.7
 
How can I troubleshooting this error? Why checkpoint process used so much memory? it sounds like a memory leak issue in checkpointer process. 
Thanks for your help!
James

Re: checkpoint process use too much memory

From
Laurenz Albe
Date:
On Sun, 2020-04-19 at 13:56 +0000, jian xu wrote:
> My pg version is 10.3, last week my user got error :
> 
> OperationalError: (psycopg2.OperationalError) could not fork new process for connection: Cannot allocate memory
> could not fork new process for connection: Cannot allocate memory
>  
> when I checked the database, I found checkpoint process used too much memory, htop showed
> VIRT: 171GB
> RES: 171GB
> SHR:4176MB
> MEM%: 70%
> Command: Posgres: checkpointer process
>  
> The postgresq.conf is
> Shared_buffers = 4GB
> Work_mem = 128MB
> Maintenance_work_mem = 2GB
> Wal_level = replica
> Wal_buffer = 16MB
> Checkpoint_completion_target = 0.7
>  
> How can I troubleshooting this error? Why checkpoint process used so much memory? it sounds like a memory leak issue
incheckpointer process. 
 

That is probably a red herring, because the memory reported for the checkpointer
is shared memory that all PostgreSQL processes access.  Utilities like "ps"
show shared memory for all processes that access it, so they don't show the truth.

A better value would be the sum of all the Private_* entries in /proc/<pid>/smaps,
which shows only the private memory.

Not that the error does not necessarily mean that you are out of memory.
It might as well indicate that you are out of open file descriptors for the user
(see /proc/<pid>/limits).

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: checkpoint process use too much memory

From
jian xu
Date:
Thanks Laurenz. that' very helpful!
when we saw this issue, we restarted the pg service. so we lost the process info right now. 
However, we saw the server run out of the memory at that time, the free memory size was very low.
As you said, if this was a red herring, how could pg use so much memory? 
the shared_buffer is 4GB, there were 200 user connections , the work_mem is 128MB, so they could use mostly 25GB memory,
so shared_buffer(4GB) + user connections(25GB) = 29GB. why did htop RES show 171GB?
I am also wondering if this issue recurs, what  information shall I collect, how to figure out which part of pg use most of the memory?
any hint or tool is available?
thanks for your help
James


From: Laurenz Albe <laurenz.albe@cybertec.at>
Sent: Monday, April 20, 2020 5:06
To: jian xu <jamesxu@outlook.com>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
 
On Sun, 2020-04-19 at 13:56 +0000, jian xu wrote:
> My pg version is 10.3, last week my user got error :
>
> OperationalError: (psycopg2.OperationalError) could not fork new process for connection: Cannot allocate memory
> could not fork new process for connection: Cannot allocate memory

> when I checked the database, I found checkpoint process used too much memory, htop showed
> VIRT: 171GB
> RES: 171GB
> SHR:4176MB
> MEM%: 70%
> Command: Posgres: checkpointer process

> The postgresq.conf is
> Shared_buffers = 4GB
> Work_mem = 128MB
> Maintenance_work_mem = 2GB
> Wal_level = replica
> Wal_buffer = 16MB
> Checkpoint_completion_target = 0.7

> How can I troubleshooting this error? Why checkpoint process used so much memory? it sounds like a memory leak issue in checkpointer process.

That is probably a red herring, because the memory reported for the checkpointer
is shared memory that all PostgreSQL processes access.  Utilities like "ps"
show shared memory for all processes that access it, so they don't show the truth.

A better value would be the sum of all the Private_* entries in /proc/<pid>/smaps,
which shows only the private memory.

Not that the error does not necessarily mean that you are out of memory.
It might as well indicate that you are out of open file descriptors for the user
(see /proc/<pid>/limits).

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

Re: checkpoint process use too much memory

From
Scott Ribe
Date:
> On Apr 20, 2020, at 11:52 AM, jian xu <jamesxu@outlook.com> wrote:
>
> the free memory size was very low

Remember that the OS will fill memory with file cache until something else needs it.


Re: checkpoint process use too much memory

From
jian xu
Date:
Thanks Scott. Do you mean the file cache will be counted in postgres checkpointer memory usage?
The htop RES is 171GB, do you mean most of the memory are used by file cache, and counted as htop RES size? 
thanks
James

From: Scott Ribe <scott_ribe@elevated-dev.com>
Sent: Monday, April 20, 2020 14:25
To: jian xu <jamesxu@outlook.com>
Cc: Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
 
> On Apr 20, 2020, at 11:52 AM, jian xu <jamesxu@outlook.com> wrote:
>
> the free memory size was very low

Remember that the OS will fill memory with file cache until something else needs it.

Re: checkpoint process use too much memory

From
Scott Ribe
Date:
> On Apr 20, 2020, at 1:13 PM, jian xu <jamesxu@outlook.com> wrote:
>
> Do you mean the file cache will be counted in postgres checkpointer memory usage?

No, it won't be counted as PG memory. I just mean that free memory low is not a sign of a problem, when it's file
cache.



Re: checkpoint process use too much memory

From
MichaelDBA
Date:
You said: "the work_mem is 128MB, so they could use mostly 25GB memory,"

But work_mem is sometimes very deceiving based on types of queries executed.  You may need many work_mem buffers for a single query. So 8 buffers used = 1GB.  So it could conceivable surpass 25GB easily.  I would do some explaining on your SQL workload.

Regards,
Michael Vitale



Scott Ribe wrote on 4/20/2020 3:23 PM:
On Apr 20, 2020, at 1:13 PM, jian xu <jamesxu@outlook.com> wrote:

Do you mean the file cache will be counted in postgres checkpointer memory usage?
No, it won't be counted as PG memory. I just mean that free memory low is not a sign of a problem, when it's file cache.



Re: checkpoint process use too much memory

From
jian xu
Date:
Good Point, Thanks Michael. Next time I will check the memory usage of other working processes.
looks like it is hard to troubleshooting the postgresql memory usage problem.
Some memory are shared, some are private, some are physical memory, some are virtual memory. 
and the ps command shows the misleading memory usage of postgresql process, 
is there any tool show memory usage for each postgresql process more clearly? 
thanks,
James

From: MichaelDBA <MichaelDBA@sqlexec.com>
Sent: Monday, April 20, 2020 15:37
To: Scott Ribe <scott_ribe@elevated-dev.com>
Cc: jian xu <jamesxu@outlook.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
 
You said: "the work_mem is 128MB, so they could use mostly 25GB memory,"

But work_mem is sometimes very deceiving based on types of queries executed.  You may need many work_mem buffers for a single query. So 8 buffers used = 1GB.  So it could conceivable surpass 25GB easily.  I would do some explaining on your SQL workload.

Regards,
Michael Vitale



Scott Ribe wrote on 4/20/2020 3:23 PM:
On Apr 20, 2020, at 1:13 PM, jian xu <jamesxu@outlook.com> wrote:

Do you mean the file cache will be counted in postgres checkpointer memory usage?
No, it won't be counted as PG memory. I just mean that free memory low is not a sign of a problem, when it's file cache.



Re: checkpoint process use too much memory

From
jian xu
Date:
hello, 
      we are seeing this issue again, the checkpoint process is eating up all the memory, 
based on the htop, the checkpointer process has
VIRT: 103G
RES: 103G
SHR: 4099M 
MEM%: 41%

I use ps_mem.py to check the process, 
=================================
 Private  +   Shared  =  RAM used Program

 99.5 GiB +  37.5 MiB =  99.5 GiB postgres
---------------------------------
                                           99.5 GiB
=================================
my pg version is 10.3 , shared_buffer size is 4GB. work_mem 64MB. there are about 120 connections on the db side. most of them are idle. 
does anyone have idea how to troubleshooting this kind of issue? 
thank you
James

From: jian xu <jamesxu@outlook.com>
Sent: Tuesday, April 21, 2020 16:07
To: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>
Cc: Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
 
Good Point, Thanks Michael. Next time I will check the memory usage of other working processes.
looks like it is hard to troubleshooting the postgresql memory usage problem.
Some memory are shared, some are private, some are physical memory, some are virtual memory. 
and the ps command shows the misleading memory usage of postgresql process, 
is there any tool show memory usage for each postgresql process more clearly? 
thanks,
James

From: MichaelDBA <MichaelDBA@sqlexec.com>
Sent: Monday, April 20, 2020 15:37
To: Scott Ribe <scott_ribe@elevated-dev.com>
Cc: jian xu <jamesxu@outlook.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
 
You said: "the work_mem is 128MB, so they could use mostly 25GB memory,"

But work_mem is sometimes very deceiving based on types of queries executed.  You may need many work_mem buffers for a single query. So 8 buffers used = 1GB.  So it could conceivable surpass 25GB easily.  I would do some explaining on your SQL workload.

Regards,
Michael Vitale



Scott Ribe wrote on 4/20/2020 3:23 PM:
On Apr 20, 2020, at 1:13 PM, jian xu <jamesxu@outlook.com> wrote:

Do you mean the file cache will be counted in postgres checkpointer memory usage?
No, it won't be counted as PG memory. I just mean that free memory low is not a sign of a problem, when it's file cache.



Re: checkpoint process use too much memory

From
Alvaro Herrera
Date:
On 2020-Jul-20, jian xu wrote:

>       we are seeing this issue again, the checkpoint process is eating up all the memory,
> based on the htop, the checkpointer process has
> VIRT: 103G
> RES: 103G
> SHR: 4099M
> MEM%: 41%
> 
> I use ps_mem.py to check the process,
> https://github.com/pixelb/ps_mem
> =================================
>  Private  +   Shared  =  RAM used Program
> 
>  99.5 GiB +  37.5 MiB =  99.5 GiB postgres
> ---------------------------------
>                                            99.5 GiB
> =================================
> my pg version is 10.3 , shared_buffer size is 4GB. work_mem 64MB. there are about 120 connections on the db side.
mostof them are idle.
 
> does anyone have idea how to troubleshooting this kind of issue?

I see no reason for checkpointer to use that much memory.  Accounting
for the whole of shared_buffers is okay (as Laurenz explaijned), which
in other installations may account for several gigabytes of memory.  But
if shared_buffers is 4GB, then 99GB is clearly overboard.  I see no
explanation for checkpointer using 25x shared_buffers.

Maybe looking at /proc/<PID>/smaps would tell you where that memory is
going?  Have a look there are see if you can figure out some obviously
absurd entries.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: checkpoint process use too much memory

From
Tom Lane
Date:
jian xu <jamesxu@outlook.com> writes:
>       we are seeing this issue again, the checkpoint process is eating up all the memory,
> based on the htop, the checkpointer process has
> VIRT: 103G
> RES: 103G
> SHR: 4099M

hmmm ...

> my pg version is 10.3 ,

Really the *first* thing you ought to do is update to current (10.13)
and see if the problem is still there.  But if it is, maybe collecting
a memory context map in that process would be informative:

https://wiki.postgresql.org/wiki/Developer_FAQ#Examining_backend_memory_use

Note you may need a debug build for that recipe to work.

            regards, tom lane



Re: checkpoint process use too much memory

From
jian xu
Date:
Thanks Tom and Alvaro. I will upgrade it to 10.13 first. 
James


From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Monday, July 20, 2020 18:34
To: jian xu <jamesxu@outlook.com>
Cc: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
 
jian xu <jamesxu@outlook.com> writes:
>       we are seeing this issue again, the checkpoint process is eating up all the memory,
> based on the htop, the checkpointer process has
> VIRT: 103G
> RES: 103G
> SHR: 4099M

hmmm ...

> my pg version is 10.3 ,

Really the *first* thing you ought to do is update to current (10.13)
and see if the problem is still there.  But if it is, maybe collecting
a memory context map in that process would be informative:

https://wiki.postgresql.org/wiki/Developer_FAQ#Examining_backend_memory_use

Note you may need a debug build for that recipe to work.

                        regards, tom lane

Re: checkpoint process use too much memory

From
jian xu
Date:
btw, I checked /proc/<PID>/smaps. it looks like postgres process took most of the memory, the output is like:
===============================================================================
564415881000-56441588f000 rw-p 00857000 08:04 17682620                  postgresql/10/3/dist/bin/postgres
Size:                 56 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                  56 kB
Pss:                  56 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:        56 kB
Referenced:           56 kB
Anonymous:            56 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
VmFlags: rd wr mr mw me dw ac sd
5644162cf000-565d4c645000 rw-p 00000000 00:00 0                          [heap]
Size:           105745880 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:            105745692 kB
Pss:            105745692 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:  105745692 kB
Referenced:     104768404 kB
Anonymous:      105745692 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
VmFlags: rd wr mr mw me ac sd
===============================================================================
however, the shared_buffers size is only 4GB..
postgres=# show shared_buffers ;
 shared_buffers
----------------
 4GB

thanks,

James

From: jian xu <jamesxu@outlook.com>
Sent: Tuesday, July 21, 2020 9:20
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
 
Thanks Tom and Alvaro. I will upgrade it to 10.13 first. 
James


From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Monday, July 20, 2020 18:34
To: jian xu <jamesxu@outlook.com>
Cc: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
 
jian xu <jamesxu@outlook.com> writes:
>       we are seeing this issue again, the checkpoint process is eating up all the memory,
> based on the htop, the checkpointer process has
> VIRT: 103G
> RES: 103G
> SHR: 4099M

hmmm ...

> my pg version is 10.3 ,

Really the *first* thing you ought to do is update to current (10.13)
and see if the problem is still there.  But if it is, maybe collecting
a memory context map in that process would be informative:

https://wiki.postgresql.org/wiki/Developer_FAQ#Examining_backend_memory_use

Note you may need a debug build for that recipe to work.

                        regards, tom lane