Thread: checkpoint process use too much memory
Hello,
My pg version is 10.3, last week my user got error :
OperationalError: (psycopg2.OperationalError) could not fork new process for connection: Cannot allocate memory
could not fork new process for connection: Cannot allocate memory
when I checked the database, I found checkpoint process used too much memory, htop showed
VIRT: 171GB
RES: 171GB
SHR:4176MB
MEM%: 70%
Command: Posgres: checkpointer process
The postgresq.conf is
Shared_buffers = 4GB
Work_mem = 128MB
Maintenance_work_mem = 2GB
Wal_level = replica
Wal_buffer = 16MB
Checkpoint_completion_target = 0.7
How can I troubleshooting this error? Why checkpoint process used so much memory? it sounds like a memory leak issue in checkpointer process.
Thanks for your help!
James
On Sun, 2020-04-19 at 13:56 +0000, jian xu wrote: > My pg version is 10.3, last week my user got error : > > OperationalError: (psycopg2.OperationalError) could not fork new process for connection: Cannot allocate memory > could not fork new process for connection: Cannot allocate memory > > when I checked the database, I found checkpoint process used too much memory, htop showed > VIRT: 171GB > RES: 171GB > SHR:4176MB > MEM%: 70% > Command: Posgres: checkpointer process > > The postgresq.conf is > Shared_buffers = 4GB > Work_mem = 128MB > Maintenance_work_mem = 2GB > Wal_level = replica > Wal_buffer = 16MB > Checkpoint_completion_target = 0.7 > > How can I troubleshooting this error? Why checkpoint process used so much memory? it sounds like a memory leak issue incheckpointer process. That is probably a red herring, because the memory reported for the checkpointer is shared memory that all PostgreSQL processes access. Utilities like "ps" show shared memory for all processes that access it, so they don't show the truth. A better value would be the sum of all the Private_* entries in /proc/<pid>/smaps, which shows only the private memory. Not that the error does not necessarily mean that you are out of memory. It might as well indicate that you are out of open file descriptors for the user (see /proc/<pid>/limits). Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Thanks Laurenz. that' very helpful!
when we saw this issue, we restarted the pg service. so we lost the process info right now.
However, we saw the server run out of the memory at that time, the free memory size was very low.
As you said, if this was a red herring, how could pg use so much memory?
the shared_buffer is 4GB, there were 200 user connections , the work_mem is 128MB, so they could use mostly 25GB memory,
so shared_buffer(4GB) + user connections(25GB) = 29GB. why did htop RES show 171GB?
I am also wondering if this issue recurs, what information shall I collect, how to figure out which part of pg use most of the memory?
any hint or tool is available?
thanks for your help
James
From: Laurenz Albe <laurenz.albe@cybertec.at>
Sent: Monday, April 20, 2020 5:06
To: jian xu <jamesxu@outlook.com>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
Sent: Monday, April 20, 2020 5:06
To: jian xu <jamesxu@outlook.com>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
On Sun, 2020-04-19 at 13:56 +0000, jian xu wrote:
> My pg version is 10.3, last week my user got error :
>
> OperationalError: (psycopg2.OperationalError) could not fork new process for connection: Cannot allocate memory
> could not fork new process for connection: Cannot allocate memory
>
> when I checked the database, I found checkpoint process used too much memory, htop showed
> VIRT: 171GB
> RES: 171GB
> SHR:4176MB
> MEM%: 70%
> Command: Posgres: checkpointer process
>
> The postgresq.conf is
> Shared_buffers = 4GB
> Work_mem = 128MB
> Maintenance_work_mem = 2GB
> Wal_level = replica
> Wal_buffer = 16MB
> Checkpoint_completion_target = 0.7
>
> How can I troubleshooting this error? Why checkpoint process used so much memory? it sounds like a memory leak issue in checkpointer process.
That is probably a red herring, because the memory reported for the checkpointer
is shared memory that all PostgreSQL processes access. Utilities like "ps"
show shared memory for all processes that access it, so they don't show the truth.
A better value would be the sum of all the Private_* entries in /proc/<pid>/smaps,
which shows only the private memory.
Not that the error does not necessarily mean that you are out of memory.
It might as well indicate that you are out of open file descriptors for the user
(see /proc/<pid>/limits).
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
> My pg version is 10.3, last week my user got error :
>
> OperationalError: (psycopg2.OperationalError) could not fork new process for connection: Cannot allocate memory
> could not fork new process for connection: Cannot allocate memory
>
> when I checked the database, I found checkpoint process used too much memory, htop showed
> VIRT: 171GB
> RES: 171GB
> SHR:4176MB
> MEM%: 70%
> Command: Posgres: checkpointer process
>
> The postgresq.conf is
> Shared_buffers = 4GB
> Work_mem = 128MB
> Maintenance_work_mem = 2GB
> Wal_level = replica
> Wal_buffer = 16MB
> Checkpoint_completion_target = 0.7
>
> How can I troubleshooting this error? Why checkpoint process used so much memory? it sounds like a memory leak issue in checkpointer process.
That is probably a red herring, because the memory reported for the checkpointer
is shared memory that all PostgreSQL processes access. Utilities like "ps"
show shared memory for all processes that access it, so they don't show the truth.
A better value would be the sum of all the Private_* entries in /proc/<pid>/smaps,
which shows only the private memory.
Not that the error does not necessarily mean that you are out of memory.
It might as well indicate that you are out of open file descriptors for the user
(see /proc/<pid>/limits).
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
> On Apr 20, 2020, at 11:52 AM, jian xu <jamesxu@outlook.com> wrote: > > the free memory size was very low Remember that the OS will fill memory with file cache until something else needs it.
Thanks Scott. Do you mean the file cache will be counted in postgres checkpointer memory usage?
The htop RES is 171GB, do you mean most of the memory are used by file cache, and counted as htop RES size?
thanks
James
From: Scott Ribe <scott_ribe@elevated-dev.com>
Sent: Monday, April 20, 2020 14:25
To: jian xu <jamesxu@outlook.com>
Cc: Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
Sent: Monday, April 20, 2020 14:25
To: jian xu <jamesxu@outlook.com>
Cc: Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
> On Apr 20, 2020, at 11:52 AM, jian xu <jamesxu@outlook.com> wrote:
>
> the free memory size was very low
Remember that the OS will fill memory with file cache until something else needs it.
>
> the free memory size was very low
Remember that the OS will fill memory with file cache until something else needs it.
> On Apr 20, 2020, at 1:13 PM, jian xu <jamesxu@outlook.com> wrote: > > Do you mean the file cache will be counted in postgres checkpointer memory usage? No, it won't be counted as PG memory. I just mean that free memory low is not a sign of a problem, when it's file cache.
You said: "the work_mem is 128MB, so they could use mostly 25GB memory,"
But work_mem is sometimes very deceiving based on types of queries executed. You may need many work_mem buffers for a single query. So 8 buffers used = 1GB. So it could conceivable surpass 25GB easily. I would do some explaining on your SQL workload.
Regards,
Michael Vitale
Scott Ribe wrote on 4/20/2020 3:23 PM:
But work_mem is sometimes very deceiving based on types of queries executed. You may need many work_mem buffers for a single query. So 8 buffers used = 1GB. So it could conceivable surpass 25GB easily. I would do some explaining on your SQL workload.
Regards,
Michael Vitale
Scott Ribe wrote on 4/20/2020 3:23 PM:
On Apr 20, 2020, at 1:13 PM, jian xu <jamesxu@outlook.com> wrote: Do you mean the file cache will be counted in postgres checkpointer memory usage?No, it won't be counted as PG memory. I just mean that free memory low is not a sign of a problem, when it's file cache.
Good Point, Thanks Michael. Next time I will check the memory usage of other working processes.
looks like it is hard to troubleshooting the postgresql memory usage problem.
Some memory are shared, some are private, some are physical memory, some are virtual memory.
and the ps command shows the misleading memory usage of postgresql process,
is there any tool show memory usage for each postgresql process more clearly?
thanks,
James
From: MichaelDBA <MichaelDBA@sqlexec.com>
Sent: Monday, April 20, 2020 15:37
To: Scott Ribe <scott_ribe@elevated-dev.com>
Cc: jian xu <jamesxu@outlook.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
Sent: Monday, April 20, 2020 15:37
To: Scott Ribe <scott_ribe@elevated-dev.com>
Cc: jian xu <jamesxu@outlook.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
You said: "the work_mem is 128MB, so they could use mostly 25GB memory,"
But work_mem is sometimes very deceiving based on types of queries executed. You may need many work_mem buffers for a single query. So 8 buffers used = 1GB. So it could conceivable surpass 25GB easily. I would do some explaining on your SQL workload.
Regards,
Michael Vitale
Scott Ribe wrote on 4/20/2020 3:23 PM:
But work_mem is sometimes very deceiving based on types of queries executed. You may need many work_mem buffers for a single query. So 8 buffers used = 1GB. So it could conceivable surpass 25GB easily. I would do some explaining on your SQL workload.
Regards,
Michael Vitale
Scott Ribe wrote on 4/20/2020 3:23 PM:
On Apr 20, 2020, at 1:13 PM, jian xu <jamesxu@outlook.com> wrote: Do you mean the file cache will be counted in postgres checkpointer memory usage?No, it won't be counted as PG memory. I just mean that free memory low is not a sign of a problem, when it's file cache.
hello,
we are seeing this issue again, the checkpoint process is eating up all the memory,
based on the htop, the checkpointer process has
VIRT: 103G
RES: 103G
SHR: 4099M
MEM%: 41%
I use ps_mem.py to check the process,
=================================
Private + Shared = RAM used Program
99.5 GiB + 37.5 MiB = 99.5 GiB postgres
---------------------------------
99.5 GiB
=================================
my pg version is 10.3 , shared_buffer size is 4GB. work_mem 64MB. there are about 120 connections on the db side. most of them are idle. does anyone have idea how to troubleshooting this kind of issue?
thank you
James
From: jian xu <jamesxu@outlook.com>
Sent: Tuesday, April 21, 2020 16:07
To: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>
Cc: Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
Sent: Tuesday, April 21, 2020 16:07
To: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>
Cc: Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
Good Point, Thanks Michael. Next time I will check the memory usage of other working processes.
looks like it is hard to troubleshooting the postgresql memory usage problem.
Some memory are shared, some are private, some are physical memory, some are virtual memory.
and the ps command shows the misleading memory usage of postgresql process,
is there any tool show memory usage for each postgresql process more clearly?
thanks,
James
From: MichaelDBA <MichaelDBA@sqlexec.com>
Sent: Monday, April 20, 2020 15:37
To: Scott Ribe <scott_ribe@elevated-dev.com>
Cc: jian xu <jamesxu@outlook.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
Sent: Monday, April 20, 2020 15:37
To: Scott Ribe <scott_ribe@elevated-dev.com>
Cc: jian xu <jamesxu@outlook.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
You said: "the work_mem is 128MB, so they could use mostly 25GB memory,"
But work_mem is sometimes very deceiving based on types of queries executed. You may need many work_mem buffers for a single query. So 8 buffers used = 1GB. So it could conceivable surpass 25GB easily. I would do some explaining on your SQL workload.
Regards,
Michael Vitale
Scott Ribe wrote on 4/20/2020 3:23 PM:
But work_mem is sometimes very deceiving based on types of queries executed. You may need many work_mem buffers for a single query. So 8 buffers used = 1GB. So it could conceivable surpass 25GB easily. I would do some explaining on your SQL workload.
Regards,
Michael Vitale
Scott Ribe wrote on 4/20/2020 3:23 PM:
On Apr 20, 2020, at 1:13 PM, jian xu <jamesxu@outlook.com> wrote: Do you mean the file cache will be counted in postgres checkpointer memory usage?No, it won't be counted as PG memory. I just mean that free memory low is not a sign of a problem, when it's file cache.
On 2020-Jul-20, jian xu wrote: > we are seeing this issue again, the checkpoint process is eating up all the memory, > based on the htop, the checkpointer process has > VIRT: 103G > RES: 103G > SHR: 4099M > MEM%: 41% > > I use ps_mem.py to check the process, > https://github.com/pixelb/ps_mem > ================================= > Private + Shared = RAM used Program > > 99.5 GiB + 37.5 MiB = 99.5 GiB postgres > --------------------------------- > 99.5 GiB > ================================= > my pg version is 10.3 , shared_buffer size is 4GB. work_mem 64MB. there are about 120 connections on the db side. mostof them are idle. > does anyone have idea how to troubleshooting this kind of issue? I see no reason for checkpointer to use that much memory. Accounting for the whole of shared_buffers is okay (as Laurenz explaijned), which in other installations may account for several gigabytes of memory. But if shared_buffers is 4GB, then 99GB is clearly overboard. I see no explanation for checkpointer using 25x shared_buffers. Maybe looking at /proc/<PID>/smaps would tell you where that memory is going? Have a look there are see if you can figure out some obviously absurd entries. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
jian xu <jamesxu@outlook.com> writes: > we are seeing this issue again, the checkpoint process is eating up all the memory, > based on the htop, the checkpointer process has > VIRT: 103G > RES: 103G > SHR: 4099M hmmm ... > my pg version is 10.3 , Really the *first* thing you ought to do is update to current (10.13) and see if the problem is still there. But if it is, maybe collecting a memory context map in that process would be informative: https://wiki.postgresql.org/wiki/Developer_FAQ#Examining_backend_memory_use Note you may need a debug build for that recipe to work. regards, tom lane
Thanks Tom and Alvaro. I will upgrade it to 10.13 first.
James
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Monday, July 20, 2020 18:34
To: jian xu <jamesxu@outlook.com>
Cc: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
Sent: Monday, July 20, 2020 18:34
To: jian xu <jamesxu@outlook.com>
Cc: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
jian xu <jamesxu@outlook.com> writes:
> we are seeing this issue again, the checkpoint process is eating up all the memory,
> based on the htop, the checkpointer process has
> VIRT: 103G
> RES: 103G
> SHR: 4099M
hmmm ...
> my pg version is 10.3 ,
Really the *first* thing you ought to do is update to current (10.13)
and see if the problem is still there. But if it is, maybe collecting
a memory context map in that process would be informative:
https://wiki.postgresql.org/wiki/Developer_FAQ#Examining_backend_memory_use
Note you may need a debug build for that recipe to work.
regards, tom lane
> we are seeing this issue again, the checkpoint process is eating up all the memory,
> based on the htop, the checkpointer process has
> VIRT: 103G
> RES: 103G
> SHR: 4099M
hmmm ...
> my pg version is 10.3 ,
Really the *first* thing you ought to do is update to current (10.13)
and see if the problem is still there. But if it is, maybe collecting
a memory context map in that process would be informative:
https://wiki.postgresql.org/wiki/Developer_FAQ#Examining_backend_memory_use
Note you may need a debug build for that recipe to work.
regards, tom lane
btw, I checked /proc/<PID>/smaps. it looks like postgres process took most of the memory, the output is like:
===============================================================================
564415881000-56441588f000 rw-p 00857000 08:04 17682620 postgresql/10/3/dist/bin/postgres
Size: 56 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 56 kB
Pss: 56 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 56 kB
Referenced: 56 kB
Anonymous: 56 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
VmFlags: rd wr mr mw me dw ac sd
5644162cf000-565d4c645000 rw-p 00000000 00:00 0 [heap]Size: 105745880 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 105745692 kB
Pss: 105745692 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 105745692 kB
Referenced: 104768404 kB
Anonymous: 105745692 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
VmFlags: rd wr mr mw me ac sd
===============================================================================
however, the shared_buffers size is only 4GB..
postgres=# show shared_buffers ;
thanks, shared_buffers
----------------
4GB
James
From: jian xu <jamesxu@outlook.com>
Sent: Tuesday, July 21, 2020 9:20
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
Sent: Tuesday, July 21, 2020 9:20
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
Thanks Tom and Alvaro. I will upgrade it to 10.13 first.
James
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Monday, July 20, 2020 18:34
To: jian xu <jamesxu@outlook.com>
Cc: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
Sent: Monday, July 20, 2020 18:34
To: jian xu <jamesxu@outlook.com>
Cc: MichaelDBA <MichaelDBA@sqlexec.com>; Scott Ribe <scott_ribe@elevated-dev.com>; Laurenz Albe <laurenz.albe@cybertec.at>; pgsql-admin@postgresql.org <pgsql-admin@postgresql.org>
Subject: Re: checkpoint process use too much memory
jian xu <jamesxu@outlook.com> writes:
> we are seeing this issue again, the checkpoint process is eating up all the memory,
> based on the htop, the checkpointer process has
> VIRT: 103G
> RES: 103G
> SHR: 4099M
hmmm ...
> my pg version is 10.3 ,
Really the *first* thing you ought to do is update to current (10.13)
and see if the problem is still there. But if it is, maybe collecting
a memory context map in that process would be informative:
https://wiki.postgresql.org/wiki/Developer_FAQ#Examining_backend_memory_use
Note you may need a debug build for that recipe to work.
regards, tom lane
> we are seeing this issue again, the checkpoint process is eating up all the memory,
> based on the htop, the checkpointer process has
> VIRT: 103G
> RES: 103G
> SHR: 4099M
hmmm ...
> my pg version is 10.3 ,
Really the *first* thing you ought to do is update to current (10.13)
and see if the problem is still there. But if it is, maybe collecting
a memory context map in that process would be informative:
https://wiki.postgresql.org/wiki/Developer_FAQ#Examining_backend_memory_use
Note you may need a debug build for that recipe to work.
regards, tom lane