Thread: [report] memory leaks in COPY FROM on partitioned table
Hello,

In PG11beta2, my backend process gets terminated during COPY FROM of a large text file (544GB) into a partitioned table. The kernel log says the OOM killer sent SIGKILL due to memory pressure; in fact, 63GB of the system's 64GB of physical memory was consumed by the PostgreSQL backend just before the termination.

OOM Killer says:

[69267.687791] Out of memory: Kill process 23881 (postgres) score 977 or sacrifice child
[69267.687860] Killed process 23881 (postgres) total-vm:105210868kB, anon-rss:63372320kB, file-rss:0kB, shmem-rss:126144kB

Configuration is as follows: the 'lineorder' partitioned table has three child tables, hash-partitioned on lo_orderkey (numeric). Each child table has its own tablespace; 'lineorder_p0', for instance, is built on the tablespace 'nvme0', which sits on an independent SSD device.

The query I run is:

copy lineorder from '/opt/nvme0/ssbm/lineorder.tbl' delimiter '|';

So its contents are distributed to the individual child tables based on in-database evaluation of the hash keys.

To track which memory context consumes more memory than expected, I added an elog() to report a dying message before the OOM killer strikes; see "aset-track.patch". It raises a warning message whenever per-memory-context consumption crosses a watermark. It shows 'PortalContext' consuming 25GB at 04:26, growing to 34GB at 05:21, and the backend was terminated at 05:51. It looks to me like something allocates memory outside the per-tuple memory context, but I'm still investigating.

Any ideas?

------------
2018-07-25 04:26:54.096 JST [23881] WARNING:  memory context 'PortalContext' grows up 25769803784 bytes
2018-07-25 04:26:54.096 JST [23881] CONTEXT:  COPY lineorder, line 1610626836: "1610614887|1|18487099|541334|1474684|19980523|3-MEDIUM|0|30|4125930|30528526|2|4043411|82518|8|19980..."
2018-07-25 04:27:07.202 JST [23865] LOG:  checkpoints are occurring too frequently (25 seconds apart)
  :  <snip>
  :
2018-07-25 05:21:22.423 JST [23881] WARNING:  memory context 'PortalContext' grows up 34359738384 bytes
2018-07-25 05:21:22.423 JST [23881] CONTEXT:  COPY lineorder, line 2147497762: "2147498439|7|910553|962168|773580|19971006|1-URGENT|0|46|5658552|38894795|1|5601966|73807|2|19971201..."
  :  <snip>
  :
2018-07-25 05:51:07.264 JST [23837] LOG:  server process (PID 23881) was terminated by signal 9: Killed
2018-07-25 05:51:07.264 JST [23837] DETAIL:  Failed process was running: copy lineorder from '/opt/nvme0/ssbm/lineorder.tbl' delimiter '|';
------------
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>
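For illustration, a minimal sketch of the kind of check aset-track.patch adds; the helper name, the way the running total is obtained, and the watermark step are assumptions for illustration, not the actual patch:

/*
 * Hypothetical sketch, not the actual aset-track.patch: emit a WARNING each
 * time a memory context's running allocation total crosses the next
 * watermark step.  How total_bytes is maintained is left out here.
 */
static void
warn_on_context_growth(MemoryContext cxt, Size total_bytes, Size *next_watermark)
{
	if (total_bytes > *next_watermark)
	{
		elog(WARNING, "memory context '%s' grows up %zu bytes",
			 cxt->name, (size_t) total_bytes);
		*next_watermark = total_bytes + ((Size) 1024 * 1024 * 1024);	/* next 1GB step */
	}
}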
On Tue, Jul 24, 2018 at 09:41:52AM +0900, Kohei KaiGai wrote:
> In PG11beta2, my backend process gets terminated during COPY FROM of
> large text file (544GB) on partitioned table.
> The kernel log says OOM Killer send SIGKILL due to memory pressure.
> In fact, 63GB of system physical 64GB was consumed by the PostgreSQL
> backend just before the termination.

Hmm.  That's not nice.  Let's add an open item.
--
Michael
Further investigation:

CopyFrom() calls ExecFindPartition() to identify the destination child table of the partitioned table. That, in turn, calls get_partition_for_tuple() to get the partition index according to the key value. This invocation is not under the per-tuple context.

In the case of hash partitioning, get_partition_for_tuple() calls the hash function of the key data type, which is hash_numeric in my case. hash_numeric fetches the key value using PG_GETARG_NUMERIC(0), which internally calls pg_detoast_datum(); that may allocate new memory if the varlena datum is not in uncompressed long (32-bit) format.

With the attached patch applied, the PostgreSQL backend process has been running at about 130MB of memory consumption for 20 minutes so far (not finished yet...). Before the patch, its memory consumption grew at about 10MB/sec, and the process was terminated a few hours later.

P.S. I think do_convert_tuple() in ExecFindPartition() and ConvertPartitionTupleSlot() may also allocate memory outside the per-tuple context; however, I could not confirm it yet, because my test case has TupleConversionMap == NULL.

Thanks,

2018-07-24 10:43 GMT+09:00 Michael Paquier <michael@paquier.xyz>:
> On Tue, Jul 24, 2018 at 09:41:52AM +0900, Kohei KaiGai wrote:
>> In PG11beta2, my backend process gets terminated during COPY FROM of
>> large text file (544GB) on partitioned table.
>> The kernel log says OOM Killer send SIGKILL due to memory pressure.
>> In fact, 63GB of system physical 64GB was consumed by the PostgreSQL
>> backend just before the termination.
>
> Hmm.. That's not nice. Let's add an open item.
> --
> Michael
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>
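To make the allocation path concrete, here is a condensed, illustrative fragment (not standalone code, and not a patch) of what happens when hash_numeric fetches its argument; the local variable names are mine:

/*
 * PG_GETARG_NUMERIC(0) is DatumGetNumeric(PG_GETARG_DATUM(0)), and
 * DatumGetNumeric() goes through pg_detoast_datum().  If the key is a
 * compressed or short varlena, pg_detoast_datum() pallocs a detoasted copy
 * in CurrentMemoryContext -- which on this call path was PortalContext, so
 * the copies accumulated for the whole COPY instead of being reset per row.
 */
Datum	d = PG_GETARG_DATUM(0);		/* inside hash_numeric() */
Numeric	key = DatumGetNumeric(d);	/* may palloc a detoasted copy */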
Hi Kaigai-san,

Thanks for the report and the patch.

On 2018/07/24 11:43, Kohei KaiGai wrote:
> Further investigation I did:
>
> CopyFrom() calls ExecFindPartition() to identify the destination child
> table of partitioned table.
> Then, it internally calls get_partition_for_tuple() to get partition
> index according to the key value.
> This invocation is not under the per-tuple context.
>
> In case of hash-partitioning, get_partition_for_tuple() calls
> hash-function of key data type; which is hash_numeric in my case.
> The hash_numeric fetches the key value using PG_GETARG_NUMERIC(0). It
> internally calls pg_detoast_datum() which may allocate new memory if
> varlena datum is not uncompressed long (32bit) format.
>
> Once this patch attached, PostgreSQL backend process has been working
> with about 130MB memory consumption for 20min right now (not finished
> yet...)
> Before the patch applied, its memory consumption grows up about
> 10MB/sec, then terminated a few hours later.
>
> P.S, I think do_convert_tuple() in ExecFindPartition() and
> ConvertPartitionTupleSlot() may also allocate memory out of the
> per-tuple context, however, I could not confirmed yet, because my test
> case has TupleConversionMap == NULL.

Your patch takes care of the allocation happening inside get_partition_for_tuple, but as you mention there might be others in its caller ExecFindPartition. So, I think we should switch to the per-tuple context in ExecFindPartition itself.

When I tried to do that, I discovered that we have to be careful about releasing some of the memory that's allocated in ExecFindPartition ourselves, instead of relying on the reset of the per-tuple context to take care of it. That's because some of the structures that ExecFindPartition assigns the allocated memory to are cleaned up at the end of the query, by which time it's too late to try to release per-tuple memory. So, the patch I ended up with is slightly bigger than simply adding a MemoryContextSwitchTo() call at the beginning of ExecFindPartition. Please find it attached.

Thanks,
Amit
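Condensed to its core, the approach looks roughly like the following sketch (illustrative only; as noted above, the actual patch also has to release some memory explicitly rather than relying solely on the per-tuple reset):

/*
 * In ExecFindPartition(): do the partition-key evaluation and partition
 * lookup under the per-tuple context, so per-row allocations (detoasted
 * keys, etc.) are reclaimed by the next per-tuple reset.
 */
MemoryContext oldcxt;

oldcxt = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));

/* ... FormPartitionKeyDatum() and get_partition_for_tuple() happen here ... */

MemoryContextSwitchTo(oldcxt);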
On 2018-Jul-24, Amit Langote wrote:

> Your patch takes care of allocation happening inside
> get_partition_for_tuple, but as you mention there might be others in its
> caller ExecFindPartition. So, I think we should switch to the per-tuple
> context in ExecFindPartition.

Right, makes sense. Pushed that way.

I also moved the ExecFetchSlotTuple call to happen after the memcxt change, because it seemed to me that it may be possible for tuple_expand to allocate memory (if not, it's not obvious). I also reworded some comments -- hope not to have broken anything too badly there. I also renamed the variable "parent", which confused the heck out of me.

I had conflicts when applying this in master after developing it in pg11, because of some new development there (and my variable rename). I really hope we don't open the pg13 tree as early as we opened the pg12 one ...

> When I tried to do that, I discovered that we have to be careful about
> releasing some of the memory that's allocated in ExecFindPartition
> ourselves instead of relying on the reset of per-tuple context to take
> care of it. That's because some of the structures that ExecFindPartition
> assigns the allocated memory to are cleaned up at the end of the query, by
> when it's too late to try to release per-tuple memory. So, the patch I
> ended up with is slightly bigger than simply adding a
> MemoryContextSwitchTo() call at the beginning of ExecFindPartition.

Yeah, that stuff looks a bit brittle. I wish I had an idea on how to make it less so. Thanks for taking care of that.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-Aug-01, Alvaro Herrera wrote:

> Right, makes sense. Pushed that way.

KaiGai, if you can please confirm that the pushed change fixes your test case, I'd appreciate it.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-08-01 16:38:11 -0400, Alvaro Herrera wrote:
> On 2018-Jul-24, Amit Langote wrote:
>
> > Your patch takes care of allocation happening inside
> > get_partition_for_tuple, but as you mention there might be others in its
> > caller ExecFindPartition. So, I think we should switch to the per-tuple
> > context in ExecFindPartition.
>
> Right, makes sense. Pushed that way.

The buildfarm does not like this one:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2018-08-01%2020%3A40%3A02
(and a lot of other reports from my animals)

================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==================
[New LWP 10463]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: bf regression [local] INSERT '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  slot_getattr (slot=slot@entry=0x5606e413dd90, attnum=1, isnull=isnull@entry=0x7ffcb8c35570) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/access/common/heaptuple.c:1569
1569            if (attnum > HeapTupleHeaderGetNatts(tup))
#0  slot_getattr (slot=slot@entry=0x5606e413dd90, attnum=1, isnull=isnull@entry=0x7ffcb8c35570) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/access/common/heaptuple.c:1569
#1  0x00005606e1fe5744 in FormPartitionKeyDatum (pd=0x5606e4140670, pd=0x5606e4140670, isnull=0x7ffcb8c35590, values=0x7ffcb8c355b0, estate=0x5606e419f8d8, slot=0x5606e413dd90) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/execPartition.c:1077
#2  ExecFindPartition (resultRelInfo=resultRelInfo@entry=0x5606e419fb28, pd=0x5606e4140888, slot=0x5606e413dd90, estate=estate@entry=0x5606e419f8d8) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/execPartition.c:245
#3  0x00005606e2005452 in ExecPrepareTupleRouting (mtstate=mtstate@entry=0x5606e419fc40, estate=estate@entry=0x5606e419f8d8, proute=proute@entry=0x5606e413daf0, targetRelInfo=targetRelInfo@entry=0x5606e419fb28, slot=<optimized out>) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/nodeModifyTable.c:1713
#4  0x00005606e20076ff in ExecModifyTable (pstate=0x5606e419fc40) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/nodeModifyTable.c:2159
#5  0x00005606e1fe008a in ExecProcNode (node=0x5606e419fc40) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/include/executor/executor.h:237
#6  ExecutePlan (execute_once=<optimized out>, dest=0x5606e4365e10, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_INSERT, use_parallel_mode=<optimized out>, planstate=0x5606e419fc40, estate=0x5606e419f8d8) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/execMain.c:1721
#7  standard_ExecutorRun (queryDesc=0x5606e40addf8, direction=<optimized out>, count=0, execute_once=<optimized out>) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/execMain.c:362
#8  0x00005606e2142ff2 in ProcessQuery (plan=<optimized out>, sourceText=0x5606e3ff6868 "INSERT INTO pagg_tab_ml SELECT i % 30, i % 10, to_char(i % 4, 'FM0000') FROM generate_series(0, 29999) i;", params=0x0, queryEnv=0x0, dest=0x5606e4365e10, completionTag=0x7ffcb8c35ac0 "") at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/tcop/pquery.c:161
#9  0x00005606e214326b in PortalRunMulti (portal=portal@entry=0x5606e405cf08, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x5606e4365e10, altdest=altdest@entry=0x5606e4365e10, completionTag=completionTag@entry=0x7ffcb8c35ac0 "") at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/tcop/pquery.c:1286
#10 0x00005606e2143efd in PortalRun (portal=portal@entry=0x5606e405cf08, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x5606e4365e10, altdest=altdest@entry=0x5606e4365e10, completionTag=0x7ffcb8c35ac0 "") at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/tcop/pquery.c:799
#11 0x00005606e213f8fa in exec_simple_query (query_string=0x5606e3ff6868 "INSERT INTO pagg_tab_ml SELECT i % 30, i % 10, to_char(i % 4, 'FM0000') FROM generate_series(0, 29999) i;") at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/tcop/postgres.c:1122
#12 0x00005606e214185f in PostgresMain (argc=<optimized out>, argv=argv@entry=0x5606e4021688, dbname=<optimized out>, username=<optimized out>) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/tcop/postgres.c:4153
#13 0x00005606e1e242cb in BackendRun (port=0x5606e401a530) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4361
#14 BackendStartup (port=0x5606e401a530) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4033
#15 ServerLoop () at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:1706
#16 0x00005606e20c361d in PostmasterMain (argc=8, argv=0x5606e3ff24c0) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:1379
#17 0x00005606e1e25fc3 in main (argc=8, argv=0x5606e3ff24c0) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:228

Greetings,

Andres Freund
On 2018-Aug-01, Andres Freund wrote:

> On 2018-08-01 16:38:11 -0400, Alvaro Herrera wrote:
> > On 2018-Jul-24, Amit Langote wrote:
> >
> > > Your patch takes care of allocation happening inside
> > > get_partition_for_tuple, but as you mention there might be others in its
> > > caller ExecFindPartition. So, I think we should switch to the per-tuple
> > > context in ExecFindPartition.
> >
> > Right, makes sense. Pushed that way.
>
> The buildfarm does not like this one:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2018-08-01%2020%3A40%3A02
> (and a lot of other reports from my animals)

Hmm, variable shadowing ...

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018/08/02 5:38, Alvaro Herrera wrote:
> On 2018-Jul-24, Amit Langote wrote:
>
>> Your patch takes care of allocation happening inside
>> get_partition_for_tuple, but as you mention there might be others in its
>> caller ExecFindPartition. So, I think we should switch to the per-tuple
>> context in ExecFindPartition.
>
> Right, makes sense. Pushed that way.

Thanks.

> I also moved the ExecFetchSlotTuple call to happen after the memcxt
> change, because it seemed to me that it may be possible for tuple_expand
> to allocate memory (if not, it's not obvious).

Oops, you're right.

> I also reworded some comments -- hope not to have broken anything too
> bad there. I also renamed variable "parent", which confused the heck
> out of me.

TBH, "parent" had started to become distracting even for me, who gave it that name to begin with.

> I had conflicts when applying this in master after developing it in
> pg11, because of some new development there (and my variable rename). I
> really hope we don't open the pg13 tree as early as we opened the pg12
> one ...
>
>> When I tried to do that, I discovered that we have to be careful about
>> releasing some of the memory that's allocated in ExecFindPartition
>> ourselves instead of relying on the reset of per-tuple context to take
>> care of it. That's because some of the structures that ExecFindPartition
>> assigns the allocated memory to are cleaned up at the end of the query, by
>> when it's too late to try to release per-tuple memory. So, the patch I
>> ended up with is slightly bigger than simply adding a
>> MemoryContextSwitchTo() call at the beginning of ExecFindPartition.
>
> Yeah, that stuff looks a bit brittle. I wish I had an idea on how to
> make it less so. Thanks for taking care of that.

Just to recap in light of this commit: we cannot do a full ExecDropSingleTupleTableSlot right there in ExecFindPartition, because we may want to use the slot again for the next tuple. The per-tuple memory taken up by the copy of the tuple in the slot would be released by resetting the per-tuple context in which it is (now) allocated, even if we didn't release it ourselves, but we also need to do some bookkeeping, such as setting the slot's tts_shouldFree to false. Hence the explicit ExecClearTuple(), which both frees the memory and does the necessary bookkeeping.

Thanks,
Amit
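In code, that cleanup boils down to something like the following (a simplified sketch, not the exact committed hunk; the slot variable name is illustrative):

/*
 * Clear the routing slot instead of dropping it: the copied tuple it owns
 * is freed now and tts_shouldFree is reset, while the slot itself remains
 * available for the next routed tuple.
 */
if (slot_to_reuse != NULL)
	ExecClearTuple(slot_to_reuse);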
On 2018/08/02 6:03, Alvaro Herrera wrote:
> On 2018-Aug-01, Andres Freund wrote:
>
>> On 2018-08-01 16:38:11 -0400, Alvaro Herrera wrote:
>>> On 2018-Jul-24, Amit Langote wrote:
>>>
>>>> Your patch takes care of allocation happening inside
>>>> get_partition_for_tuple, but as you mention there might be others in its
>>>> caller ExecFindPartition. So, I think we should switch to the per-tuple
>>>> context in ExecFindPartition.
>>>
>>> Right, makes sense. Pushed that way.
>>
>> The buildfarm does not like this one:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2018-08-01%2020%3A40%3A02
>> (and a lot of other reports from my animals)
>
> Hmm, variable shadowing ...

Crap, sorry about forgetting to remove that and thanks for taking care of that.

Thanks,
Amit
On 2018-Aug-02, Amit Langote wrote:

> On 2018/08/02 6:03, Alvaro Herrera wrote:
> > Hmm, variable shadowing ...
>
> Crap, sorry about forgetting to remove that and thanks for taking care of
> that.

I think it was my bug actually, introduced while rebasing to branch master after renaming the variable.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2018-08-02 5:38 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> On 2018-Aug-01, Alvaro Herrera wrote:
>
>> Right, makes sense. Pushed that way.
>
> KaiGai, if you can please confirm that the pushed change fixes your test
> case, I'd appreciate it.
>
Can you wait a few days? I can drop the test dataset and reuse the storage once the benchmark test is over.

--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>
On 2018-Aug-03, Kohei KaiGai wrote:

> 2018-08-02 5:38 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> > On 2018-Aug-01, Alvaro Herrera wrote:
> >
> >> Right, makes sense. Pushed that way.
> >
> > KaiGai, if you can please confirm that the pushed change fixes your test
> > case, I'd appreciate it.
>
> Can you wait for a few days? I can drop the test dataset and reuse the
> storage once benchmark test is over....

Of course -- take your time.

Thanks,

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2018-08-03 12:38 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> On 2018-Aug-03, Kohei KaiGai wrote:
>
>> 2018-08-02 5:38 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
>> > On 2018-Aug-01, Alvaro Herrera wrote:
>> >
>> >> Right, makes sense. Pushed that way.
>> >
>> > KaiGai, if you can please confirm that the pushed change fixes your test
>> > case, I'd appreciate it.
>>
>> Can you wait for a few days? I can drop the test dataset and reuse the
>> storage once benchmark test is over....
>
> Of course -- take your time.
>
I could load the same data (544GB csv, 789GB heap) using COPY FROM successfully.

When I reported the problem, the RSS usage of the PostgreSQL process increased at about 10MB/s, and the OOM killer terminated it after a few hours. Now it consumes about 60MB of RSS at the beginning of COPY FROM and grows very slowly during the COPY FROM execution, reaching about 250MB before completion.

We may have other memory blocks that are not released during execution; however, I could not identify whether this is really a memory leak or not, nor whose job it is to release them. It may be an idea to add debug code that raises a notice when a MemoryContext allocates more than some threshold.

Thanks,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>
On 2018-Aug-04, Kohei KaiGai wrote:

> I could load the same data (544GB csv, 789GB heap) using COPY FROM
> successfully.
> When I reported the problem, rss usage of postgresql process increased
> about 10MB/s ratio, then OOM killer eliminated after a few hours.

OK, I think we can consider this particular bug closed, then.

> Now, it consumed about 60MB rss at the beginning of COPY FROM, and it
> grows up very slowly during the COPY FROM execution, then grew up to
> 250MB before completion.
> We may have another memory blocks which are not released during
> execution, however,
> I could not identify whether it is really memory leaking or not, and
> who's jobs.

Most likely, this is a different memory leak.

I suggest that one way to track this down is to first figure out *which* context is the one growing, which you can see by running MemoryContextStats a few times and noting any meaningful differences. Then we can try to narrow down what is allocating stuff in that context.

> It may be an idea to put a debug code that raises a notice when
> MemoryContext allocates more than the threshold.

I don't think this is really practical, because whatever threshold we set, there would be some corner-case scenario where the threshold is legitimately crossed, and some memory leak scenarios that don't cross any threshold.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
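For reference, such a snapshot can be taken either from a debugger without any code change, or by dropping a call temporarily into a suspect code path; MemoryContextStats() prints the whole context hierarchy to stderr (and hence the server log):

/*
 * Dump statistics for all memory contexts.  Also callable from gdb attached
 * to the backend:  call MemoryContextStats(TopMemoryContext)
 */
MemoryContextStats(TopMemoryContext);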
2018-08-06 1:50 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
>> Now, it consumed about 60MB rss at the beginning of COPY FROM, and it
>> grows up very slowly during the COPY FROM execution, then grew up to
>> 250MB before completion.
>> We may have another memory blocks which are not released during
>> execution, however,
>> I could not identify whether it is really memory leaking or not, and
>> who's jobs.
>
> Most likely, this is a different memory leak.
>
> I sugges that one way to track this down is first figure out *which*
> context is the one growing, which you can see by running
> MemoryContextStats a few times and noting for meaningful differences.
> Then we can try to narrow down what is allocating stuff in that context.
>
Yes, but the hardest part is identifying which memory context is growing over time. Once we can identify it, MemoryContextStats() tells us the name of the problematic context and the details.

Of course, my observation above is based only on the RSS usage of the PostgreSQL process. Physical page allocation can also grow simply through page faults on virtual address space that was allocated legitimately.

>> It may be an idea to put a debug code that raises a notice when
>> MemoryContext allocates more than the threshold.
>
> I don't think this is really practical, because whatever the threshold
> we set, there would be some corner-case scenario where the threshold is
> legitimately crossed. And some memory leak scenarios that don't cross
> any thresholds.
>
I assume this threshold would be configurable by GUC, and disabled by default. Once a user finds a suspicious memory usage increase, they can set a threshold value. In the case above, we might be able to see something around a 120MB threshold.

Thanks,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>
On 2018-Aug-06, Kohei KaiGai wrote:

> 2018-08-06 1:50 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> >> Now, it consumed about 60MB rss at the beginning of COPY FROM, and it
> >> grows up very slowly during the COPY FROM execution, then grew up to
> >> 250MB before completion.
> >> We may have another memory blocks which are not released during
> >> execution, however,
> >> I could not identify whether it is really memory leaking or not, and
> >> who's jobs.
> >
> > Most likely, this is a different memory leak.
> >
> > I sugges that one way to track this down is first figure out *which*
> > context is the one growing, which you can see by running
> > MemoryContextStats a few times and noting for meaningful differences.
> > Then we can try to narrow down what is allocating stuff in that context.
>
> Yes, but the hardest is identification of which memory context is growing
> up time by time. Once we could identify, MemoryContextStats() tells us
> the name of problematic context and details.

Well, I was thinking you'd call MemCxtStats on TopMemoryContext and observe changes in the whole hierarchy. However ...

> Of course, above my observation is just based on rss usage of postgresql.
> It can increase physical page allocation by page fault on the virtual address
> space correctly allocated.

... this is a good point too, and I'm not sure to what extent this problem is fixable.

> >> It may be an idea to put a debug code that raises a notice when
> >> MemoryContext allocates more than the threshold.
> >
> > I don't think this is really practical, because whatever the threshold
> > we set, there would be some corner-case scenario where the threshold is
> > legitimately crossed. And some memory leak scenarios that don't cross
> > any thresholds.
>
> I assume this threshold is configurable by GUC, and disabled on the default.
> Once a user found suspicious memory usage increase, we can set a threshold
> value. In above case, we may be able to see something around 120MB threshold.

Okay. I suppose you'd want to improve traceability of allocations in some more general way, but I think I understand your point about the threshold. It seems overly specific, but maybe it's okay.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services