Thread: [report] memory leaks in COPY FROM on partitioned table
Hello,

In PG11beta2, my backend process gets terminated during COPY FROM of a large text file (544GB) into a partitioned table. The kernel log says the OOM killer sent SIGKILL due to memory pressure; in fact, 63GB of the system's 64GB of physical memory was consumed by the PostgreSQL backend just before the termination.

OOM Killer says:

[69267.687791] Out of memory: Kill process 23881 (postgres) score 977 or sacrifice child
[69267.687860] Killed process 23881 (postgres) total-vm:105210868kB, anon-rss:63372320kB, file-rss:0kB, shmem-rss:126144kB

Configuration is as follows: the 'lineorder' partitioned table has three child tables, hash-partitioned on lo_orderkey (numeric). Each child table has its own tablespace; 'lineorder_p0', for instance, is built on the tablespace 'nvme0', which sits on an independent SSD device.

The query I run is:

copy lineorder from '/opt/nvme0/ssbm/lineorder.tbl' delimiter '|';

So its contents are distributed to the individual child tables based on in-database evaluation of the hash keys.

To track which memory context consumes more memory than expected, I added an elog() to report a dying message before the OOM killer strikes; see "aset-track.patch". It raises a warning message whenever per-memory-context consumption crosses a watermark. It shows 'PortalContext' consuming 25GB at 04:26, growing to 34GB at 05:21, and the backend was terminated at 05:51. It looks to me like something allocates memory outside the per-tuple memory context, but I'm still investigating.

Any ideas?

------------
2018-07-25 04:26:54.096 JST [23881] WARNING:  memory context 'PortalContext' grows up 25769803784 bytes
2018-07-25 04:26:54.096 JST [23881] CONTEXT:  COPY lineorder, line 1610626836: "1610614887|1|18487099|541334|1474684|19980523|3-MEDIUM|0|30|4125930|30528526|2|4043411|82518|8|19980..."
2018-07-25 04:27:07.202 JST [23865] LOG:  checkpoints are occurring too frequently (25 seconds apart)
  :  <snip>
  :
2018-07-25 05:21:22.423 JST [23881] WARNING:  memory context 'PortalContext' grows up 34359738384 bytes
2018-07-25 05:21:22.423 JST [23881] CONTEXT:  COPY lineorder, line 2147497762: "2147498439|7|910553|962168|773580|19971006|1-URGENT|0|46|5658552|38894795|1|5601966|73807|2|19971201..."
  :  <snip>
  :
2018-07-25 05:51:07.264 JST [23837] LOG:  server process (PID 23881) was terminated by signal 9: Killed
2018-07-25 05:51:07.264 JST [23837] DETAIL:  Failed process was running: copy lineorder from '/opt/nvme0/ssbm/lineorder.tbl' delimiter '|';
------------
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>
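For illustration, a minimal sketch of the kind of check aset-track.patch adds; the helper name, the way the running total is obtained, and the watermark step are assumptions for illustration, not the actual patch:

/*
 * Hypothetical sketch, not the actual aset-track.patch: emit a WARNING each
 * time a memory context's running allocation total crosses the next
 * watermark step.  How total_bytes is maintained is left out here.
 */
static void
warn_on_context_growth(MemoryContext cxt, Size total_bytes, Size *next_watermark)
{
	if (total_bytes > *next_watermark)
	{
		elog(WARNING, "memory context '%s' grows up %zu bytes",
			 cxt->name, (size_t) total_bytes);
		*next_watermark = total_bytes + ((Size) 1024 * 1024 * 1024);	/* next 1GB step */
	}
}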
On Tue, Jul 24, 2018 at 09:41:52AM +0900, Kohei KaiGai wrote:
> In PG11beta2, my backend process gets terminated during COPY FROM of
> large text file (544GB) on partitioned table.
> The kernel log says OOM Killer send SIGKILL due to memory pressure.
> In fact, 63GB of system physical 64GB was consumed by the PostgreSQL
> backend just before the termination.

Hmm.  That's not nice.  Let's add an open item.
--
Michael
Further investigation:

CopyFrom() calls ExecFindPartition() to identify the destination child table of the partitioned table. That, in turn, calls get_partition_for_tuple() to get the partition index according to the key value. This invocation is not under the per-tuple context.

In the case of hash partitioning, get_partition_for_tuple() calls the hash function of the key data type, which is hash_numeric in my case. hash_numeric fetches the key value using PG_GETARG_NUMERIC(0), which internally calls pg_detoast_datum(); that may allocate new memory if the varlena datum is not in uncompressed long (32-bit) format.

With the attached patch applied, the PostgreSQL backend process has been running at about 130MB of memory consumption for 20 minutes so far (not finished yet...). Before the patch, its memory consumption grew at about 10MB/sec, and the process was terminated a few hours later.

P.S. I think do_convert_tuple() in ExecFindPartition() and ConvertPartitionTupleSlot() may also allocate memory outside the per-tuple context; however, I could not confirm it yet, because my test case has TupleConversionMap == NULL.

Thanks,

2018-07-24 10:43 GMT+09:00 Michael Paquier <michael@paquier.xyz>:
> On Tue, Jul 24, 2018 at 09:41:52AM +0900, Kohei KaiGai wrote:
>> In PG11beta2, my backend process gets terminated during COPY FROM of
>> large text file (544GB) on partitioned table.
>> The kernel log says OOM Killer send SIGKILL due to memory pressure.
>> In fact, 63GB of system physical 64GB was consumed by the PostgreSQL
>> backend just before the termination.
>
> Hmm.. That's not nice. Let's add an open item.
> --
> Michael
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>
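To make the allocation path concrete, here is a condensed, illustrative fragment (not standalone code, and not a patch) of what happens when hash_numeric fetches its argument; the local variable names are mine:

/*
 * PG_GETARG_NUMERIC(0) is DatumGetNumeric(PG_GETARG_DATUM(0)), and
 * DatumGetNumeric() goes through pg_detoast_datum().  If the key is a
 * compressed or short varlena, pg_detoast_datum() pallocs a detoasted copy
 * in CurrentMemoryContext -- which on this call path was PortalContext, so
 * the copies accumulated for the whole COPY instead of being reset per row.
 */
Datum	d = PG_GETARG_DATUM(0);		/* inside hash_numeric() */
Numeric	key = DatumGetNumeric(d);	/* may palloc a detoasted copy */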
Hi Kaigai-san,

Thanks for the report and the patch.

On 2018/07/24 11:43, Kohei KaiGai wrote:
> Further investigation I did:
>
> CopyFrom() calls ExecFindPartition() to identify the destination child
> table of partitioned table.
> Then, it internally calls get_partition_for_tuple() to get partition
> index according to the key value.
> This invocation is not under the per-tuple context.
>
> In case of hash-partitioning, get_partition_for_tuple() calls
> hash-function of key data type; which is hash_numeric in my case.
> The hash_numeric fetches the key value using PG_GETARG_NUMERIC(0). It
> internally calls pg_detoast_datum() which may allocate new memory if
> varlena datum is not uncompressed long (32bit) format.
>
> Once this patch attached, PostgreSQL backend process has been working
> with about 130MB memory consumption for 20min right now (not finished
> yet...)
> Before the patch applied, its memory consumption grows up about
> 10MB/sec, then terminated a few hours later.
>
> P.S, I think do_convert_tuple() in ExecFindPartition() and
> ConvertPartitionTupleSlot() may also allocate memory out of the
> per-tuple context, however, I could not confirmed yet, because my test
> case has TupleConversionMap == NULL.

Your patch takes care of the allocation happening inside get_partition_for_tuple, but as you mention there might be others in its caller ExecFindPartition. So, I think we should switch to the per-tuple context in ExecFindPartition itself.

When I tried to do that, I discovered that we have to be careful about releasing some of the memory that's allocated in ExecFindPartition ourselves, instead of relying on the reset of the per-tuple context to take care of it. That's because some of the structures that ExecFindPartition assigns the allocated memory to are cleaned up at the end of the query, by which time it's too late to try to release per-tuple memory. So, the patch I ended up with is slightly bigger than simply adding a MemoryContextSwitchTo() call at the beginning of ExecFindPartition. Please find it attached.

Thanks,
Amit
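Condensed to its core, the approach looks roughly like the following sketch (illustrative only; as noted above, the actual patch also has to release some memory explicitly rather than relying solely on the per-tuple reset):

/*
 * In ExecFindPartition(): do the partition-key evaluation and partition
 * lookup under the per-tuple context, so per-row allocations (detoasted
 * keys, etc.) are reclaimed by the next per-tuple reset.
 */
MemoryContext oldcxt;

oldcxt = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));

/* ... FormPartitionKeyDatum() and get_partition_for_tuple() happen here ... */

MemoryContextSwitchTo(oldcxt);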
On 2018-Jul-24, Amit Langote wrote:

> Your patch takes care of allocation happening inside
> get_partition_for_tuple, but as you mention there might be others in its
> caller ExecFindPartition. So, I think we should switch to the per-tuple
> context in ExecFindPartition.

Right, makes sense. Pushed that way.

I also moved the ExecFetchSlotTuple call to happen after the memcxt change, because it seemed to me that it may be possible for tuple_expand to allocate memory (if not, it's not obvious). I also reworded some comments -- hope not to have broken anything too badly there. I also renamed the variable "parent", which confused the heck out of me.

I had conflicts when applying this in master after developing it in pg11, because of some new development there (and my variable rename). I really hope we don't open the pg13 tree as early as we opened the pg12 one ...

> When I tried to do that, I discovered that we have to be careful about
> releasing some of the memory that's allocated in ExecFindPartition
> ourselves instead of relying on the reset of per-tuple context to take
> care of it. That's because some of the structures that ExecFindPartition
> assigns the allocated memory to are cleaned up at the end of the query, by
> when it's too late to try to release per-tuple memory. So, the patch I
> ended up with is slightly bigger than simply adding a
> MemoryContextSwitchTo() call at the beginning of ExecFindPartition.

Yeah, that stuff looks a bit brittle. I wish I had an idea on how to make it less so. Thanks for taking care of that.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-Aug-01, Alvaro Herrera wrote:

> Right, makes sense. Pushed that way.

KaiGai, if you can please confirm that the pushed change fixes your test case, I'd appreciate it.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-08-01 16:38:11 -0400, Alvaro Herrera wrote:
> On 2018-Jul-24, Amit Langote wrote:
>
> > Your patch takes care of allocation happening inside
> > get_partition_for_tuple, but as you mention there might be others in its
> > caller ExecFindPartition. So, I think we should switch to the per-tuple
> > context in ExecFindPartition.
>
> Right, makes sense. Pushed that way.

The buildfarm does not like this one:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2018-08-01%2020%3A40%3A02
(and a lot of other reports from my animals)

================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==================
[New LWP 10463]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: bf regression [local] INSERT '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  slot_getattr (slot=slot@entry=0x5606e413dd90, attnum=1, isnull=isnull@entry=0x7ffcb8c35570) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/access/common/heaptuple.c:1569
1569            if (attnum > HeapTupleHeaderGetNatts(tup))
#0  slot_getattr (slot=slot@entry=0x5606e413dd90, attnum=1, isnull=isnull@entry=0x7ffcb8c35570) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/access/common/heaptuple.c:1569
#1  0x00005606e1fe5744 in FormPartitionKeyDatum (pd=0x5606e4140670, pd=0x5606e4140670, isnull=0x7ffcb8c35590, values=0x7ffcb8c355b0, estate=0x5606e419f8d8, slot=0x5606e413dd90) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/execPartition.c:1077
#2  ExecFindPartition (resultRelInfo=resultRelInfo@entry=0x5606e419fb28, pd=0x5606e4140888, slot=0x5606e413dd90, estate=estate@entry=0x5606e419f8d8) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/execPartition.c:245
#3  0x00005606e2005452 in ExecPrepareTupleRouting (mtstate=mtstate@entry=0x5606e419fc40, estate=estate@entry=0x5606e419f8d8, proute=proute@entry=0x5606e413daf0, targetRelInfo=targetRelInfo@entry=0x5606e419fb28, slot=<optimized out>) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/nodeModifyTable.c:1713
#4  0x00005606e20076ff in ExecModifyTable (pstate=0x5606e419fc40) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/nodeModifyTable.c:2159
#5  0x00005606e1fe008a in ExecProcNode (node=0x5606e419fc40) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/include/executor/executor.h:237
#6  ExecutePlan (execute_once=<optimized out>, dest=0x5606e4365e10, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_INSERT, use_parallel_mode=<optimized out>, planstate=0x5606e419fc40, estate=0x5606e419f8d8) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/execMain.c:1721
#7  standard_ExecutorRun (queryDesc=0x5606e40addf8, direction=<optimized out>, count=0, execute_once=<optimized out>) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/executor/execMain.c:362
#8  0x00005606e2142ff2 in ProcessQuery (plan=<optimized out>, sourceText=0x5606e3ff6868 "INSERT INTO pagg_tab_ml SELECT i % 30, i % 10, to_char(i % 4, 'FM0000') FROM generate_series(0, 29999) i;", params=0x0, queryEnv=0x0, dest=0x5606e4365e10, completionTag=0x7ffcb8c35ac0 "") at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/tcop/pquery.c:161
#9  0x00005606e214326b in PortalRunMulti (portal=portal@entry=0x5606e405cf08, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x5606e4365e10, altdest=altdest@entry=0x5606e4365e10, completionTag=completionTag@entry=0x7ffcb8c35ac0 "") at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/tcop/pquery.c:1286
#10 0x00005606e2143efd in PortalRun (portal=portal@entry=0x5606e405cf08, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x5606e4365e10, altdest=altdest@entry=0x5606e4365e10, completionTag=0x7ffcb8c35ac0 "") at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/tcop/pquery.c:799
#11 0x00005606e213f8fa in exec_simple_query (query_string=0x5606e3ff6868 "INSERT INTO pagg_tab_ml SELECT i % 30, i % 10, to_char(i % 4, 'FM0000') FROM generate_series(0, 29999) i;") at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/tcop/postgres.c:1122
#12 0x00005606e214185f in PostgresMain (argc=<optimized out>, argv=argv@entry=0x5606e4021688, dbname=<optimized out>, username=<optimized out>) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/tcop/postgres.c:4153
#13 0x00005606e1e242cb in BackendRun (port=0x5606e401a530) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4361
#14 BackendStartup (port=0x5606e401a530) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4033
#15 ServerLoop () at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:1706
#16 0x00005606e20c361d in PostmasterMain (argc=8, argv=0x5606e3ff24c0) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:1379
#17 0x00005606e1e25fc3 in main (argc=8, argv=0x5606e3ff24c0) at /home/bf/build/buildfarm-desmoxytes/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:228

Greetings,

Andres Freund
On 2018-Aug-01, Andres Freund wrote:

> On 2018-08-01 16:38:11 -0400, Alvaro Herrera wrote:
> > On 2018-Jul-24, Amit Langote wrote:
> >
> > > Your patch takes care of allocation happening inside
> > > get_partition_for_tuple, but as you mention there might be others in its
> > > caller ExecFindPartition. So, I think we should switch to the per-tuple
> > > context in ExecFindPartition.
> >
> > Right, makes sense. Pushed that way.
>
> The buildfarm does not like this one:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2018-08-01%2020%3A40%3A02
> (and a lot of other reports from my animals)

Hmm, variable shadowing ...

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018/08/02 5:38, Alvaro Herrera wrote:
> On 2018-Jul-24, Amit Langote wrote:
>
>> Your patch takes care of allocation happening inside
>> get_partition_for_tuple, but as you mention there might be others in its
>> caller ExecFindPartition. So, I think we should switch to the per-tuple
>> context in ExecFindPartition.
>
> Right, makes sense. Pushed that way.

Thanks.

> I also moved the ExecFetchSlotTuple call to happen after the memcxt
> change, because it seemed to me that it may be possible for tuple_expand
> to allocate memory (if not, it's not obvious).

Oops, you're right.

> I also reworded some comments -- hope not to have broken anything too
> bad there. I also renamed variable "parent", which confused the heck
> out of me.

TBH, "parent" had started to become distracting even for me, who gave it that name to begin with.

> I had conflicts when applying this in master after developing it in
> pg11, because of some new development there (and my variable rename). I
> really hope we don't open the pg13 tree as early as we opened the pg12
> one ...
>
>> When I tried to do that, I discovered that we have to be careful about
>> releasing some of the memory that's allocated in ExecFindPartition
>> ourselves instead of relying on the reset of per-tuple context to take
>> care of it. That's because some of the structures that ExecFindPartition
>> assigns the allocated memory to are cleaned up at the end of the query, by
>> when it's too late to try to release per-tuple memory. So, the patch I
>> ended up with is slightly bigger than simply adding a
>> MemoryContextSwitchTo() call at the beginning of ExecFindPartition.
>
> Yeah, that stuff looks a bit brittle. I wish I had an idea on how to
> make it less so. Thanks for taking care of that.

Just to recap in light of this commit: we cannot do a full ExecDropSingleTupleTableSlot right there in ExecFindPartition, because we may want to use the slot again for the next tuple. The per-tuple memory taken up by the copy of the tuple in the slot would be released by resetting the per-tuple context in which it is (now) allocated, even if we didn't release it ourselves, but we also need to do some bookkeeping, such as setting the slot's tts_shouldFree to false. Hence the explicit ExecClearTuple(), which both frees the memory and does the necessary bookkeeping.

Thanks,
Amit
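In code, that cleanup boils down to something like the following (a simplified sketch, not the exact committed hunk; the slot variable name is illustrative):

/*
 * Clear the routing slot instead of dropping it: the copied tuple it owns
 * is freed now and tts_shouldFree is reset, while the slot itself remains
 * available for the next routed tuple.
 */
if (slot_to_reuse != NULL)
	ExecClearTuple(slot_to_reuse);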
On 2018/08/02 6:03, Alvaro Herrera wrote:
> On 2018-Aug-01, Andres Freund wrote:
>
>> On 2018-08-01 16:38:11 -0400, Alvaro Herrera wrote:
>>> On 2018-Jul-24, Amit Langote wrote:
>>>
>>>> Your patch takes care of allocation happening inside
>>>> get_partition_for_tuple, but as you mention there might be others in its
>>>> caller ExecFindPartition. So, I think we should switch to the per-tuple
>>>> context in ExecFindPartition.
>>>
>>> Right, makes sense. Pushed that way.
>>
>> The buildfarm does not like this one:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2018-08-01%2020%3A40%3A02
>> (and a lot of other reports from my animals)
>
> Hmm, variable shadowing ...

Crap, sorry about forgetting to remove that and thanks for taking care of that.

Thanks,
Amit
On 2018-Aug-02, Amit Langote wrote:

> On 2018/08/02 6:03, Alvaro Herrera wrote:
> > Hmm, variable shadowing ...
>
> Crap, sorry about forgetting to remove that and thanks for taking care of
> that.

I think it was my bug actually, introduced while rebasing to branch master after renaming the variable.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2018-08-02 5:38 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> On 2018-Aug-01, Alvaro Herrera wrote:
>
>> Right, makes sense. Pushed that way.
>
> KaiGai, if you can please confirm that the pushed change fixes your test
> case, I'd appreciate it.
>
Can you wait a few days? I can drop the test dataset and reuse the storage once the benchmark test is over.

--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>
On 2018-Aug-03, Kohei KaiGai wrote:

> 2018-08-02 5:38 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> > On 2018-Aug-01, Alvaro Herrera wrote:
> >
> >> Right, makes sense. Pushed that way.
> >
> > KaiGai, if you can please confirm that the pushed change fixes your test
> > case, I'd appreciate it.
>
> Can you wait for a few days? I can drop the test dataset and reuse the
> storage once benchmark test is over....

Of course -- take your time.

Thanks,

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2018-08-03 12:38 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> On 2018-Aug-03, Kohei KaiGai wrote:
>
>> 2018-08-02 5:38 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
>> > On 2018-Aug-01, Alvaro Herrera wrote:
>> >
>> >> Right, makes sense. Pushed that way.
>> >
>> > KaiGai, if you can please confirm that the pushed change fixes your test
>> > case, I'd appreciate it.
>>
>> Can you wait for a few days? I can drop the test dataset and reuse the
>> storage once benchmark test is over....
>
> Of course -- take your time.
>
I could load the same data (544GB csv, 789GB heap) using COPY FROM successfully.

When I reported the problem, the RSS usage of the PostgreSQL process increased at about 10MB/s, and the OOM killer terminated it after a few hours. Now it consumes about 60MB of RSS at the beginning of COPY FROM and grows very slowly during the COPY FROM execution, reaching about 250MB before completion.

We may have other memory blocks that are not released during execution; however, I could not identify whether this is really a memory leak or not, nor whose job it is to release them. It may be an idea to add debug code that raises a notice when a MemoryContext allocates more than some threshold.

Thanks,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>
On 2018-Aug-04, Kohei KaiGai wrote:

> I could load the same data (544GB csv, 789GB heap) using COPY FROM
> successfully.
> When I reported the problem, rss usage of postgresql process increased
> about 10MB/s ratio, then OOM killer eliminated after a few hours.

OK, I think we can consider this particular bug closed, then.

> Now, it consumed about 60MB rss at the beginning of COPY FROM, and it
> grows up very slowly during the COPY FROM execution, then grew up to
> 250MB before completion.
> We may have another memory blocks which are not released during
> execution, however,
> I could not identify whether it is really memory leaking or not, and
> who's jobs.

Most likely, this is a different memory leak.

I suggest that one way to track this down is to first figure out *which* context is the one growing, which you can see by running MemoryContextStats a few times and noting any meaningful differences. Then we can try to narrow down what is allocating stuff in that context.

> It may be an idea to put a debug code that raises a notice when
> MemoryContext allocates more than the threshold.

I don't think this is really practical, because whatever threshold we set, there would be some corner-case scenario where the threshold is legitimately crossed, and some memory leak scenarios that don't cross any threshold.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
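For reference, such a snapshot can be taken either from a debugger without any code change, or by dropping a call temporarily into a suspect code path; MemoryContextStats() prints the whole context hierarchy to stderr (and hence the server log):

/*
 * Dump statistics for all memory contexts.  Also callable from gdb attached
 * to the backend:  call MemoryContextStats(TopMemoryContext)
 */
MemoryContextStats(TopMemoryContext);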
2018-08-06 1:50 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
>> Now, it consumed about 60MB rss at the beginning of COPY FROM, and it
>> grows up very slowly during the COPY FROM execution, then grew up to
>> 250MB before completion.
>> We may have another memory blocks which are not released during
>> execution, however,
>> I could not identify whether it is really memory leaking or not, and
>> who's jobs.
>
> Most likely, this is a different memory leak.
>
> I sugges that one way to track this down is first figure out *which*
> context is the one growing, which you can see by running
> MemoryContextStats a few times and noting for meaningful differences.
> Then we can try to narrow down what is allocating stuff in that context.
>
Yes, but the hardest part is identifying which memory context is growing over time. Once we can identify it, MemoryContextStats() tells us the name of the problematic context and the details.

Of course, my observation above is based only on the RSS usage of the PostgreSQL process. Physical page allocation can also grow simply through page faults on virtual address space that was allocated legitimately.

>> It may be an idea to put a debug code that raises a notice when
>> MemoryContext allocates more than the threshold.
>
> I don't think this is really practical, because whatever the threshold
> we set, there would be some corner-case scenario where the threshold is
> legitimately crossed. And some memory leak scenarios that don't cross
> any thresholds.
>
I assume this threshold would be configurable by GUC, and disabled by default. Once a user finds a suspicious memory usage increase, they can set a threshold value. In the case above, we might be able to see something around a 120MB threshold.

Thanks,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>
On 2018-Aug-06, Kohei KaiGai wrote:

> 2018-08-06 1:50 GMT+09:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
> >> Now, it consumed about 60MB rss at the beginning of COPY FROM, and it
> >> grows up very slowly during the COPY FROM execution, then grew up to
> >> 250MB before completion.
> >> We may have another memory blocks which are not released during
> >> execution, however,
> >> I could not identify whether it is really memory leaking or not, and
> >> who's jobs.
> >
> > Most likely, this is a different memory leak.
> >
> > I sugges that one way to track this down is first figure out *which*
> > context is the one growing, which you can see by running
> > MemoryContextStats a few times and noting for meaningful differences.
> > Then we can try to narrow down what is allocating stuff in that context.
>
> Yes, but the hardest is identification of which memory context is growing
> up time by time. Once we could identify, MemoryContextStats() tells us
> the name of problematic context and details.

Well, I was thinking you'd call MemCxtStats on TopMemoryContext and observe changes in the whole hierarchy. However ...

> Of course, above my observation is just based on rss usage of postgresql.
> It can increase physical page allocation by page fault on the virtual address
> space correctly allocated.

... this is a good point too, and I'm not sure to what extent this problem is fixable.

> >> It may be an idea to put a debug code that raises a notice when
> >> MemoryContext allocates more than the threshold.
> >
> > I don't think this is really practical, because whatever the threshold
> > we set, there would be some corner-case scenario where the threshold is
> > legitimately crossed. And some memory leak scenarios that don't cross
> > any thresholds.
>
> I assume this threshold is configurable by GUC, and disabled on the default.
> Once a user found suspicious memory usage increase, we can set a threshold
> value. In above case, we may be able to see something around 120MB threshold.

Okay. I suppose you'd want to improve traceability of allocations in some more general way, but I think I understand your point about the threshold. It seems overly specific, but maybe it's okay.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services