Thread: Memory leak during delete with sequential scan

Memory leak during delete with sequential scan

From
Roman Konoval
Date:
Hi,

On one of our production servers postgres backend processes are killed by
linux OOM killer. It runs 9.1.13 version now.

Using the method described here
http://www.depesz.com/2012/06/09/how-much-ram-is-postgresql-using/ I was
able to find out that processes which are killed use a lot of private
memory ~3Gb. It turns out that memory consumption grows when a particular
query is executed namely delete with filtering when sequential scan is
used.

The simplified test which demonstrates this problem.

$ psql -U postgres postgres
psql (9.1.11, server 9.3.5)

postgres=# create table t (i integer, t text);
CREATE TABLE

postgres=# insert into t select v,
'01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789'
|| text(v) from generate_series(1,4000000) v;
INSERT 0 4000000

postgres=# \q

restart postgres

$ psql -U postgres postgres

execute 500 times the following command in the same connection
postgres=# delete from t where t = '';

On my setup of 9.3.5 this causes postgres backend process to consume 130Mb
of private memory which seems is reclaimed only when connection is closed.
On 9.1.13 this consumes about 250Mb.

I was able to find that this problem happens on 9.1.13, 9.2.9 and 9.3.5.
Note that this doesn't happen on latest 9.1.14 - memory usage is about 26
Mb after the above scenario.
On 9.4beta2 the memory consumption grows very much like on 9.3.5 but after
about a minute of connection being IDLE it decreases.

If I add index on the field which is used for filtering private memory
consumption is much smaller (2-10 Mb).

This problem can be reliably reproducible only after restart of postgres.
After some time when some statements are executed on the database the
problem sometimes gone.

Using of a connection pool makes this problem more painful.
My initial intention was to upgrade to 9.3 but it also has this problem.

This looks very much like a bug for me which affects latest stable version
(9.3.5) but I'm not sure that my method to diagnose the problem is correct.
Should I fine a bug regarding this?

Regards,
Roman Konoval

Re: Memory leak during delete with sequential scan

From
Tom Lane
Date:
Roman Konoval <rkonoval@gmail.com> writes:
> The simplified test which demonstrates this problem.

> $ psql -U postgres postgres
> psql (9.1.11, server 9.3.5)

> postgres=# create table t (i integer, t text);
> CREATE TABLE

> postgres=# insert into t select v,
> '01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789'
> || text(v) from generate_series(1,4000000) v;
> INSERT 0 4000000

> postgres=# \q

> restart postgres

> $ psql -U postgres postgres

> execute 500 times the following command in the same connection
> postgres=# delete from t where t = '';

I see absolutely no leak from this example in any existing release branch.

> This problem can be reliably reproducible only after restart of postgres.

That sounds suspiciously like what you are counting is a process's
accesses to shared memory.  You did not say what shared_buffers setting
you're using, but if the "leak" tops out at something close to your
shared_buffers setting then that's almost certainly what it is.
In Linux systems you should generally be looking at RES minus SHR
not just RES to determine a process' private memory.

            regards, tom lane

Re: Memory leak during delete with sequential scan

From
Roman Konoval
Date:
On Fri, Sep 12, 2014 at 5:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Roman Konoval <rkonoval@gmail.com> writes:
>
>
> > This problem can be reliably reproducible only after restart of postgres.
>
> That sounds suspiciously like what you are counting is a process's
> accesses to shared memory.  You did not say what shared_buffers setting
> you're using, but if the "leak" tops out at something close to your
> shared_buffers setting then that's almost certainly what it is.
> In Linux systems you should generally be looking at RES minus SHR
> not just RES to determine a process' private memory.
>
>

Thanks for your reply, Tom.

Your guess is correct I think.
I'm using default shared_buffers settings. Different version of postgres
have different default shared_buffers setting that's why I get different
results. And the amount of memory I see is very much correlated with
shared_buffers size:

version  shared_buffers  max private_memory
9.1.13     320Mb               260Mb
9.1.14     24Mb                 26Mb
9.3.5       128Mb               128Mb
9.4beta2 128Mb               128Mb

By private memory here I mean the sum of Private_Dirty and Private_Clean
values for every memory segment in /proc/<pid>/smaps. The main portion of
the memory is mapped to segment associated with file /SYSV0052e2c1. I
suppose this is file used by postgres to share between processes.

This is how it looks for one process:
7fa461c56000-7fa46a8e2000 rw-s 00000000 00:04 32768
 /SYSV0052e2c1 (deleted)
Size:             143920 kB
Rss:              131876 kB
Pss:              130077 kB
Shared_Clean:          0 kB
Shared_Dirty:       3188 kB
Private_Clean:         0 kB
Private_Dirty:    128688 kB
Referenced:       131876 kB
Anonymous:             0 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB

When I run several processed executing the same query the memory is moved
to shared:
7fa461c56000-7fa46a8e2000 rw-s 00000000 00:04 32768
 /SYSV0052e2c1 (deleted)
Size:             143920 kB
Rss:              131876 kB
Pss:               65270 kB
Shared_Clean:          0 kB
Shared_Dirty:     131872 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:       130880 kB
Anonymous:             0 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB

So I was confusing private memory of the process with the portion of shared
memory modified by this process alone.

My understanding now is that private memory of the process is only the one
associated with heap and stack mappings:

7fa470bbf000-7fa470c93000 rw-p 00000000 00:00 0
 [heap]
Size:                848 kB
Rss:                 648 kB
Pss:                 648 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:       648 kB
Referenced:          648 kB
Anonymous:           648 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd wr mr mw me ac
7fffae0f3000-7fffae122000 rw-p 00000000 00:00 0
 [stack]
Size:                192 kB
Rss:                 148 kB
Pss:                  38 kB
Shared_Clean:          0 kB
Shared_Dirty:        128 kB
Private_Clean:         0 kB
Private_Dirty:        20 kB
Referenced:           20 kB
Anonymous:           148 kB
AnonHugePages:         0 kB
Swap:                  0 kB


KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB


                        regards, tom lane
>

Re: Memory leak during delete with sequential scan

From
Tom Lane
Date:
Roman Konoval <rkonoval@gmail.com> writes:
> On Fri, Sep 12, 2014 at 5:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In Linux systems you should generally be looking at RES minus SHR
>> not just RES to determine a process' private memory.

> By private memory here I mean the sum of Private_Dirty and Private_Clean
> values for every memory segment in /proc/<pid>/smaps.

Hm.  I'm not terribly familiar with that API, but the notion that it
*ever* counts shared memory as "Private" sounds pretty bogus from here.
I'd suggest filing a bug against whichever kernel you're using.

            regards, tom lane

Re: Memory leak during delete with sequential scan

From
Roman Konoval
Date:
On Fri, Sep 12, 2014 at 4:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Roman Konoval <rkonoval@gmail.com> writes:
> > By private memory here I mean the sum of Private_Dirty and Private_Clean
> > values for every memory segment in /proc/<pid>/smaps.
>
> Hm.  I'm not terribly familiar with that API, but the notion that it
> *ever* counts shared memory as "Private" sounds pretty bogus from here.
> I'd suggest filing a bug against whichever kernel you're using.
>

This observation is doubled by a side note in this answer
http://unix.stackexchange.com/questions/33381/getting-information-about-a-process-memory-usage-from-proc-pid-smaps


> Note that a "share-able" page is counted as a private mapping until it is
*actually* shared. i.e. if there is only one process currently using libfoo
,
> that library's text section will appear in the process's *private* mappings.
It will be accounted in the shared mappings (and removed from the private
> ones) only if/when another process starts using that library.

It looks like this is by design and for kernel there is distinction between
share-able and shared.

Regards,
Roman Konoval