Thread: Memory leak during delete with sequential scan
Hi, On one of our production servers postgres backend processes are killed by linux OOM killer. It runs 9.1.13 version now. Using the method described here http://www.depesz.com/2012/06/09/how-much-ram-is-postgresql-using/ I was able to find out that processes which are killed use a lot of private memory ~3Gb. It turns out that memory consumption grows when a particular query is executed namely delete with filtering when sequential scan is used. The simplified test which demonstrates this problem. $ psql -U postgres postgres psql (9.1.11, server 9.3.5) postgres=# create table t (i integer, t text); CREATE TABLE postgres=# insert into t select v, '01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789' || text(v) from generate_series(1,4000000) v; INSERT 0 4000000 postgres=# \q restart postgres $ psql -U postgres postgres execute 500 times the following command in the same connection postgres=# delete from t where t = ''; On my setup of 9.3.5 this causes postgres backend process to consume 130Mb of private memory which seems is reclaimed only when connection is closed. On 9.1.13 this consumes about 250Mb. I was able to find that this problem happens on 9.1.13, 9.2.9 and 9.3.5. Note that this doesn't happen on latest 9.1.14 - memory usage is about 26 Mb after the above scenario. On 9.4beta2 the memory consumption grows very much like on 9.3.5 but after about a minute of connection being IDLE it decreases. If I add index on the field which is used for filtering private memory consumption is much smaller (2-10 Mb). This problem can be reliably reproducible only after restart of postgres. After some time when some statements are executed on the database the problem sometimes gone. Using of a connection pool makes this problem more painful. My initial intention was to upgrade to 9.3 but it also has this problem. This looks very much like a bug for me which affects latest stable version (9.3.5) but I'm not sure that my method to diagnose the problem is correct. Should I fine a bug regarding this? Regards, Roman Konoval
Roman Konoval <rkonoval@gmail.com> writes: > The simplified test which demonstrates this problem. > $ psql -U postgres postgres > psql (9.1.11, server 9.3.5) > postgres=# create table t (i integer, t text); > CREATE TABLE > postgres=# insert into t select v, > '01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789' > || text(v) from generate_series(1,4000000) v; > INSERT 0 4000000 > postgres=# \q > restart postgres > $ psql -U postgres postgres > execute 500 times the following command in the same connection > postgres=# delete from t where t = ''; I see absolutely no leak from this example in any existing release branch. > This problem can be reliably reproducible only after restart of postgres. That sounds suspiciously like what you are counting is a process's accesses to shared memory. You did not say what shared_buffers setting you're using, but if the "leak" tops out at something close to your shared_buffers setting then that's almost certainly what it is. In Linux systems you should generally be looking at RES minus SHR not just RES to determine a process' private memory. regards, tom lane
On Fri, Sep 12, 2014 at 5:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Roman Konoval <rkonoval@gmail.com> writes: > > > > This problem can be reliably reproducible only after restart of postgres. > > That sounds suspiciously like what you are counting is a process's > accesses to shared memory. You did not say what shared_buffers setting > you're using, but if the "leak" tops out at something close to your > shared_buffers setting then that's almost certainly what it is. > In Linux systems you should generally be looking at RES minus SHR > not just RES to determine a process' private memory. > > Thanks for your reply, Tom. Your guess is correct I think. I'm using default shared_buffers settings. Different version of postgres have different default shared_buffers setting that's why I get different results. And the amount of memory I see is very much correlated with shared_buffers size: version shared_buffers max private_memory 9.1.13 320Mb 260Mb 9.1.14 24Mb 26Mb 9.3.5 128Mb 128Mb 9.4beta2 128Mb 128Mb By private memory here I mean the sum of Private_Dirty and Private_Clean values for every memory segment in /proc/<pid>/smaps. The main portion of the memory is mapped to segment associated with file /SYSV0052e2c1. I suppose this is file used by postgres to share between processes. This is how it looks for one process: 7fa461c56000-7fa46a8e2000 rw-s 00000000 00:04 32768 /SYSV0052e2c1 (deleted) Size: 143920 kB Rss: 131876 kB Pss: 130077 kB Shared_Clean: 0 kB Shared_Dirty: 3188 kB Private_Clean: 0 kB Private_Dirty: 128688 kB Referenced: 131876 kB Anonymous: 0 kB AnonHugePages: 0 kB Swap: 0 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Locked: 0 kB When I run several processed executing the same query the memory is moved to shared: 7fa461c56000-7fa46a8e2000 rw-s 00000000 00:04 32768 /SYSV0052e2c1 (deleted) Size: 143920 kB Rss: 131876 kB Pss: 65270 kB Shared_Clean: 0 kB Shared_Dirty: 131872 kB Private_Clean: 0 kB Private_Dirty: 4 kB Referenced: 130880 kB Anonymous: 0 kB AnonHugePages: 0 kB Swap: 0 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Locked: 0 kB So I was confusing private memory of the process with the portion of shared memory modified by this process alone. My understanding now is that private memory of the process is only the one associated with heap and stack mappings: 7fa470bbf000-7fa470c93000 rw-p 00000000 00:00 0 [heap] Size: 848 kB Rss: 648 kB Pss: 648 kB Shared_Clean: 0 kB Shared_Dirty: 0 kB Private_Clean: 0 kB Private_Dirty: 648 kB Referenced: 648 kB Anonymous: 648 kB AnonHugePages: 0 kB Swap: 0 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Locked: 0 kB VmFlags: rd wr mr mw me ac 7fffae0f3000-7fffae122000 rw-p 00000000 00:00 0 [stack] Size: 192 kB Rss: 148 kB Pss: 38 kB Shared_Clean: 0 kB Shared_Dirty: 128 kB Private_Clean: 0 kB Private_Dirty: 20 kB Referenced: 20 kB Anonymous: 148 kB AnonHugePages: 0 kB Swap: 0 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Locked: 0 kB regards, tom lane >
Roman Konoval <rkonoval@gmail.com> writes: > On Fri, Sep 12, 2014 at 5:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> In Linux systems you should generally be looking at RES minus SHR >> not just RES to determine a process' private memory. > By private memory here I mean the sum of Private_Dirty and Private_Clean > values for every memory segment in /proc/<pid>/smaps. Hm. I'm not terribly familiar with that API, but the notion that it *ever* counts shared memory as "Private" sounds pretty bogus from here. I'd suggest filing a bug against whichever kernel you're using. regards, tom lane
On Fri, Sep 12, 2014 at 4:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Roman Konoval <rkonoval@gmail.com> writes: > > By private memory here I mean the sum of Private_Dirty and Private_Clean > > values for every memory segment in /proc/<pid>/smaps. > > Hm. I'm not terribly familiar with that API, but the notion that it > *ever* counts shared memory as "Private" sounds pretty bogus from here. > I'd suggest filing a bug against whichever kernel you're using. > This observation is doubled by a side note in this answer http://unix.stackexchange.com/questions/33381/getting-information-about-a-process-memory-usage-from-proc-pid-smaps > Note that a "share-able" page is counted as a private mapping until it is *actually* shared. i.e. if there is only one process currently using libfoo , > that library's text section will appear in the process's *private* mappings. It will be accounted in the shared mappings (and removed from the private > ones) only if/when another process starts using that library. It looks like this is by design and for kernel there is distinction between share-able and shared. Regards, Roman Konoval