Thread: Is it a memory leak in PostgreSQL 7.4beta?
I have done some beta testing with PostgreSQL 7.4beta2. I have run a simple set of SQL statements 1 million times:

    START TRANSACTION ISOLATION LEVEL READ COMMITTED;
    INSERT INTO t_data (data) VALUES ('2500');
    UPDATE t_data SET data = '2500' WHERE data = '2500';
    DELETE FROM t_data WHERE data = '2500';
    COMMIT;

The interesting thing was that my postmaster needed around 4 MB of RAM when I started running my test script using

    psql test < script.sql

After about 2 1/2 hours the backend process already needed 11 MB of RAM. Looking at the output of top, the growth seems to be in the shared memory area:

    PID   USER  PRI  NI  SIZE   RSS  SHARE  STAT  %CPU  %MEM  TIME    COMMAND
    28899 hs    39   19  11456  11M  10620  R N   89.8   2.9  150:23  postmaster

This is very surprising to me, because I have no explanation why PostgreSQL should consume so much more memory than at the beginning of the test. There are no triggers or anything like that involved. The table I am working on consists of two columns (one timestamp, one int4).

In addition to that, I have made a test with a different set of SQL statements: 1500 concurrent transactions on my good old AMD Athlon 500 box running RedHat Linux 9. It worked out pretty fine, but the thing I came across was that memory consumption rose during the first two hours of the test (from about 1.5 GB to 1.7 GB of RAM). Pretty surprising as well.

Does anybody have an explanation for this behaviour?

Regards,
Hans

--
Cybertec Geschwinde u Schoenig
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/2952/30706; +43/664/233 90 75
www.cybertec.at, www.postgresql.at, kernel.cybertec.at
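The script itself wasn't posted; below is a minimal sketch of how such a script might be generated and fed to psql. The file names, database name, and iteration count are placeholders rather than the original setup.

    #!/bin/sh
    # Sketch only: build script.sql with one million copies of the
    # transaction shown above and feed it to psql.  File names, database
    # name and iteration count are placeholders; simple rather than fast.
    {
        echo "START TRANSACTION ISOLATION LEVEL READ COMMITTED;"
        echo "INSERT INTO t_data (data) VALUES ('2500');"
        echo "UPDATE t_data SET data = '2500' WHERE data = '2500';"
        echo "DELETE FROM t_data WHERE data = '2500';"
        echo "COMMIT;"
    } > tx.sql
    : > script.sql
    i=0
    while [ $i -lt 1000000 ]; do
        cat tx.sql >> script.sql
        i=$((i + 1))
    done
    psql test < script.sql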
Hans-Jürgen Schönig <hs@cybertec.at> writes:
> The interesting thing was that my postmaster needed around 4 MB of RAM
> when I started running my test script using ...
> After about 2 1/2 hours the backend process already needed 11 MB of RAM.

Hmm.  I tried

    create table t_data (data int4, ts timestamp default now());

followed by many repetitions of

    START TRANSACTION ISOLATION LEVEL READ COMMITTED;
    INSERT INTO t_data (data) VALUES ('2500');
    UPDATE t_data SET data = '2500' WHERE data = '2500';
    DELETE FROM t_data WHERE data = '2500';
    COMMIT;

I am seeing a slow but steady growth of the backend process on a Linux
box (RHL 8.0) --- top shows it growing a few K every few seconds.

But I see *zero* growth with the same test on HPUX 10.20.

A possible wild card is that the Postgres build I'm using on the Linux
box is compiled for profiling (-pg, no --enable-debug or --enable-cassert)
whereas the HPUX build has --enable-debug and --enable-cassert but no
profiling.  I'm not aware that there's any known memory leakage in
Linux' profiling support, though.

Can anyone else reproduce this, or confirm they don't see it?  What
platform, and what configure options?

			regards, tom lane
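For anyone trying to reproduce this, a small loop like the one below logs the backend's size over time instead of eyeballing top. The PID value is only an example; it would be whichever backend is running the test.

    #!/bin/sh
    # Sketch: sample the backend's virtual and resident size every 30 s
    # for as long as it is running.  The PID is an example value.
    PID=28899
    while kill -0 "$PID" 2>/dev/null; do
        date
        ps -o pid,vsz,rss,comm -p "$PID"
        sleep 30
    done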
On Sat, 30 Aug 2003, Tom Lane wrote:

> Hans-Jürgen Schönig <hs@cybertec.at> writes:
> > The interesting thing was that my postmaster needed around 4 MB of RAM
> > when I started running my test script using ...
> > After about 2 1/2 hours the backend process already needed 11 MB of RAM.
>
> Hmm.  I tried
>
>     create table t_data (data int4, ts timestamp default now());
>
> followed by many repetitions of
>
>     START TRANSACTION ISOLATION LEVEL READ COMMITTED;
>     INSERT INTO t_data (data) VALUES ('2500');
>     UPDATE t_data SET data = '2500' WHERE data = '2500';
>     DELETE FROM t_data WHERE data = '2500';
>     COMMIT;
>
> I am seeing a slow but steady growth of the backend process on a Linux
> box (RHL 8.0) --- top shows it growing a few K every few seconds.
>
> But I see *zero* growth with the same test on HPUX 10.20.
>
> A possible wild card is that the Postgres build I'm using on the Linux
> box is compiled for profiling (-pg, no --enable-debug or --enable-cassert)
> whereas the HPUX build has --enable-debug and --enable-cassert but no
> profiling.  I'm not aware that there's any known memory leakage in
> Linux' profiling support, though.
>
> Can anyone else reproduce this, or confirm they don't see it?  What
> platform, and what configure options?

RHL9, --enable-debug, CFLAGS as -O0

Doing the above sequence many times from a script piped into psql, I'm
seeing RSS increasing for the backend as it goes along, about the same
as yours it seems.
On Sat, 30 Aug 2003, Stephan Szabo wrote:

> On Sat, 30 Aug 2003, Tom Lane wrote:
>
> > Hans-Jürgen Schönig <hs@cybertec.at> writes:
> > > The interesting thing was that my postmaster needed around 4 MB of RAM
> > > when I started running my test script using ...
> > > After about 2 1/2 hours the backend process already needed 11 MB of RAM.
> >
> > Hmm.  I tried
> >
> >     create table t_data (data int4, ts timestamp default now());
> >
> > followed by many repetitions of
> >
> >     START TRANSACTION ISOLATION LEVEL READ COMMITTED;
> >     INSERT INTO t_data (data) VALUES ('2500');
> >     UPDATE t_data SET data = '2500' WHERE data = '2500';
> >     DELETE FROM t_data WHERE data = '2500';
> >     COMMIT;
> >
> > I am seeing a slow but steady growth of the backend process on a Linux
> > box (RHL 8.0) --- top shows it growing a few K every few seconds.
> >
> > But I see *zero* growth with the same test on HPUX 10.20.
> >
> > A possible wild card is that the Postgres build I'm using on the Linux
> > box is compiled for profiling (-pg, no --enable-debug or --enable-cassert)
> > whereas the HPUX build has --enable-debug and --enable-cassert but no
> > profiling.  I'm not aware that there's any known memory leakage in
> > Linux' profiling support, though.
> >
> > Can anyone else reproduce this, or confirm they don't see it?  What
> > platform, and what configure options?
>
> RHL9, --enable-debug, CFLAGS as -O0
>
> Doing the above sequence many times from a script piped into psql, I'm
> seeing RSS increasing for the backend as it goes along, about the same
> as yours it seems.

I rebuilt without debug, and ran just the start/insert/commit sequence
over and over and noticed that on my machine it looked to grow as above,
but that if I let it go long enough it seemed to basically stop (or at
least the growth was slow enough to go unnoticed compared to the easily
noticeable growth before).  I'm running the full sequence now, but it's
going to be a while before it stops or gets up to the place where it
stopped in the s/i/c sequence.
Stephan Szabo wrote:
> I rebuilt without debug, and ran just the start/insert/commit sequence
> over and over and noticed that on my machine it looked to grow as above,
> but that if I let it go long enough it seemed to basically stop (or at
> least the growth was slow enough to go unnoticed compared to the easily
> noticeable growth before).  I'm running the full sequence now, but it's
> going to be a while before it stops or gets up to the place where it
> stopped in the s/i/c sequence.

This is the Pg backend line from top after about 90 minutes runtime:

    PID   USER      PR  NI  VIRT  RES   SHR   S  %CPU  %MEM  TIME+     COMMAND
    16083 postgres  17   0  9172  9172  8524  R  94.7   2.4  84:59.68  postmaster

No sign of the shared-memory growth stopping at this stage...

Pg built with --disable-debug --without-zlib

Platform is Linux 2.4.21+ xfs (Mandrake 9.2beta)

regards

Mark
Hans-Jürgen Schönig wrote:
>
> I can hardly imagine that the backend started working with 9 MB of
> memory. What did you do that made PostgreSQL need so much memory from
> the beginning? Are you using the default settings? Usually the
> postmaster does not need more than 3 MB at startup (in this scenario).

The setup is completely default - i.e. run initdb and start the server
after that.

I am running an embedded SQL program to do the test rather than an SQL
script (see enclosed); not sure why/if that would make any difference.

On the cautionary side, note that I am using a beta Linux distribution
too.

regards

Mark
Hans-Jürgen Schönig <hs@cybertec.at> writes:
> I can hardly imagine that the backend started working with 9 MB of
> memory. What did you do that made PostgreSQL need so much memory from
> the beginning?

On some platforms, "top" seems to count the Postgres shared memory block
as part of the memory space of each backend.  If it does so on Mark's
system then that would be a pretty reasonable initial report.

			regards, tom lane
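A quick way to check that on a given box, assuming the server runs as the usual postgres user (the user name and column choice are just examples):

    # Sketch: compare the size of the Postgres shared memory segment with
    # what ps reports for each backend.  The user name is an example.
    ipcs -m                              # note the segment owned by the postgres user
    ps -u postgres -o pid,vsz,rss,comm   # per-backend figures as the OS reports them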
"Tom Lane" <tgl@sss.pgh.pa.us> wrote: > Hans-Jürgen Schönig <hs@cybertec.at> writes: > > I can hardly imagine that the backend started working with 9mb of > > memory. what did you do that PostgreSQL needed so much memory from the > > beginning??? > > On some platforms, "top" seems to count the Postgres shared memory block > as part of the memory space of each backend. If it does so in Mark's > system then that would be a pretty reasonable initial report. I'm leaving for vacation so I will not have time to try mpatrol and see what is going on. You can find mpatrol here: http://www.cbmamiga.demon.co.uk/mpatrol/ if some one have time may be usefull try and see the logs. Regards Gaetno Mendola
> This is the Pg backend line from top after about 90 minutes runtime:
>
>     PID   USER      PR  NI  VIRT  RES   SHR   S  %CPU  %MEM  TIME+     COMMAND
>     16083 postgres  17   0  9172  9172  8524  R  94.7   2.4  84:59.68  postmaster
>
> No sign of the shared-memory growth stopping at this stage...
>
> Pg built with --disable-debug --without-zlib
>
> Platform is Linux 2.4.21+ xfs (Mandrake 9.2beta)
>
> regards
>
> Mark

I can hardly imagine that the backend started working with 9 MB of
memory. What did you do that made PostgreSQL need so much memory from
the beginning? Are you using the default settings? Usually the
postmaster does not need more than 3 MB at startup (in this scenario).

Regards,
Hans

--
Cybertec Geschwinde u Schoenig
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/2952/30706; +43/664/233 90 75
www.cybertec.at, www.postgresql.at, kernel.cybertec.at
> I am seeing a slow but steady growth of the backend process on a Linux
> box (RHL 8.0) --- top shows it growing a few K every few seconds.
>
> But I see *zero* growth with the same test on HPUX 10.20.
>
> A possible wild card is that the Postgres build I'm using on the Linux
> box is compiled for profiling (-pg, no --enable-debug or --enable-cassert)
> whereas the HPUX build has --enable-debug and --enable-cassert but no
> profiling.  I'm not aware that there's any known memory leakage in
> Linux' profiling support, though.
>
> Can anyone else reproduce this, or confirm they don't see it?  What
> platform, and what configure options?

I have tried it on our Ultra Sparc 10 running Debian (Woody). Same
problem: the postmaster starts at around 2.2 MB and keeps allocating
memory :(

Maybe I can test it on AIX within the next two weeks (I am still waiting
for the machine to arrive).

Regards,
Hans

--
Cybertec Geschwinde u Schoenig
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/2952/30706; +43/664/233 90 75
www.cybertec.at, www.postgresql.at, kernel.cybertec.at
> I can hardly imagine that the backend started working with 9 MB of
> memory. What did you do that made PostgreSQL need so much memory from
> the beginning? Are you using the default settings? Usually the
> postmaster does not need more than 3 MB at startup (in this scenario).

On FreeBSD, every one of my backends appears to occupy 20 MB. That's
because the SHM is added to _each_ process.

Chris
Hans,

You are right about the startup memory - here is the top line for a few
seconds after startup:

    PID   USER      PR  NI  VIRT  RES   SHR   S  %CPU  %MEM  TIME+    COMMAND
    10116 postgres  15   0  3816  3816  3180  R  33.8   1.0  0:01.03  postmaster

It seems that VIRT, RES and SHR all get the increase counted against them
as time goes on (as Tom suggested, I guess it's to do with how top does
its accounting on this platform).

Hans-Jürgen Schönig wrote:
>
> I can hardly imagine that the backend started working with 9 MB of
> memory. What did you do that made PostgreSQL need so much memory from
> the beginning? Are you using the default settings? Usually the
> postmaster does not need more than 3 MB at startup (in this scenario).
Have we determined there _isn't_ a memory leak problem in beta2?

---------------------------------------------------------------------------

Tom Lane wrote:
> Hans-Jürgen Schönig <hs@cybertec.at> writes:
> > The interesting thing was that my postmaster needed around 4 MB of RAM
> > when I started running my test script using ...
> > After about 2 1/2 hours the backend process already needed 11 MB of RAM.
>
> Hmm.  I tried
>
>     create table t_data (data int4, ts timestamp default now());
>
> followed by many repetitions of
>
>     START TRANSACTION ISOLATION LEVEL READ COMMITTED;
>     INSERT INTO t_data (data) VALUES ('2500');
>     UPDATE t_data SET data = '2500' WHERE data = '2500';
>     DELETE FROM t_data WHERE data = '2500';
>     COMMIT;
>
> I am seeing a slow but steady growth of the backend process on a Linux
> box (RHL 8.0) --- top shows it growing a few K every few seconds.
>
> But I see *zero* growth with the same test on HPUX 10.20.
>
> A possible wild card is that the Postgres build I'm using on the Linux
> box is compiled for profiling (-pg, no --enable-debug or --enable-cassert)
> whereas the HPUX build has --enable-debug and --enable-cassert but no
> profiling.  I'm not aware that there's any known memory leakage in
> Linux' profiling support, though.
>
> Can anyone else reproduce this, or confirm they don't see it?  What
> platform, and what configure options?
>
> 			regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Have we determined there _isn't_ a memory leak problem in beta2?

I am not sure.  I have a suspicion that there is no real leak, but
rather we are seeing some artifact of the way Linux' top(1) reports
memory usage.  I cannot prove that --- I can only offer the evidence
that the exact same PG sources running the exact same queries on a
different OS (HPUX) show no memory leak.

It would be useful to hear some more reports of the test case from
people with other OSes.

			regards, tom lane
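For reference, something along these lines gathers the details being asked for. The config.log path is only an example, and the --configure option assumes a pg_config new enough to provide it.

    # Sketch: gather the platform and build details requested above.
    uname -a                              # platform and kernel
    pg_config --configure                 # configure options, if pg_config supports it
    grep '\$ \./configure' /usr/src/pgsql/config.log   # alternative, from the build tree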
On Wed, 10 Sep 2003 00:18:52 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I have a suspicion that there is no real leak, but
> rather we are seeing some artifact of the way Linux' top(1) reports
> memory usage.

From my experience I can confirm that.  I looked at top output a lot
when I benchmarked my heap tuple header changes last year.  ISTM Linux
accounts a shared memory page (for whatever a page is) to the memory
usage of a process as soon as the process has touched that page.

7.4 starts up with 1000 shared buffers by default, that is 8 MB of
shared memory.  Add some space for FSM, connections and other shared
stuff, and 10 MB of shared memory is quite plausible.

We've seen nobody complaining that his backend got much bigger than
11 MB.  It's been more than a week since the OP posted his observation.
If his backend had grown to 100 MB after 24 hours, he would have told
us ...

Servus
 Manfred
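One way to watch that accounting directly on Linux, assuming a /proc with the usual 4 kB pages; the PID is just an example value, not one from the thread:

    # Sketch: read a backend's memory accounting straight from /proc.
    # The first three fields of statm are total, resident and shared,
    # counted in pages (assumed to be 4 kB here).
    PID=16083
    awk '{ printf "total=%d kB  resident=%d kB  shared=%d kB\n",
                  $1 * 4, $2 * 4, $3 * 4 }' /proc/$PID/statm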
Manfred Koizar <mkoi-pg@aon.at> writes:
> On Wed, 10 Sep 2003 00:18:52 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I have a suspicion that there is no real leak, but
>> rather we are seeing some artifact of the way Linux' top(1) reports
>> memory usage.

> From my experience I can confirm that.  I looked at top output a lot
> when I benchmarked my heap tuple header changes last year.  ISTM Linux
> accounts a shared memory page (for whatever a page is) to the memory
> usage of a process as soon as the process has touched that page.

But not otherwise, huh?  Okay, that would explain things.

> 7.4 starts up with 1000 shared buffers by default, that is 8 MB of
> shared memory.  Add some space for FSM, connections and other shared
> stuff, and 10 MB of shared memory is quite plausible.

In fact, according to ipcs, the default shared memory segment size for
CVS tip on Linux (RHL 8.0) is 10436608 bytes.

I see that a freshly started backend is shown as having SHARE 1896, but
a large seqscan query (which would cause it to touch all the shared
buffers in fairly short order) makes the report jump to 6500.  This
doesn't seem to quite square with your explanation though --- surely the
number should go to 8000 and change?  The man page for top says these
numbers are in kilobytes ... but if they were really measured in, say,
4K pages, then we'd be talking about 26M of shared memory touched, which
might be plausible when you consider shared libraries.  2K pages would
make the numbers even more plausible, but that seems like an unlikely
page size.

			regards, tom lane
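A sketch of the kind of check being described, in case anyone wants to repeat it: the database and table names are invented, and the catalog cross join is just a cheap way to build well over 8 MB of heap on a 7.4-era server.

    # Sketch: make one backend touch all 1000 default shared buffers (8 MB)
    # by seqscanning a table bigger than that, then check that backend's
    # SHARE column in top from another terminal while the session is open.
    {
        echo "CREATE TABLE t_fill AS"
        echo "    SELECT c.relname, repeat('x', 200) AS pad"
        echo "    FROM pg_class c, pg_attribute a;"
        echo "SELECT count(*) FROM t_fill;"
        printf '%s\n' '\! sleep 600'      # keep the backend alive for inspection
    } > fill.sql
    psql test -f fill.sql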
I said:
> This doesn't seem to quite square with your explanation though ---
> surely the number should go to 8000 and change?  The man page for top
> says these numbers are in kilobytes ... but if they were really
> measured in, say, 4K pages, then we'd be talking about 26M of shared
> memory touched, which might be plausible when you consider shared
> libraries.

Never mind --- further testing shows that top does report in kilobytes.
I made a silly mistake in writing my test query that prevented it from
using as many buffers as I expected.  When I write something that really
does use all 1000 buffers, SHARE goes to 10392, which is right about
what you'd expect.

So I think this mystery is solved.  Back to chasing real bugs ...

			regards, tom lane