Thread: memory leak under heavy load?
Hi,

I think I've encountered a bug in PostgreSQL 8.1. That said, I'm not really into submitting it to -bugs, as I have no way to reproduce it reliably.

Basically: I have a server with dual Opterons, 4 GB of memory, and 2 GB of swap, everything running under CentOS 4.2. PostgreSQL 8.1 was compiled from source using:
./configure \
--prefix=/home/pgdba/work \
--without-debug \
--disable-debug \
--with-pgport=5810 \
--with-tcl \
--with-perl \
--with-python \
--without-krb4 \
--without-krb5 \
--without-pam \
--without-rendezvous \
--with-openssl \
--with-readline \
--with-zlib \
--with-gnu-ld
postgresql.conf looks like this (I removed the commented lines):
listen_addresses = '*'
max_connections = 250
superuser_reserved_connections = 10
password_encryption = on
shared_buffers = 50000
temp_buffers = 1000
max_prepared_transactions = 250
work_mem = 10240
maintenance_work_mem = 131072
max_fsm_pages = 500000
max_fsm_relations = 5000
fsync = off
wal_buffers = 100
commit_delay = 1000
commit_siblings = 5
checkpoint_segments = 100
effective_cache_size = 196608
random_page_cost = 1.5
default_statistics_target = 50
log_destination = 'stderr'
redirect_stderr = on
log_directory = '/home/pgdba/logs/'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_age = 1440
log_rotation_size = 502400
log_min_duration_statement = 5000
log_connections = on
log_duration = off
log_line_prefix = '[%t] [%p] <%u@%d> '
log_statement = 'none'
stats_start_collector = on
stats_command_string = on
stats_block_level = on
stats_row_level = on
stats_reset_on_server_start = on
autovacuum = off
autovacuum_naptime = 60
autovacuum_vacuum_threshold = 1000
autovacuum_analyze_threshold = 500
autovacuum_vacuum_scale_factor = 0.4
autovacuum_analyze_scale_factor = 0.2
check_function_bodies = on
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
custom_variable_classes = 'plperl'
plperl.use_strict = true
Everything works nicely, but: I ran a loop of about 400,000 inserts wrapped in transactions, two inserts per transaction. In total that was nearly 200,000 transactions, all very fast, with no (or very little) time between them. The inserts went to two distinct tables, and (maybe this is important) about 99% of them failed with a "unique index violation".

What I saw is that the postmaster process started to "eat" memory. It allocated *all* memory (both RAM and swap), and then died. Load on the machine jumped to around 20.

It's very strange to me, since the next such run didn't break Postgres, but then another one did. I am unable to replay the scenario with a 100% guarantee that it will crash the backend.

As for the inserts: there were no triggers, no rules, not even foreign keys. The encoding is UTF-8.

Has anybody else encountered this? What can I do to make it possible to fix the problem?

And yes, I know I should use COPY instead of INSERTs, and bigger transactions, but I would like to have the problem fixed, not worked around. Actually, I do have a workaround in place for now, and it works: every 100 transactions I disconnect and reconnect. That way it keeps running.
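The workaround amounts to something like this (a sketch; the table, credentials, and batch size here are illustrative, not the original ones):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Sketch of the workaround: reconnect every 100 transactions so the
# backend process (and whatever memory it accumulates) is thrown away.
my $dbh;
my $txns = 0;

sub fresh_handle {
    $dbh->disconnect() if $dbh;
    $dbh = DBI->connect("dbi:Pg:dbname=leak", "pgdba", "",
        { AutoCommit => 0, RaiseError => 0, PrintError => 0 })
        or die "cannot connect\n";
}

fresh_handle();
for my $n (1 .. 400_000) {
    # Most of these fail with a unique-index violation, as in the real run.
    $dbh->do("INSERT INTO users (username) VALUES (?)", undef, "user$n");
    $dbh->commit();
    fresh_handle() if ++$txns % 100 == 0;   # drop the backend, start clean
}
$dbh->disconnect();
```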
depesz
On 11/29/05, hubert depesz lubaczewski <depesz@gmail.com> wrote:
> i think i've encountered a bug in postgresql 8.1.

Now I'm nearly positive it's a bug. I created the database this way:

CREATE DATABASE leak;
\c leak
CREATE TABLE users (id serial PRIMARY KEY, username TEXT NOT NULL DEFAULT '', password TEXT);

Then I made a list of "usernames" from "aaaa" to "czzz", keeping each one with 97% probability (so about 3% are missing), and COPYed this list into users.
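The generator itself wasn't posted; a sketch that does the same thing (output is one username per line, suitable for COPY; the filename is illustrative):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch of the unposted generator: emit "aaaa" .. "czzz", keeping each
# username with 97% probability, one per line for COPY.
open my $fh, '>', 'usernames.txt' or die "cannot open: $!";
for (my $u = "aaaa"; ; $u++) {       # Perl string auto-increment
    print {$fh} "$u\n" if rand() < 0.97;
    last if $u eq "czzz";
}
close $fh;
```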
Then:

CREATE UNIQUE INDEX xxx ON users (username);
VACUUM VERBOSE ANALYZE;

After all of this I ran this script:
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# AutoCommit off; errors are ignored on purpose -- most INSERTs are
# expected to fail with a unique-index violation.
my $dbh = DBI->connect("dbi:Pg:dbname=leak", "pgdba", "",
    { AutoCommit => 0, RaiseError => 0, PrintError => 0 })
    or die "cannot connect\n";

for (1 .. 1000) {
    my $i = 0;
    my $q = "zzz";    # string auto-increments: zzz -> aaaa -> ... -> fzzz
    while (1) {
        $dbh->rollback();    # clear any aborted-transaction state
        $dbh->do("INSERT INTO users (username) VALUES ('$q')");
        $dbh->commit();
        $q++;
        last if $q eq 'fzzz';
        $i++;
        system("ps uxf") if 0 == $i % 1000;    # watch backend memory
    }
    print "one iteration done\n";
    <>;    # wait for Enter before the next iteration
}
$dbh->disconnect();
Technically, I think memory usage should not increase after the first "iteration", but it does. It grows in small amounts, but it does grow. Can anybody test/confirm the problem?
depesz
Are you sure this isn't just PostgreSQL caching data? A complete test case would help, too (i.e., whatever you used to generate the initial data).

--
Jim C. Nasby, Sr. Engineering Consultant    jnasby@pervasive.com
Pervasive Software    http://pervasive.com
On 11/29/05, Jim C. Nasby <jnasby@pervasive.com> wrote:
> Are you sure this isn't just PostgreSQL caching data?

I am not sure, but what bothers me is this: I set a limit of 50000 shared buffers, which works out to roughly 400 MB (50000 x 8 kB pages). How did it end up using 6 GB?

depesz
I have observed memory leaks as well... Is it happening on the Postgres side? I can send the valgrind logs.
From: pgsql-general-owner@postgresql.org on behalf of hubert depesz lubaczewski
Sent: Wed 11/30/2005 12:59 PM
To: Jim C. Nasby
Cc: PostgreSQL General
Subject: Re: [GENERAL] memory leak under heavy load?
> > Are you sure this isn't just PostgreSQL caching data?
>
> i am not sure. but what bothers me is:
> i setup limit of shared memory to 50000 buffers - which gives estimate
> 400 megabytes. how come it ended up using 6GB ?
>
> depesz
Probably best to open up a bug report.

--
Jim C. Nasby, Sr. Engineering Consultant    jnasby@pervasive.com
Pervasive Software    http://pervasive.com
> i think i've encountered a bug in postgresql 8.1.
> ...
> what i say is that postmaster user started to "eat" memory.
> it allocated *all* memory (both ram and swap), and then died.
> load on the machine jumped to something around 20.

I noticed a similar occurrence. We have a high-load PostgreSQL database -- not a ridiculous number of inserts or updates, but a huge variety of diverse queries on some 200 tables. We had noticed load averages of 3-4 on our database for the past couple of days. Then, this morning, Postgres got killed twice by the Linux out-of-memory process killer. (Also on a dual Opteron with 4 GB of memory.)

We were showing 3.5 GB of memory allocated to *something*, but stopping Postgres completely for a few seconds didn't lower the number. It wasn't held by any process, which leads me to believe that it's a kernel bug. One reboot later, everything is rosy: load hovers around 1.2, there's enough free memory to have a 2.5 GB buffer cache, and swap is untouched.

PostgreSQL 7.4 had run on this box flawlessly for six months -- bad RAM forced us to take it down -- then again for another month until we upgraded to 8.1 last week. Like the original poster, we're set up for ~500 MB of shared memory; certainly not enough to make the kernel kill -9 the postmaster. The kernel is 2.6.11-gentoo-r6, same as before the upgrade.

Also, this didn't happen in our test environment, which uses a similar but x86 server. Perhaps this is AMD64 related?

--Will Glynn
Freedom Healthcare
Will Glynn <wglynn@freedomhealthcare.org> wrote:
> Postgres completely for a few seconds didn't lower the number. It wasn't
> taken by any process, which leads me to believe that it's a kernel bug.

If it was a shared memory segment allocated a particular way (I *think* it's "shm_open", I'm not 100% sure), it's not erroneous for the kernel to leave it behind after all processes are gone; see http://lists.debian.org/debian-apache/2004/06/msg00188.html .

If Postgres needs this much shared memory and wants it to go away on a crash, I think (again, I'm a neophyte at this still; I haven't even fixed mod_bt for this yet) that an mmap()ed file is the way to go... but then don't you need enough hard drive space to back your shared memory? I don't know; this whole thing is confusing...

- Tyler
On Fri, Dec 02, 2005 at 03:53:20PM -0800, Tyler MacDonald wrote:
> If it was a shared memory segment allocated a particular way (I *think*
> it's "shm_open", I'm not 100% sure), it's not erroneous for the kernel
> to leave it behind after all processes are gone...

But this shouldn't be an issue here. If you set the IPC_RMID flag then the kernel should remove the segment when all users go away. This is standard IPC behaviour and is documented in the manpage...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
> But this shouldn't be an issue here. If you set the IPC_RMID flag then
> the kernel should remove the segment when all users go away. This is
> standard IPC behaviour and is documented in the manpage...

Would you please tell me where to find the manpage and how to set the IPC_RMID flag?

--
Kathy Lo
On Thu, Dec 08, 2005 at 12:29:11PM +0800, Kathy Lo wrote:
> Would you please tell me where to find the manpage and how to set the
> IPC_RMID flag?

See the shmctl() manpage:

    int shmctl(int shmid, int cmd, struct shmid_ds *buf);

One of the command ids is IPC_RMID.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
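For illustration, the same System V calls are also available as Perl built-ins (fitting the rest of this thread); a minimal sketch using a throwaway private segment:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRWXU);

# Create a small private shared memory segment, then mark it with
# IPC_RMID: the kernel removes it once the last attached process detaches.
my $id = shmget(IPC_PRIVATE, 4096, S_IRWXU);
defined $id or die "shmget: $!";
shmctl($id, IPC_RMID, 0) or die "shmctl: $!";
print "segment $id marked for removal\n";
```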
On 12/8/05, Martijn van Oosterhout <kleptog@svana.org> wrote:
> See the shmctl() manpage:
>
>     int shmctl(int shmid, int cmd, struct shmid_ds *buf);
>
> One of the command ids is IPC_RMID.

Do I need to change the source code of PostgreSQL if I want to set the IPC_RMID flag to solve this problem?

--
Kathy Lo
On Fri, Dec 09, 2005 at 12:15:44PM +0800, Kathy Lo wrote:
> Do I need to change the source code of postgresql if I want to set
> IPC_RMID flag to solve this problem?

No, because it's completely unrelated. The amount of shared memory doesn't vary while the system is running, and it gets removed once you restart Postgres. So it can't cause you to run out of memory (unless you allocated a truly huge amount of memory that way, but that's bad for other reasons). At worst it gets pushed into swap until you next start Postgres.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/