Thread: pg_dump on older version of postgres eating huge amounts of memory
I'm in a bit of a bad spot. We have a number of remotely deployed postgres servers. We are very happy with the dependability of postgres, but have recently run into a problem with pg_dump on SOME of the servers.

When doing a pg_dump databasename > file.dmp, pg_dump eats up all the memory and is eventually killed because of its huge memory footprint. The dump file is always 0 bytes. It works perfectly fine on our other servers. When we run pg_dump in verbose mode, the last thing it reports is that it is fetching the trigger information.

Does anybody have any clue what could be happening, or how I can help narrow down the problem? The machines are only accessible to us by modem dialup.

Postgres version 7.1.2 (yes, I know it's ancient, but again, these are a number of servers remotely deployed...), Red Hat 7.1 (see above). I know this is an old version of postgres, but I would really appreciate any help I can get trying to track this down.

Steve
Re: pg_dump on older version of postgres eating huge amounts of memory
From: Martijn van Oosterhout
On Thu, Mar 18, 2004 at 07:37:54PM -0500, Steve Krall wrote:
> I'm in a bit of a bad spot. We have a number of remotely deployed
> postgres servers. We are very happy with the dependability of postgres,
> but have recently run into a problem with pg_dump in SOME of the servers.
>
> When doing a pg_dump databasename > file.dmp pg_dump eats up all the
> memory, and is eventually killed because it takes
> up a huge amount of memory. The dump file is always 0 bytes. It works
> perfectly fine on our other server.

The only reason I can remember pg_dump running out of memory was when using the -d or -D options on large tables. This used to be an issue but has long since been fixed. If this is the problem, you can use a 7.4 pg_dump against a 7.1 database; the output might not be reloadable into a 7.1 DB, though. Or don't use -d or -D.

Or this is not the problem at all...

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Steve Krall <swalker@iglou.com> writes:
> When doing a pg_dump databasename > file.dmp pg_dump eats up all the
> memory, and is eventually killed because it takes
> up a huge amount of memory. The dump file is always 0 bytes. It works
> perfectly fine on our other server.
>
> When we run pg_dump in verbose mode, it is trying to get the trigger
> information.

Can you run it with query logging turned on, so we can see what query(s) it's issuing while it eats RAM?

regards, tom lane
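For anyone trying to reproduce this: in the 7.1-era server, the per-query logging Tom asks for was controlled by the debug settings in postgresql.conf (the log_statement machinery of later releases didn't exist yet). A sketch from memory, so treat the exact option names and values as assumptions to check against your installed docs:

```
# postgresql.conf, PostgreSQL 7.1-era settings (names from memory)
debug_print_query = true    # write the text of each query to the server log
debug_level = 2             # optional: more verbose backend debug output
```

Follow this with a postmaster restart to be safe; 7.1 predates the convenient reload behavior of later releases for most settings.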
I've done this with the debug level all the way up. You can get the file here (20 megs uncompressed, 130K compressed):

http://www.papajohns.com/postgres/postgres.log.bz2

While this dump was running, top reported that pg_dump was taking up around 500-550 megs. Then the machine stopped responding.

If you need anything else, just ask.

Steve

On Thu, 18 Mar 2004, Tom Lane wrote:
> Can you run it with query logging turned on, so we can see what query(s)
> it's issuing while it eats RAM?
>
> regards, tom lane
We do have quite a few triggers. Some are RI triggers postgres automagically creates. The remaining triggers are used for things like audit tables, or logging who updated a row, etc...

I did what you suggested. I ran pg_dump dbname > file.dmp and, immediately afterwards, kill -ABRT pid. I ended up with a 300 meg core file. Just viewing the file, it has a ton of DROP TRIGGER statements and only a few CREATE TRIGGER statements:

[root@r-and-d /tmp]# strings core | grep "DROP TRIGGER" | wc -l
5450219
[root@r-and-d /tmp]# strings core | grep "CREATE TRIGGER" | wc -l
466

We have a lot of triggers, but not 5.5 million :) So I took a couple of tables and looked at the number of triggers for each:

[root@r-and-d /tmp]# strings core | grep "DROP TRIGGER .* \"data_owner\"" | sort | uniq | wc -l
107
[root@r-and-d /tmp]# strings core | grep "DROP TRIGGER .* \"data_owner\"" | wc -l
347550
[root@r-and-d /tmp]# strings core | grep "DROP TRIGGER .* \"product\"" | sort | uniq | wc -l
151
[root@r-and-d /tmp]# strings core | grep "DROP TRIGGER .* \"product\"" | wc -l
346696

The backtrace of the core matches the getTables() loop you describe:

(gdb) bt
#0  0x401d7322 in __libc_send () from /lib/i686/libc.so.6
#1  0x40020675 in pqFlush () at eval.c:41
#2  0x4001e127 in PQsendQuery () at eval.c:41
#3  0x4001eb21 in PQexec () at eval.c:41
#4  0x0804d312 in getTables () at eval.c:41
#5  0x0805283e in dumpSchema () at eval.c:41
#6  0x0804adb2 in main () at eval.c:41
#7  0x4010c177 in __libc_start_main (main=0x804a5f4 <main>, argc=2,
    ubp_av=0xbfffe204, init=0x804954c <_init>, fini=0x8059c50 <_fini>,
    rtld_fini=0x4000e184 <_dl_fini>, stack_end=0xbfffe1fc)
    at ../sysdeps/generic/libc-start.c:129
(gdb)

Again, this happens only on some of the servers. But we downloaded the entire db directory from one of those servers and ran postgres against that directory, and we see the same results. So it would seem that something in the data causes pg_dump to freak out...
Is there a query that I can run against the system catalog that would help?

Steve

On Fri, 19 Mar 2004, Tom Lane wrote:
> Hmm. The trace looks completely unexceptional --- it's just running
> through your tables collecting index and trigger info (the loop in
> getTables() in pg_dump.c). You do seem to have rather a lot of
> triggers, but not 500 megs worth.
> ...
> The next move I can think of is to "kill -ABRT" the pg_dump run after
> it's gotten to some moderate size (50Meg at most) and then manually poke
> through the resulting core file to get a sense of what it's filling
> memory with.
>
> regards, tom lane
Steve Krall <swalker@iglou.com> writes:
> You can get the file here (20 megs uncompressed, 130K compressed):
> http://www.papajohns.com/postgres/postgres.log.bz2
> While this dump was running, top reported that pg_dump was taking up
> around 500-550megs. Then the machine stopped responding.

Hmm. The trace looks completely unexceptional --- it's just running through your tables collecting index and trigger info (the loop in getTables() in pg_dump.c). You do seem to have rather a lot of triggers, but not 500 megs' worth.

Digging in the 7.1 source code, I notice that there is a small leak in this loop: the query results from the two index-related queries are never freed. There should be a "PQclear(res2);" at line 2313 and another at line 2386. (Each at the end of the scope of the "res2" local variables; the line numbers might be a bit different in 7.1.2 than in the 7.1.3 code I'm looking at.) However, the trace shows that these queries are executed a couple hundred times apiece, and the waste from the unfreed query results shouldn't exceed a couple K each, so this doesn't explain hundreds of megs of bloat either. Still, you might try fixing it and see if it makes a difference.

The next move I can think of is to "kill -ABRT" the pg_dump run after it's gotten to some moderate size (50 meg at most) and then manually poke through the resulting core file to get a sense of what it's filling memory with.

regards, tom lane
Steve Krall <swalker@iglou.com> writes:
> [root@r-and-d /tmp]# strings core | grep "DROP TRIGGER" | wc -l
> 5450219
>
> We have a lot of triggers, but not 5.5 million :)

Looks like a smoking gun to me ...

Armed with that, I went back to the 7.1 source code, and what I see is a huge (like O(N^2) in the number of triggers) leak in this routine. What it needs is a "resetPQExpBuffer(delqry);" call somewhere in the loop over triggers. I'd suggest putting it right before the line

    appendPQExpBuffer(delqry, "DROP TRIGGER %s ", fmtId(tgname, force_quotes));

I see the same leak in 7.2, btw, but it's gone in 7.3. Didn't look at prior releases ...

(It seems the reason no one noticed is that the constructed string isn't actually *used* anyplace. Sigh.)

regards, tom lane
I added the line at the place you suggested, and pg_dump now works as expected. Thank you for taking the time to debug this!

Steve

On Fri, 19 Mar 2004, Tom Lane wrote:
> Looks like a smoking gun to me ...
>
> Armed with that, I went back to the 7.1 source code, and what I see is a
> huge (like O(N^2) in the number of triggers) leak in this routine. What
> it needs is a "resetPQExpBuffer(delqry);" call somewhere in the loop
> over triggers.
> ...
> regards, tom lane