Thread: Backend with closed connection at 99% CPU

Backend with closed connection at 99% CPU

From
Guy Thornley
Date:
First, I'd better let you know that we have an in-house patch already on our
postgres, so this may be our breakage. It only started happening recently,
though, and our patch is quite old, so it is very unlikely.

I thought I'd ask here anyway, in case this was a known bug that was fixed
already. I couldn't see anything in the release notes, however.

Postgres 7.4.1. (Yes I know, we _should_ upgrade).

          PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        27583 postgres  15   0  163m 163m 159m R 97.2 16.2  14:36.01 postmaster

As the subject says, it is spinning at 99% CPU. Memory consumption does not
appear to be increasing.

This backend has recently lost its client connection, and it appeared after
I shut down a bunch of JDBC connections:

        uke19:~# netstat -np | grep 27583
        tcp        1      0 127.0.0.1:5432          127.0.0.1:35175         CLOSE_WAIT 27583/postgres

This happened 3 times last week, too. Never 'idle in transaction' until last
night, however, when it managed to lose its connection while 'idle in
transaction'. This left some things locked, and I had to kill it.

I only noticed this problem after messing with some settings in the
configuration file:

        max_connections = 88
        superuser_reserved_connections = 4
        wal_buffers = 544

Around the same time, I changed the Java code to close down the database
connections properly, calling conn.close() on pg connections. We are using
the 'postgresql-jdbc3.jar' that is in the 'libpgjava' Debian package.

You can even have a backtrace, how about that:

        (gdb) attach 27583
        Attaching to process 27583
        0x0811cf40 in enlargeStringInfo ()

        (gdb) bt
        #0  0x0811cf40 in enlargeStringInfo ()
        #1  0x081249b8 in pq_getmessage ()
        #2  0x0817bdfe in HandleFunctionRequest ()
        #3  0x0817bfda in HandleFunctionRequest ()
        #4  0x0817eacc in PostgresMain ()
        #5  0x0815877b in ClosePostmasterPorts ()
        #6  0x08158163 in ClosePostmasterPorts ()
        #7  0x08156658 in PostmasterMain ()
        #8  0x08155ce4 in PostmasterMain ()
        #9  0x08125cb6 in main ()
        #10 0x4026eda6 in __libc_start_main () from /lib/libc.so.6

I can get a stack data dump from gdb if requested, and I'll leave this
attached to gdb for now. I'll probably need to restart postgres soon (to try
some more settings) so I don't want to leave it attached _too_ long.. ;)

.Guy

Re: Backend with closed connection at 99% CPU

From
Tom Lane
Date:
Guy Thornley <guy@esphion.com> writes:
> Postgres 7.4.1. (Yes I know, we _should_ upgrade).

Yup.

> As the subject says, it is spinning at 99% CPU. ...
> This backend has recently lost its client connection, ...
> You can even have a backtrace, how about that:

>         (gdb) bt
>         #0  0x0811cf40 in enlargeStringInfo ()
>         #1  0x081249b8 in pq_getmessage ()
>         #2  0x0817bdfe in HandleFunctionRequest ()
>         #3  0x0817bfda in HandleFunctionRequest ()
>         #4  0x0817eacc in PostgresMain ()

I'm betting this is this bug:

2004-05-11 16:07  tgl

    * src/backend/lib/stringinfo.c (REL7_4_STABLE): Add tests to
    enlargeStringInfo() to avoid possible buffer-overrun or
    infinite-loop problems if a bogus data length is passed.

Somehow the dying client injected a few bogus bytes into the
communication channel, and managed to trigger the infinite-loop
variant of this bug.

            regards, tom lane
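
For readers who want to see how a bogus length can turn a buffer-growing
routine into a spin loop, here is a minimal sketch. It is not the PostgreSQL
source: naive_grow(), checked_grow(), and the max_alloc parameter are
hypothetical names made up for illustration, and the sketch uses unsigned
32-bit arithmetic so that the wraparound is well defined in C. The assumed
failure mode is a capacity that keeps doubling until it wraps to zero, after
which doubling never makes progress.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical unchecked growth: double the capacity until it covers
 * `needed`. Returns the number of doublings performed, or -1 once the
 * capacity has wrapped to 0. Without that early return the loop would
 * never end, because 0 * 2 is still 0 and `needed > capacity` stays true.
 */
static int naive_grow(uint32_t capacity, uint32_t needed, uint32_t *out)
{
    int steps = 0;

    assert(capacity > 0);       /* precondition: buffer already allocated */
    while (needed > capacity) {
        capacity *= 2;          /* unsigned multiply wraps modulo 2^32 */
        steps++;
        if (capacity == 0)
            return -1;          /* stand-in for the infinite loop */
    }
    *out = capacity;
    return steps;
}

/*
 * Hypothetical checked growth: reject a request larger than `max_alloc`
 * before looping. Assumes max_alloc <= 2^31, so the capacity is always
 * below 2^31 when it is doubled and cannot wrap.
 */
static int checked_grow(uint32_t capacity, uint32_t needed,
                        uint32_t max_alloc, uint32_t *out)
{
    if (capacity == 0 || needed > max_alloc)
        return -1;              /* bogus length: refuse instead of spinning */
    while (needed > capacity)
        capacity *= 2;
    if (capacity > max_alloc)
        capacity = max_alloc;   /* still >= needed, since needed <= max_alloc */
    *out = capacity;
    return 0;
}
```

With a starting capacity of 1024 and a requested length of 0xF0000000,
naive_grow() reports the wraparound, while checked_grow() with a max_alloc
of 1 << 30 rejects the request up front. A sane request such as 5000 bytes
goes through the checked version normally.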

Re: Backend with closed connection at 99% CPU

From
Shahbaz Javeed
Date:
Folks,

Here's the relevant discussion with Tom Lane's response.

http://archives.postgresql.org/pgsql-bugs/2004-09/msg00226.php

This bug is causing the load on our heavily used postgres server to go
into the 12-15 range.  Any idea whether this bug has been resolved?

Thanks

--
Shahbaz Javeed