Re: Defunct postmasters - Mailing list pgsql-general

From Gavin Scott
Subject Re: Defunct postmasters
Msg-id 1014916567.1465.59.camel@gavin.pokerpages.com
In response to Re: Defunct postmasters  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Defunct postmasters  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Defunct postmasters  (Holger Marzen <holger@marzen.de>)
List pgsql-general
On Mon, 2002-02-25 at 16:59, Tom Lane wrote:
> Gavin Scott <gavin@pokerpages.com> writes:
> > We have lately begun having problems with our production database
> > running postgres 7.1 on linux kernel v 2.4.17.  The system had run
> > without incident for many months (there were occasional reboots).  Since
> > we upgraded to kernel 2.4.17 on Dec. 31 it ran non-stop without problem
> > until Feb 13, when postmaster appeared to stop taking new incoming
> > connections. We restarted and then the problem struck again Saturday
> > night (Feb 23).
>
> If it happens again, could you attach to the postmaster with gdb and get
> a stack trace from it?

Just happened again this morning.  It turns out this looks like an
openssl problem, not a postgresql one:

[root@me2 gavin]# gdb /usr/bin/postmaster 736
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(no debugging symbols found)...
/big/home/gavin/736: No such file or directory.
Attaching to program: /usr/bin/postmaster, Pid 736
Reading symbols from /usr/lib/libssl.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libssl.so.0
Reading symbols from /usr/lib/libcrypto.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libcrypto.so.0
Reading symbols from /usr/kerberos/lib/libkrb5.so.3...
(no debugging symbols found)...done.
Loaded symbols for /usr/kerberos/lib/libkrb5.so.3
Reading symbols from /usr/kerberos/lib/libk5crypto.so.3...
(no debugging symbols found)...done.
Loaded symbols for /usr/kerberos/lib/libk5crypto.so.3
Reading symbols from /usr/kerberos/lib/libcom_err.so.3...
(no debugging symbols found)...done.
Loaded symbols for /usr/kerberos/lib/libcom_err.so.3
Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libreadline.so.4.1...done.
Loaded symbols for /usr/lib/libreadline.so.4.1
Reading symbols from /lib/libtermcap.so.2...done.
Loaded symbols for /lib/libtermcap.so.2
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
0x40310424 in __libc_read () from /lib/libc.so.6
(gdb) where
#0  0x40310424 in __libc_read () from /lib/libc.so.6
#1  0x400f8e54 in __DTOR_END__ () from /usr/lib/libcrypto.so.0
#2  0x40090f53 in BIO_read () from /usr/lib/libcrypto.so.0
#3  0x4002dada in ssl3_read_n () from /usr/lib/libssl.so.0
#4  0x4002dbca in ssl3_get_record () from /usr/lib/libssl.so.0
#5  0x4002e633 in ssl3_read_bytes () from /usr/lib/libssl.so.0
#6  0x4002f517 in ssl3_get_message () from /usr/lib/libssl.so.0
#7  0x40027011 in ssl3_check_client_hello () from /usr/lib/libssl.so.0
#8  0x40026c19 in ssl3_accept () from /usr/lib/libssl.so.0
#9  0x40033842 in SSL_accept () from /usr/lib/libssl.so.0
#10 0x4003056a in ssl23_get_client_hello () from /usr/lib/libssl.so.0
#11 0x4002fd57 in ssl23_accept () from /usr/lib/libssl.so.0
#12 0x40033842 in SSL_accept () from /usr/lib/libssl.so.0
#13 0x80f136d in PostmasterMain ()
#14 0x80cf372 in PacketReceiveFragment ()
#15 0x80f10b1 in PostmasterMain ()
#16 0x80f0ac4 in PostmasterMain ()
#17 0x80cf7e8 in main ()
#18 0x40258f31 in __libc_start_main (main=0x80cf680 <main>, argc=5,
    ubp_av=0xbffffab4, init=0x8065e04 <_init>, fini=0x8154d60 <_fini>,
    rtld_fini=0x4000e274 <_dl_fini>, stack_end=0xbffffaac)
    at ../sysdeps/generic/libc-start.c:129

It does indeed seem to have been triggered by some odd networking
problems on the path to the machine that makes SSL connections to the
postgres server.
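
For anyone reading the trace later: frame #0 is a plain read() reached
from SSL_accept(), which appears to be called from inside the postmaster
itself. If that is right, a single client that connects and then goes
silent partway through the handshake would stall the process that
accepts every other connection. Below is a minimal Python sketch of that
failure mode and of bounding the wait with a timeout. It uses a plain
socket pair rather than real SSL, and the helper name
read_client_hello and the 0.2 second timeout are hypothetical, chosen
only for illustration; nothing here is taken from the PostgreSQL or
OpenSSL sources.

```python
import socket


def read_client_hello(conn, timeout_s):
    """Read the first handshake bytes, or return None if the peer stays silent.

    Hypothetical helper for illustration only.
    """
    # Without a timeout, recv() on a blocking socket waits indefinitely,
    # which is the state the gdb trace shows in frame #0.
    conn.settimeout(timeout_s)
    try:
        return conn.recv(5)
    except socket.timeout:
        return None


# A connected pair standing in for the server and client ends of one connection.
server_end, client_end = socket.socketpair()

# Case 1: the client has connected but sends nothing (e.g. packets lost en route).
silent_result = read_client_hello(server_end, 0.2)

# Case 2: the client sends the start of a handshake record.
client_end.sendall(b"\x16\x03\x00")
hello_result = read_client_hello(server_end, 0.2)

server_end.close()
client_end.close()
```

With the timeout in place the silent case returns instead of hanging,
so an accept loop built on it could drop the stalled client and carry
on. Doing the handshake in a child process after accepting would be
another way to keep one bad client from blocking the rest.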

We were running openssl 0.9.5 on that machine.  I've upgraded it to
0.9.6 for now and am going to check whether this was a known bug in
0.9.5.

Thanks for all your help!

Gavin Scott
gavin@pokerpages.com


