* Dan Moschuk <dan@freebsd.org> [001012 09:47] wrote:
>
> Sparc solaris 2.7 with postgres 7.0.2
>
> It seems to be reproducible; the server crashes on us about once every
> few hours.
>
> Any ideas?
>
> GNU gdb 4.17
> Copyright 1998 Free Software Foundation, Inc.
[snip]
> #78 0x1dd210 in elog (lev=0,
> fmt=0x21a9b0 "Message from PostgreSQL backend:\n\tThe Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.\n\tI have rolled back the current transaction and am going "...)
> at elog.c:312
> #79 0x1636f8 in quickdie (postgres_signal_arg=16) at postgres.c:713
> #80 <signal handler called>
> #81 0xff195dd4 in _poll ()
> #82 0xff14e79c in select ()
> #83 0x14df58 in s_lock_sleep (spin=18) at s_lock.c:62
> #84 0x14dfa0 in s_lock (lock=0xff270011 "ÿ", file=0x2197c8 "spin.c", line=127)
> at s_lock.c:76
> #85 0x154620 in SpinAcquire (lockid=0) at spin.c:127
> #86 0x149100 in ReadBufferWithBufferLock (reln=0x2ce4e8, blockNum=4323,
> bufferLockHeld=1 '\001') at bufmgr.c:297

% uname -sr
SunOS 5.7

from sys/signal.h:

#define SIGUSR1 16 /* user defined signal 1 */

Are you sure you don't have an application running amok and sending
signals to processes it shouldn't? Getting a superfluous signal
seems out of place. This doesn't look like a crash, because afaik
the kernel never delivers SIGUSR1 on its own.
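
If you want to find out who is sending it, something like the following
might help. This is only a rough sketch, assuming a POSIX system with
SA_SIGINFO support; note_sender, watch_sigusr1 and last_sender are names
I made up for illustration, not anything in the postgres source. It
records the PID of whichever process last sent SIGUSR1.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* PID of whoever last sent us SIGUSR1; stays 0 until one arrives.
 * (Hypothetical name, for illustration only.) */
static volatile sig_atomic_t last_sender = 0;

/* SA_SIGINFO handler: do only async-signal-safe work, i.e. record si_pid.
 * si_pid is meaningful only when the signal came from kill() or sigqueue(). */
static void note_sender(int signo, siginfo_t *info, void *ctx)
{
    (void) signo;
    (void) ctx;
    if (info != NULL &&
        (info->si_code == SI_USER || info->si_code == SI_QUEUE))
        last_sender = (sig_atomic_t) info->si_pid;
}

/* Install the handler for SIGUSR1. Returns 0 on success, -1 on error. */
static int watch_sigusr1(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = note_sender;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}
```

Call watch_sigusr1() once at startup and log last_sender outside the
handler; you can then match that PID against ps output.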

And why are you using Solaris? *smack*

And why isn't postmaster either blocking these signals or shutting
down cleanly on receipt of them?
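
By "blocking" I mean roughly the following. Again just a sketch under
the assumption of plain POSIX sigprocmask(); block_sigusr1 and
sigusr1_pending are hypothetical helpers, and I'm not claiming this is
how postmaster does or should handle it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <unistd.h>

/* Block SIGUSR1 for the calling process. The previous mask is stored in
 * *saved so the caller can restore it later with SIG_SETMASK.
 * Returns 0 on success, -1 on error. */
static int block_sigusr1(sigset_t *saved)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    return sigprocmask(SIG_BLOCK, &set, saved);
}

/* Returns 1 if a SIGUSR1 arrived while blocked and is still pending,
 * 0 if none is pending, -1 on error. */
static int sigusr1_pending(void)
{
    sigset_t pend;

    if (sigpending(&pend) != 0)
        return -1;
    return sigismember(&pend, SIGUSR1);
}
```

While the signal is blocked it just sits pending instead of yanking the
process out of select(); the caller decides when to unblock it and deal
with it.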
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."