Re: Backend crashes in 7.0.3 - Mailing list pgsql-bugs

From Dirk Lutzebaeck
Subject Re: Backend crashes in 7.0.3
Date
Msg-id 14915.38813.383570.86812@ampato.core.aeccom.com
In response to Re: Backend crashes in 7.0.3  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Backend crashes in 7.0.3  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Tom Lane writes:
 > Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
 > > I observe occasional crashes on 7.0.3 under medium load:
 >
 > > Backend message type 0x49 arrived while idle
 > > Backend message type 0x44 arrived while idle
 > > Backend message type 0x54 arrived while idle
 >
 > > I recently upgraded from 7.0.2 to 7.0.3 on RH6.0, Linux 2.2.10, and I
 > > haven't observed these messages before. I compiled the source
 > > myself (egcs 2.91.66).
 >
 > You can, but in the long run it'd be more useful to figure out what's
 > going wrong.  The above is not much info --- what are you doing when
 > this happens, and what if anything appears in the postmaster log?
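
For reference, the three type bytes in the messages above are printable
ASCII characters. The sketch below decodes them and attaches the message
names from the version 2.0 frontend/backend protocol; it assumes that
7.0.3 speaks protocol 2.0, and the helper name describe() is made up for
illustration.

```python
# Message names as listed for the 2.0 frontend/backend protocol.
# Only the three bytes reported in this thread are included.
PROTOCOL_2_NAMES = {
    "I": "EmptyQueryResponse",
    "D": "AsciiRow",
    "T": "RowDescription",
}


def describe(code: int) -> str:
    """Render a message-type byte as hex, ASCII character, and name."""
    ch = chr(code)
    name = PROTOCOL_2_NAMES.get(ch, "unknown")
    return f"0x{code:02x} = {ch!r} ({name})"


if __name__ == "__main__":
    for code in (0x49, 0x44, 0x54):
        print(describe(code))
```

All three are ordinary query-result messages, so the client appears to be
receiving result data at a point where it was not expecting any.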


It may be that some kernel memory corruption is showing up here. I'm
using kernel NFS on Linux 2.2.10 with a Solaris 8 i86pc client, and I
saw some weird NFS error messages on the Linux system that are related
to the Solaris client. I suspect the kernel NFS daemon is corrupting the
memory areas where the postgres shared memory resides. Could this be
possible? I'm currently trying to dig further into the problem.

The strange thing is that stopping and restarting the postmaster does
not help; the crashes occur again. When I kill the children, some of
them stay alive. Sending those a second SIGTERM leaves them in a
constantly running state (R), and strace -p on such a child shows no
output. At that point I can only kill the child with SIGKILL.

I haven't started the postmaster with debugging enabled yet. I have now
shut off the Solaris client and restarted the machine. So far it looks
fine.
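
The manual steps above (SIGTERM, check the process state, fall back to
SIGKILL) could be scripted roughly as follows. This is only a sketch under
assumptions: it reads the state letter from /proc/<pid>/stat, so it is
Linux-specific, and the function names and the 5-second grace period are
arbitrary choices for illustration, not anything postgres provides.

```python
import os
import signal
import time


def parse_state(stat_line: str) -> str:
    """Extract the state letter (R, S, D, Z, ...) from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the LAST ')'.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def proc_state(pid: int):
    """Return the state letter of pid, or None if the process is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            return parse_state(f.read())
    except (FileNotFoundError, ProcessLookupError):
        return None


def terminate(pid: int, grace: float = 5.0, poll: float = 0.1) -> str:
    """Send SIGTERM, wait up to `grace` seconds, then fall back to SIGKILL.

    Returns the name of the last signal that was sent.
    """
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return "SIGTERM"  # already gone
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        # None: process vanished; 'Z': exited but not yet reaped.
        if proc_state(pid) in (None, "Z"):
            return "SIGTERM"
        time.sleep(poll)
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # exited just after the grace period ended
    return "SIGKILL"
```

A child that stays in state R after SIGTERM, as described above, would
run out the grace period and receive SIGKILL.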

Dirk
