Postgres server crash - Mailing list pgsql-performance

From Craig A. James
Subject Postgres server crash
Msg-id 455BCAE8.8010905@modgraph-usa.com
In response to Re: Best COPY Performance  ("Craig A. James" <cjames@modgraph-usa.com>)
List pgsql-performance
For the third time today, our server has crashed, or frozen, actually something in between.  Normally there are about
30-50 connections because of mod_perl processes that keep connections open.  After the crash, there are three processes
remaining:

# ps -ef | grep postgres
postgres 23832     1  0 Nov11 pts/1    00:02:53 /usr/local/pgsql/bin/postmaster -D /postgres/main
postgres  1200 23832 20 14:28 pts/1    00:58:14 postgres: pubchem pubchem 66.226.76.106(58882) SELECT
postgres  4190 23832 25 14:33 pts/1    01:09:12 postgres: asinex asinex 66.226.76.106(56298) SELECT

But they're not doing anything: No CPU time consumed, no I/O going on, no progress.  If I try to connect with psql(1),
it says:

   psql: FATAL:  the database system is in recovery mode

And the server log has:

LOG:  background writer process (PID 23874) was terminated by signal 9
LOG:  terminating any other active server processes
LOG:  statistics collector process (PID 23875) was terminated by signal 9
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited ab
... repeats about 50 times, one per process.
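
To see at a glance which processes were killed and by which signal, the log excerpt above can be scanned with a short script. This is only a sketch: the function name find_signal_kills and the regular expression are my own assumptions, written against the message format shown in the log lines above, and are not something PostgreSQL itself provides.

```python
import re

# Matches log lines of the form shown above, for example:
#   LOG:  background writer process (PID 23874) was terminated by signal 9
# The pattern is an assumption based on that excerpt only.
TERMINATED = re.compile(
    r"LOG:\s+(?P<name>.+?) \(PID (?P<pid>\d+)\) was terminated by signal (?P<sig>\d+)"
)


def find_signal_kills(log_text):
    """Return a list of (process name, pid, signal number) tuples,
    one per log line that reports termination by a signal."""
    kills = []
    for line in log_text.splitlines():
        match = TERMINATED.search(line)
        if match:
            kills.append(
                (match.group("name"), int(match.group("pid")), int(match.group("sig")))
            )
    return kills


# Sample input copied from the server log excerpt above.
sample = """\
LOG:  background writer process (PID 23874) was terminated by signal 9
LOG:  terminating any other active server processes
LOG:  statistics collector process (PID 23875) was terminated by signal 9
"""

for name, pid, sig in find_signal_kills(sample):
    print(f"{name} (PID {pid}): signal {sig}")
```

On the excerpt above this reports the background writer and the statistics collector, both terminated by signal 9; the roughly 50 WARNING/DETAIL entries are the postmaster's reaction and are deliberately not matched.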

Questions:
  1. Any idea what happened and how I can avoid this?  It's a *big* problem.
  2. Why didn't the database recover?  Why are there two processes
     that couldn't be killed?
  3. Where did the "signal 9" come from?  (Nobody but me ever logs
     in to the server machine.)

Help!

Thanks,
Craig

