Re: production server down - Mailing list pgsql-hackers

From Joe Conway
Subject Re: production server down
Msg-id 41C4C2E2.7090401@joeconway.com
In response to Re: production server down  (Alvaro Herrera <alvherre@dcc.uchile.cl>)
Responses Re: production server down
List pgsql-hackers
Alvaro Herrera wrote:
> I can't help remembering the fact that the init script executes an
> initdb automatically if it finds an empty data directory (the ones I
> know of at least -- does the one you are running?).  Maybe what happened
> was that it found the empty mount point, executed an initdb, and then
> the NFS drive came online.  Later, the pg_control file was sync'ed to
> the "empty database" settings.  It'd be interesting to know if the
> mount point does have some files on it.

Good point! I'll take a look at the first opportunity.
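
If that is what happened, an init script could check the mount before it ever calls initdb. A minimal sketch of such a guard in Python, assuming /replica is the NFS mount point (the logs only show /replica/pgdata, so the mount point is a guess) and using a hypothetical helper name, safe_to_initdb:

```python
import os


def safe_to_initdb(data_dir, mount_point):
    """Return True only if mount_point is really mounted and data_dir
    is missing or empty. Hypothetical helper, not part of any init script."""
    # An unmounted NFS mount point looks like an ordinary empty directory,
    # so an "is the directory empty?" test alone is not enough.
    if not os.path.ismount(mount_point):
        return False
    if not os.path.isdir(data_dir):
        return True
    return len(os.listdir(data_dir)) == 0


if __name__ == "__main__":
    # Paths are assumptions based on the PANIC message in this thread.
    print(safe_to_initdb("/replica/pgdata", "/replica"))
```

The point of the design is only that "empty directory" and "mounted filesystem" are checked separately; a real init script would more likely do the same test in shell.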

> These values (from the corrupt pg_control file) are strange:
> 
>>pg_control last modified:             Tue Dec 14 15:39:26 2004
>>Time of latest checkpoint:            Tue Nov  2 17:05:32 2004
> 
> Maybe the latest checkpoint date has some interesting bit pattern that
> could explain it somehow.
> 

The last-modified time is from just before the PANIC. See the logs:

2004-12-14 15:39:26 LOG:  received smart shutdown request
2004-12-14 15:39:26 LOG:  shutting down
2004-12-14 15:39:28 PANIC:  could not open file "/replica/pgdata/pg_xlog/0000000000000000" (log file 0, segment 0): No such file or directory

The checkpoint time, Tue Nov  2 17:05:32 2004, seems to be related to the
*previous* restart. From /var/log/messages:

8<----------------------------------
...
Nov  2 17:04:20 csdfds1 syslogd 1.4.1: restart.
...
Nov  2 17:05:22 csdfds1 su: pam_unix2: session started for user postgres, service su
...
Nov  2 17:05:33 csdfds1 su: (to postgres) root on /dev/pts/5
Nov  2 17:05:33 csdfds1 su: pam_unix2: session started for user postgres, service su
Nov  2 17:05:33 csdfds1 su: pam_unix2: session finished for user postgres, service su
...
8<----------------------------------
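
To make the timing explicit, here is a small Python sketch that computes how far each syslog entry above is from the checkpoint time in pg_control. Syslog lines carry no year, so 2004 is assumed, and the function names are made up for illustration:

```python
from datetime import datetime


def parse_pg_control_time(text):
    """Parse a pg_control style timestamp such as 'Tue Nov  2 17:05:32 2004'."""
    # Collapse the padded day-of-month spacing before parsing.
    return datetime.strptime(" ".join(text.split()), "%a %b %d %H:%M:%S %Y")


def parse_syslog_time(text, year):
    """Parse a syslog style timestamp such as 'Nov  2 17:05:33'.
    The year is not in the log line and must be supplied (an assumption)."""
    normalized = " ".join(text.split())
    return datetime.strptime(f"{normalized} {year}", "%b %d %H:%M:%S %Y")


checkpoint = parse_pg_control_time("Tue Nov  2 17:05:32 2004")

# Timestamps copied from the /var/log/messages excerpt above.
events = [
    ("syslogd restart", "Nov  2 17:04:20"),
    ("first su session started", "Nov  2 17:05:22"),
    ("second su session", "Nov  2 17:05:33"),
]

for label, stamp in events:
    delta = (parse_syslog_time(stamp, 2004) - checkpoint).total_seconds()
    print(f"{label}: {delta:+.0f} s relative to latest checkpoint")
```

This only lines the two sources up against each other; it says nothing about which su invocation, if any, actually wrote the checkpoint.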

Can you make any sense out of that?

Joe

