Re: PostgreSQL 9.1 "database system was interrupted; last known up at" - Mailing list pgsql-admin

From Albe Laurenz
Subject Re: PostgreSQL 9.1 "database system was interrupted; last known up at"
Msg-id A737B7A37273E048B164557ADEF4A58B36615F34@ntex2010i.host.magwien.gv.at
In response to PostgreSQL 9.1 "database system was interrupted; last known up at"  ("Boylan, Ross" <Ross.Boylan@ucsf.edu>)
Responses Re: PostgreSQL 9.1 "database system was interrupted; last known up at"
List pgsql-admin
Ross Boylan wrote:
> I had to power cycle my system because it became unresponsive.  Now PostgreSQL will not start.  I
> would like advice about how to proceed; I think pg_resetxlog is my next step.  I have made a copy of
> the current database files.
> 
> <log file="postgresql-9.1-main.log">
> 2015-05-25 10:44:21 PDT LOG:  database system was interrupted; last known up at 2015-05-22 09:22:25 PDT
> 2015-05-25 10:44:21 PDT LOG:  incomplete startup packet
> 2015-05-25 10:44:21 PDT FATAL:  could not open file "/etc/ssl/certs/ssl-cert-snakeoil.pem": Permission denied
> 2015-05-25 10:44:21 PDT LOG:  startup process (PID 5180) exited with exit code 1
> 2015-05-25 10:44:21 PDT LOG:  aborting startup due to startup process failure
> </log>
> 
> I am running PostgreSQL 9.1 on Debian wheezy aka 7 aka oldstable.
> Installed via the Debian package.  I think I accepted the defaults, and have not changed the
> configuration since.
> Linux  3.2.0, stock Debian kernel, amd64.
> Connect via emacs sql-postgresql or psql.  An init script controls startup.
> 
> When the system became unresponsive I was able to ssh in; the X process had gone crazy and could not
> be killed.  Most key file systems had been remounted read-only, and many commands (including shutdown
> and telinit) produced errors, often I/O errors, when run.  The last kern.log entries showed a process
> being killed.  There was quite a lot of inode deletion and log replaying on restart.  The log message
> above was from just after the restart.
> 
> I have no backups*, but could recreate the database in the worst case.  I haven't done anything with
> the database in at least a week, I think, and so if I could get back the state as of 5/22 that would
> be fine.
> 
> Filesystem is ext3 on dm-crypt on LVM.
> 
> The permission error on the snakeoil cert is weird, since it is readable by all.  I'm guessing it's a
> side effect of the earlier problems.

You didn't say which exact version of PostgreSQL you are running, but I bet you it is 9.1.16
and you are hitting the "Fsync Permissions Bug" introduced with that release:
https://wiki.postgresql.org/wiki/May_2015_Fsync_Permissions_Bug

There are plans to release a fix for that shortly.
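A better workaround than the one specified in the current version of the Wiki page
might be to replace the symbolic links in the data directory (the ones that point at the
snakeoil certificate and key) with copies of those files, and to change the ownership of
the copies to "postgres".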
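
A minimal sketch of that workaround in Python, in case it helps. It rests on assumptions
that are not stated above: that the Debian package placed symbolic links named
"server.crt" and "server.key" in the data directory, and that the data directory is
/var/lib/postgresql/9.1/main. Check both on your system first. The function name is made
up for illustration. It would have to run as root, with the server stopped.

```python
import os
import shutil
from pathlib import Path


def replace_symlinks_with_copies(data_dir, names=("server.crt", "server.key"), owner=None):
    """Replace each named symlink in data_dir with a regular-file copy of its target.

    The default names are an assumption about the Debian packaging; adjust as needed.
    If owner is given, the copy is chowned to that user and group (requires root).
    Returns the list of paths that were replaced.
    """
    replaced = []
    for name in names:
        link = Path(data_dir) / name
        if not link.is_symlink():
            continue  # missing, or already a regular file: leave it alone
        target = link.resolve(strict=True)  # raises if the link is dangling
        tmp = link.with_name(name + ".tmp")
        shutil.copyfile(target, tmp)  # copy the contents of the link target
        os.chmod(tmp, 0o600)  # readable and writable by the owner only
        if owner is not None:
            shutil.chown(tmp, user=owner, group=owner)
        os.replace(tmp, link)  # rename over the symlink itself, not its target
        replaced.append(link)
    return replaced


# Hypothetical invocation (path and owner are assumptions, see above):
# replace_symlinks_with_copies("/var/lib/postgresql/9.1/main", owner="postgres")
```

The original files under /etc/ssl are left untouched; only the links inside the data
directory are swapped for copies that the "postgres" user owns.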

A restore wouldn't help you in this case, but since you have probably felt the pain,
please implement a backup strategy.  It may come in handy some day if recreating the database
is not a nice option.

Yours,
Laurenz Albe
