Thread: Identifying cause of "database system shutdown was interrupted" at failed startup

From: "Crispin Miller"

Hi,
    We recently encountered a serious database crash that resulted
in a significant loss of data.

    We took down the database server, and when we restarted the
backend we got the errors 'database system shutdown was interrupted'
and 'invalid checkpoint', along with missing xlog files (I've appended
the log to the end of this post).

    I've been trawling list-archives for a few days and this issue
has cropped up a number of times, but I've found it hard to identify a
single post - or set of posts - that might help explain the cause of
such a crash...

    Hopefully I'll be able to bring together the results of this
trawl through the archives in this post - but I'd really appreciate any
help or suggestions people have - we currently have a slightly uneasy
feeling because we've not quite got to the bottom of the issues, and it
would be nice to set our minds at rest! :-)

    So far I've identified two possible causes of the crash - I've
listed them below, and wonder whether people have any comments on them:

    1) We were running PostgreSQL version 7.3.6-1 (the version shipped
with Red Hat AS3: Red Hat EL AS3, kernel-smp-2.4.21-9.0.1EL).
    The following post suggests that this is a known issue in 7.3.3,
but that 7.3.4 is safe? I assume, therefore, that 7.3.6-1 is also safe:

http://archives.postgresql.org/pgsql-general/2003-09/msg01086.php

    2) We are running the database in conjunction with JBoss,
connecting to the database server from a different machine via JDBC. The
database was taken down *without* stopping JBoss first.

    Any thoughts would be much appreciated!

    Below are the relevant bits of the shutdown and startup logs,

    Best wishes,
    Crispin

    ----------------------
    shutdown log (/var/log/messages):
    May 28 15:43:35  shutdown: shutting down for system halt
    May 28 15:43:35  init: Switching to runlevel: 0
    May 28 15:43:36 server rhnsd[1694]: Exiting
    May 28 15:43:36 server rhnsd: rhnsd shutdown succeeded
    May 28 15:43:36 server atd: atd shutdown succeeded
    May 28 15:43:36 server cups: cupsd shutdown succeeded
    May 28 15:43:36 server xfs[1643]: terminating
    May 28 15:43:36 server xfs: xfs shutdown succeeded
    May 28 15:43:36 server mysqld: Stopping MySQL: succeeded
    May 28 15:43:36 server gpm: gpm shutdown succeeded
    May 28 15:43:37 server rhdb: Stopping PostgreSQL - Red Hat Edition service:
    May 28 15:43:37 server su(pam_unix)[12400]: session opened for user postgres by (uid=0)
    May 28 15:43:40 server su(pam_unix)[12400]: session closed for user postgres
    May 28 15:43:40 server rhdb: ^[[60G[
    May 28 15:43:40 server rhdb:
    May 28 15:43:40 server rc: Stopping rhdb: succeeded
    ...
    May 28 15:43:44 server kernel: Kernel logging (proc) stopped.
    May 28 15:43:44 server kernel: Kernel log daemon terminating.
    May 28 15:43:45 server syslog: klogd shutdown succeeded
    May 28 15:43:45 server exiting on signal 15
    May 28 16:13:35 server syslogd 1.4.1: restart.


    -----
    starting messages

    Jun  1 10:43:55 server postgres[5537]: [30] LOG:  database system shutdown was interrupted at 2004-05-28 16:32:08 BST
    Jun  1 10:43:55 server postgres[5537]: [31] LOG:  open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
    Jun  1 10:43:55 server postgres[5537]: [32] LOG:  invalid primary checkpoint record
    Jun  1 10:43:55 server postgres[5537]: [33] LOG:  open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
    Jun  1 10:43:55 server postgres[5537]: [34] LOG:  invalid secondary checkpoint record
    Jun  1 10:43:55 server postgres[5537]: [35] PANIC:  unable to locate a valid checkpoint record
    Jun  1 10:43:55 server postgres[5534]: [31] LOG:  startup process (pid 5537) was terminated by signal 6
    Jun  1 10:43:55 server postgres[5534]: [32] LOG:  aborting startup due to startup process failure
    Jun  1 10:43:56 server rhdb: Starting PostgreSQL - Red Hat Edition service:  failed
    Jun  1 10:44:00 server su(pam_unix)[5554]: session opened for user postgres by (uid=0)
    Jun  1 10:44:00 server su(pam_unix)[5554]: session closed for user postgres
    Jun  1 10:44:00 server postgres[5595]: [30] LOG:  database system shutdown was interrupted at 2004-05-28 16:32:08 BST
    Jun  1 10:44:00 server postgres[5595]: [31] LOG:  open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
    Jun  1 10:44:00 server postgres[5595]: [32] LOG:  invalid primary checkpoint record
    Jun  1 10:44:00 server postgres[5595]: [33] LOG:  open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
    Jun  1 10:44:00 server postgres[5595]: [34] LOG:  invalid secondary checkpoint record
    Jun  1 10:44:00 server postgres[5595]: [35] PANIC:  unable to locate a valid checkpoint record
    Jun  1 10:44:00 server postgres[5592]: [31] LOG:  startup process (pid 5595) was terminated by signal 6
    Jun  1 10:44:00 server postgres[5592]: [32] LOG:  aborting startup due to startup process failure
    Jun  1 10:44:01 server rhdb: Starting PostgreSQL - Red Hat Edition service:  failed
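
    For anyone comparing similar logs, here is a minimal Python sketch
that pulls out which xlog segment files the startup process failed to
open. It is only an illustration: the function name
missing_xlog_segments and the regular expression are my own
assumptions, written against the message format shown in the startup
messages above, and it is not a PostgreSQL tool.

```python
import re

# Matches the startup message
#   open of <path> (log file N, segment M) failed: <reason>
# \s+ also matches newlines, so hard-wrapped mail copies of the log still match.
OPEN_FAILED = re.compile(
    r"open of\s+(?P<path>\S+)\s+"
    r"\(log file (?P<log>\d+), segment (?P<seg>\d+)\)\s+"
    r"failed: (?P<reason>[^\n]+)"
)


def missing_xlog_segments(log_text):
    """Return unique (path, log_id, segment, reason) tuples in first-seen order."""
    found = []
    for m in OPEN_FAILED.finditer(log_text):
        entry = (
            m.group("path"),
            int(m.group("log")),
            int(m.group("seg")),
            m.group("reason").strip(),
        )
        if entry not in found:
            found.append(entry)
    return found


# Two of the messages from the first startup attempt, one hard-wrapped
# and one on a single line, to show that both forms are handled.
sample = (
    "Jun  1 10:43:55 server postgres[5537]: [31] LOG:  open of\n"
    "/var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0)\n"
    "failed: No such file or directory\n"
    "Jun  1 10:43:55 server postgres[5537]: [33] LOG:  open of "
    "/var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) "
    "failed: No such file or directory\n"
)

for path, log_id, segment, reason in missing_xlog_segments(sample):
    print(f"log file {log_id}, segment {segment}: {path} ({reason})")
```

    Run against the messages above, both startup attempts reduce to
the same single entry: log file 0, segment 0.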