Thread: server stopped running abnormally

server stopped running abnormally

From
Phil Frost
Date:
Things were cruising along just fine as far as I can tell, and then all the
postgresql processes terminated. This is all I see in the logs:

LOG:  autovacuum: processing database "dew"
LOG:  autovacuum: processing database "postgres"
LOG:  autovacuum: processing database "template1"
LOG:  autovacuum: processing database "dew"
LOG:  autovacuum: processing database "postgres"
LOG:  autovacuum: processing database "template1"
LOG:  autovacuum: processing database "dew"
LOG:  received immediate shutdown request
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
WARNING:  terminating connection because of crash of another server process
[repeated many times to end of log]

Log output after restarting the server:

LOG:  database system was interrupted at 2006-07-27 11:20:08 EDT
LOG:  checkpoint record is at 4/91F3A2CC
LOG:  redo record is at 4/91F3A2CC; undo record is at 0/0; shutdown FALSE
LOG:  next transaction ID: 520454; next OID: 1718140
LOG:  next MultiXactId: 3; next MultiXactOffset: 6
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 4/91F3A310
LOG:  record with zero length at 4/91F3A338
LOG:  redo done at 4/91F3A310
LOG:  database system is ready
LOG:  transaction ID wrap limit is 1074258945, limited by database "postgres"
LOG:  autovacuum: processing database "postgres"
LOG:  autovacuum: processing database "dew"
LOG:  autovacuum: processing database "postgres"
LOG:  autovacuum: processing database "dew"

I have a nightly cron job that runs "vacuumdb --all --full --analyze
--quiet" which failed a few days ago with this:

vacuumdb: vacuuming of database "dew" failed: ERROR:  buffer 126 is not owned by resource owner TopTransaction
PANIC:  cannot abort transaction 478973, it was already committed
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

I didn't do anything about it, and the job ran successfully the next
night. Some system information:

xserve:~ pfrost$ pg_config
BINDIR = /usr/local/pgsql/bin
DOCDIR = /usr/local/pgsql/doc
INCLUDEDIR = /usr/local/pgsql/include
PKGINCLUDEDIR = /usr/local/pgsql/include
INCLUDEDIR-SERVER = /usr/local/pgsql/include/server
LIBDIR = /usr/local/pgsql/lib
PKGLIBDIR = /usr/local/pgsql/lib
LOCALEDIR =
MANDIR = /usr/local/pgsql/man
SHAREDIR = /usr/local/pgsql/share
SYSCONFDIR = /usr/local/pgsql/etc
PGXS = /usr/local/pgsql/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--with-python' '--with-openssl'
CC = gcc -no-cpp-precomp
CPPFLAGS =
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
CFLAGS_SL =
LDFLAGS =
LDFLAGS_SL =
LIBS = -lpgport -lssl -lcrypto -lz -lreadline -lresolv -ldl -lm
VERSION = PostgreSQL 8.1.4
xserve:~ pfrost$ uname -a
Darwin xserve.****************.*** 8.6.0 Darwin Kernel Version 8.6.0: Tue Mar  7 16:58:48 PST 2006; root:xnu-792.6.70.obj~1/RELEASE_PPC Power Macintosh powerpc
xserve:~ pfrost$

Not sure what can be done about this one; I don't see anything odd. I
simply restarted the server about 15 minutes ago and things seem to be
running normally.

Re: server stopped running abnormally

From
Tom Lane
Date:
Phil Frost <indigo@bitglue.com> writes:
> Things were cruising along just fine as far as I can tell, and then all the
> postgresql processes terminated. This is all I see in the logs:

> LOG:  received immediate shutdown request

Something sent the postmaster a SIGQUIT signal.  You need to look into
what might have done that.
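
As a rough illustration of the point above, the postmaster logs a different "received ... shutdown request" line for each shutdown mode, and each mode corresponds to one signal (SIGTERM for smart, SIGINT for fast, SIGQUIT for immediate). The sketch below scans log lines for such requests and reports the signal each one is consistent with. The function name find_shutdown_requests and the sample lines are hypothetical, made up for this example; it cannot tell which process sent the signal, only which signal the log entry implies.

```python
import re

# Shutdown modes logged by the postmaster and the signal that triggers each.
SHUTDOWN_SIGNALS = {
    "smart": "SIGTERM",
    "fast": "SIGINT",
    "immediate": "SIGQUIT",
}

# Matches lines such as: LOG:  received immediate shutdown request
SHUTDOWN_RE = re.compile(r"LOG:\s+received (smart|fast|immediate) shutdown request")


def find_shutdown_requests(log_lines):
    """Return (line_number, mode, signal_name) for each shutdown request found."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        match = SHUTDOWN_RE.search(line)
        if match:
            mode = match.group(1)
            hits.append((number, mode, SHUTDOWN_SIGNALS[mode]))
    return hits


# Hypothetical sample, modelled on the log excerpt in this thread.
sample = [
    'LOG:  autovacuum: processing database "dew"',
    "LOG:  received immediate shutdown request",
    "WARNING:  terminating connection because of crash of another server process",
]

for number, mode, signame in find_shutdown_requests(sample):
    print(f"line {number}: {mode} shutdown, consistent with {signame}")
```

Comparing the timestamp of any hit against cron jobs, system shutdown scripts, or administrative sessions active at that time is one way to narrow down the sender.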

> I have a nightly cron job that runs "vacuumdb --all --full --analyze
> --quiet" which failed a few days ago with this:

> vacuumdb: vacuuming of database "dew" failed: ERROR:  buffer 126 is not owned by resource owner TopTransaction
> PANIC:  cannot abort transaction 478973, it was already committed

The PANIC is a known problem with VACUUM FULL: it commits its
transaction before it's really done processing, and so if it gets an
error after that point it's up the creek without a paddle.  But I'm not
sure what could have caused the "not owned by" error in the first place.
Don't suppose you can reproduce the problem?

            regards, tom lane