UTC4115FATAL: the database system is in recovery mode - Mailing list pgsql-bugs

From Mathew Samuel
Subject UTC4115FATAL: the database system is in recovery mode
Date
Msg-id FBB3A126CE548B48A5CD4C867A3325860369515545@sottexch7.corp.ad.entrust.com
List pgsql-bugs
Hi,

I see the following error as found in pg.log:
UTC4115FATAL:  the database system is in recovery mode

Actually that message was logged repeatedly for about 4 hours according to the logs (I don't have access to the system itself, just the logs).

Leading up to that error were the following in pg.log:
2011-03-28 10:44:06 UTC3609LOG:  checkpoints are occurring too frequently (11 seconds apart)
2011-03-28 10:44:06 UTC3609HINT:  Consider increasing the configuration parameter "checkpoint_segments".
2011-03-28 10:44:18 UTC3609LOG:  checkpoints are occurring too frequently (12 seconds apart)
2011-03-28 10:44:18 UTC3609HINT:  Consider increasing the configuration parameter "checkpoint_segments".
2011-03-28 10:44:28 UTC3609LOG:  checkpoints are occurring too frequently (10 seconds apart)
2011-03-28 10:44:28 UTC3609HINT:  Consider increasing the configuration parameter "checkpoint_segments".
2011-03-28 10:44:38 UTC3609LOG:  checkpoints are occurring too frequently (10 seconds apart)
2011-03-28 10:44:38 UTC3609HINT:  Consider increasing the configuration parameter "checkpoint_segments".
2011-03-28 10:44:42 UTC3932ERROR:  canceling statement due to statement timeout
2011-03-28 10:44:42 UTC3932STATEMENT:  vacuum full analyze _zamboni.sl_log_1
2011-03-28 10:44:42 UTC3932PANIC:  cannot abort transaction 1827110275, it was already committed
2011-03-28 10:44:42 UTC3566LOG:  server process (PID 3932) was terminated by signal 6
2011-03-28 10:44:42 UTC3566LOG:  terminating any other active server processes
2011-03-28 10:44:42 UTC13142WARNING:  terminating connection because of crash of another server process
2011-03-28 10:44:42 UTC13142DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2011-03-28 10:44:42 UTC13142HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2011-03-28 10:44:42 UTC29834WARNING:  terminating connection because of crash of another server process
2011-03-28 10:44:42 UTC29834DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2011-03-28 10:44:42 UTC29834HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2011-03-28 10:44:42 UTC3553WARNING:  terminating connection because of crash of another server process
2011-03-28 10:44:42 UTC3553DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
In fact, those last 3 lines repeat over and over until "UTC4115FATAL:  the database system is in recovery mode" appears, and that message is then logged for 4 hours. After those 4 hours, the system appears to recover.
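
For anyone reading the entries above: they look as if log_line_prefix is set to a timestamp with time zone followed directly by the process ID (something like '%t%p'; this is an assumption, since the actual setting is not shown). On that reading, "UTC4115FATAL" is time zone UTC, PID 4115, severity FATAL. A minimal Python sketch that splits such lines under that assumption (LINE_RE, parse_line and count_levels are hypothetical names, not part of PostgreSQL):

```python
import re
from collections import Counter

# Assumed layout: optional "YYYY-MM-DD HH:MM:SS ", then a time zone
# abbreviation, the backend PID and the severity, with no separators.
LINE_RE = re.compile(
    r"^(?:(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) )?"
    r"(?P<tz>[A-Z]+)(?P<pid>\d+)(?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    fields["pid"] = int(fields["pid"])
    return fields


def count_levels(lines):
    """Count how many parsed lines carry each severity (LOG, FATAL, ...)."""
    parsed = (parse_line(line) for line in lines)
    return Counter(p["level"] for p in parsed if p is not None)


if __name__ == "__main__":
    sample = [
        "2011-03-28 10:44:42 UTC3932PANIC:  cannot abort transaction 1827110275, it was already committed",
        "2011-03-28 10:44:42 UTC3566LOG:  server process (PID 3932) was terminated by signal 6",
        "UTC4115FATAL:  the database system is in recovery mode",
    ]
    for line in sample:
        print(parse_line(line))
    print(count_levels(sample))
```

Grouping by PID this way makes it easier to see that the PANIC came from the backend running the vacuum (PID 3932) and that PID 3566 reported the crash.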

The checkpoint settings in postgresql.conf are commented out, so I guess the defaults are being used:
#checkpoint_segments = 3                # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min              # range 30s-1h
#checkpoint_warning = 30s               # 0 is off

The system where this was seen was running pgsql-8.2.6 on RHEL4.
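
As a rough illustration of what the checkpoint warnings imply: if each of those checkpoints was triggered by filling checkpoint_segments WAL segments (an assumption, though the intervals are far below the 5min timeout), then 3 segments of 16MB every 10 to 12 seconds corresponds to roughly 4 to 5 MB of WAL per second. A small Python sketch of that arithmetic, with hypothetical names:

```python
SEGMENT_MB = 16          # segment size from the postgresql.conf comment
CHECKPOINT_SEGMENTS = 3  # commented-out default shown in the post


def wal_rate_mb_per_s(interval_s, segments=CHECKPOINT_SEGMENTS, segment_mb=SEGMENT_MB):
    """Approximate WAL volume per second if a checkpoint fires every
    interval_s seconds because `segments` segments were filled."""
    return segments * segment_mb / interval_s


if __name__ == "__main__":
    # Intervals reported by the "checkpoints are occurring too frequently" lines
    for interval in (10, 11, 12):
        rate = wal_rate_mb_per_s(interval)
        print(f"{interval} s apart: about {rate:.1f} MB/s of WAL")
```

This is only an estimate of write volume during the vacuum; it says nothing about the cause of the PANIC itself.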

I'm not sure whether this is a known bug (or a bug at all, rather than something I can address with a different configuration), but I thought I would post here first in case anyone is familiar with this issue and can suggest a possible solution. Any ideas?

Cheers,
Matt
