Re: recovery getting interrupted is not so unusual as it used to be - Mailing list pgsql-hackers

From Robert Haas
Subject Re: recovery getting interrupted is not so unusual as it used to be
Date
Msg-id AANLkTimB7BhZHLDnmohja7qnIg54NayGh897ilIY1kDn@mail.gmail.com
In response to Re: recovery getting interrupted is not so unusual as it used to be  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: recovery getting interrupted is not so unusual as it used to be  (Florian Pflug <fgp@phlo.org>)
List pgsql-hackers
On Wed, Jun 2, 2010 at 5:39 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 02/06/10 23:50, Robert Haas wrote:
>>
>> First, is it appropriate to set the control file state to
>> DB_SHUTDOWNED_IN_RECOVERY even when we're in crash recovery (as
>> opposed to archive recovery/SR)?  My vote is no, but Heikki thought it
>> might be OK.
>
> My logic on that is:
>
> If the database is known to be in good shape, i.e. not corrupt, after
> shutdown during crash recovery, then we should not print the warning at
> restart saying "This probably means that some data is corrupted". There's no
> reason to believe the database is corrupt if it's a controlled shutdown, so
> setting control file state to DB_SHUTDOWNED_IN_RECOVERY is OK. But if it's
> not OK for some reason, then we really shouldn't allow the shut down in the
> first place until we hit the end of WAL.
>
> So the option "allow shutdown, but warn at restart that your data is
> probably corrupt" does not make sense in any case.

Well, the point is, we emit that message every time we go to recover
from a crash.  Presumably the message is as valid after a restart of
crash recovery as it was the first time around.

<thinks>

But maybe the message isn't right the first time either.  After all,
the point of having a write-ahead log in the first place is that we
should be able to prevent corruption in the event of an unexpected
shutdown.  Maybe the right thing to do is to forget about adding a new
state and just remove or change the errhint in these messages:

    ereport(LOG,
            (errmsg("database system was interrupted while in recovery at %s",
                    str_time(ControlFile->time)),
             errhint("This probably means that some data is corrupted and"
                     " you will have to use the last backup for recovery.")));

    ereport(LOG,
            (errmsg("database system was interrupted while in recovery at log time %s",
                    str_time(ControlFile->checkPointCopy.time)),
             errhint("If this has occurred more than once some data might be corrupted"
                     " and you might need to choose an earlier recovery target.")));
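
To make the two options concrete, here is a rough standalone sketch of the choice being discussed. It is only an illustration, not code from xlog.c: the enum is a simplified stand-in for the control file states named in this thread, both helper functions and the keep_hint flag are hypothetical, and it assumes the first message above belongs to interrupted crash recovery and the second to interrupted archive recovery.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified stand-in for the control file states mentioned in this
 * thread. This is an assumption for illustration, not the definition
 * from the PostgreSQL headers.
 */
typedef enum
{
    DB_SHUTDOWNED,
    DB_SHUTDOWNED_IN_RECOVERY,
    DB_IN_CRASH_RECOVERY,
    DB_IN_ARCHIVE_RECOVERY,
    DB_IN_PRODUCTION
} DBState;

/* Hypothetical helper: did the previous run stop in the middle of recovery? */
static bool
recovery_was_interrupted(DBState state)
{
    return state == DB_IN_CRASH_RECOVERY || state == DB_IN_ARCHIVE_RECOVERY;
}

/*
 * Hypothetical helper: the hint to attach to the startup log message, or
 * NULL for no hint. keep_hint = false models the "just remove the errhint"
 * option; a controlled shutdown during recovery never gets a hint.
 */
static const char *
startup_hint(DBState state, bool keep_hint)
{
    if (!keep_hint || !recovery_was_interrupted(state))
        return NULL;

    if (state == DB_IN_CRASH_RECOVERY)
        return "This probably means that some data is corrupted and"
               " you will have to use the last backup for recovery.";

    return "If this has occurred more than once some data might be corrupted"
           " and you might need to choose an earlier recovery target.";
}
```

With keep_hint set to true this reproduces the current behaviour for an interrupted recovery, while DB_SHUTDOWNED_IN_RECOVERY stays silent. Setting keep_hint to false corresponds to dropping the hint without adding any new state.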

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

