Too strict check when starting from a basebackup taken off a standby - Mailing list pgsql-hackers

From Andres Freund
Subject Too strict check when starting from a basebackup taken off a standby
Date
Msg-id 20141211034501.GA8139@alap3.anarazel.de
Responses Re: Too strict check when starting from a basebackup taken off a standby  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Hi,

A customer recently reported getting "backup_label contains data
inconsistent with control file" after taking a basebackup from a standby
and starting it with a typo in primary_conninfo.

When starting postgres from a basebackup, StartupXLOG() has the following
code to deal with backup labels:

    if (haveBackupLabel)
    {
        ControlFile->backupStartPoint = checkPoint.redo;
        ControlFile->backupEndRequired = backupEndRequired;

        if (backupFromStandby)
        {
            if (dbstate_at_startup != DB_IN_ARCHIVE_RECOVERY)
                ereport(FATAL,
                        (errmsg("backup_label contains data inconsistent with control file"),
                         errhint("This means that the backup is corrupted and you will "
                                 "have to use another backup for recovery.")));
            ControlFile->backupEndPoint = ControlFile->minRecoveryPoint;
        }
    }

While I'm not enthusiastic about the error message, that bit of code
looks sane at first glance. We certainly expect the control file to
indicate we're in recovery. Since we're unlinking the backup label
shortly afterwards we'd normally not expect to hit that case after a
shutdown in recovery.

The problem is that after reading the backup label we also have to read
the corresponding checkpoint from pg_xlog. If primary_conninfo and/or
restore_command are misconfigured and can't restore files, that can only
be fixed by shutting down the cluster and fixing up recovery.conf. That
shutdown sets DB_SHUTDOWNED_IN_RECOVERY in the control file, while the
backup label is still in place, so the check above fails on the next
start.

The easiest solution seems to be to simply also allow that as a state in
the above check. It might be nicer to not allow a ShutdownXLOG to modify
the control file et al at that stage, but I think that'd end up being
more invasive.
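
As a rough illustration of that relaxation, here is a minimal standalone
sketch. It is not the actual patch: the DBState enum mirrors the values in
PostgreSQL's pg_control.h, but the two helper functions are hypothetical
names invented here to contrast the existing condition with the proposed
one. In StartupXLOG() the condition sits inline in front of the
ereport(FATAL, ...) call.

```c
#include <stdbool.h>

/* Mirrors the DBState values from PostgreSQL's pg_control.h. */
typedef enum DBState
{
    DB_STARTUP = 0,
    DB_SHUTDOWNED,
    DB_SHUTDOWNED_IN_RECOVERY,
    DB_SHUTDOWNING,
    DB_IN_CRASH_RECOVERY,
    DB_IN_ARCHIVE_RECOVERY,
    DB_IN_PRODUCTION
} DBState;

/*
 * Hypothetical helper: the existing condition. A backup taken from a
 * standby is accepted only if the control file says we were in archive
 * recovery.
 */
static bool
standby_backup_state_ok_strict(DBState dbstate_at_startup)
{
    return dbstate_at_startup == DB_IN_ARCHIVE_RECOVERY;
}

/*
 * Hypothetical helper: the relaxed condition. A clean shutdown during
 * recovery (e.g. to fix recovery.conf) is accepted as well.
 */
static bool
standby_backup_state_ok_relaxed(DBState dbstate_at_startup)
{
    return dbstate_at_startup == DB_IN_ARCHIVE_RECOVERY ||
           dbstate_at_startup == DB_SHUTDOWNED_IN_RECOVERY;
}
```

With the strict form, a cluster that was shut down to repair recovery.conf
is rejected on restart. The relaxed form accepts it while still rejecting
states such as DB_IN_PRODUCTION.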

A short search shows that that also looks like a credible explanation
for #12128...

I plan to relax that check unless somebody comes up with a different &
better plan.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


