Hi,
I recently mentioned to Robert (and earlier to Heikki) that I think I see a
way to detect an omitted backup_label in a relevant subset of cases (it'd
apply to pg_control as well, if we moved to that). Robert encouraged me
to share the idea, even though it does not provide complete protection.
The subset I think we can address is the following:
a) An omitted backup_label would lead to corruption, i.e. without the
backup_label we won't start recovery at the right position. Obviously it'd
be better to also catch a wrong procedure when it'd not cause corruption -
perhaps my idea can be extended to handle that, with a small bit of
overhead.
b) The backup has been taken from a primary. Unfortunately that restriction
probably can't be lifted - but the vast majority of backups are taken from a
primary, so I think it's still a worthwhile protection.
Here's my approach:
1) We add an XLOG_BACKUP_START WAL record when starting a base backup on a
primary, emitted just *after* the checkpoint has completed.
2) When replaying a base backup start record, we create a state file that
includes the corresponding LSN in the filename.
3) On the primary, the state file for XLOG_BACKUP_START is *not* created at
that time. Instead the state file is created during pg_backup_stop().
4) When replaying an XLOG_BACKUP_END record, we verify that the state file
created by XLOG_BACKUP_START is present, and error out if not. Backups
that started before the redo LSN from backup_label are ignored
(necessitates remembering that LSN, but we've been discussing that anyway).
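
To make the replay side of 2) and 4) a bit more concrete, here is a rough toy
model in Python (not PostgreSQL code). Everything in it is an assumption made
for illustration: the names WalRecord, Replayer and state_file_name, the file
name format, and in particular that XLOG_BACKUP_END carries the LSN of the
matching XLOG_BACKUP_START record.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


class MissingBackupStateFile(Exception):
    """Raised when a backup end record has no matching state file."""


def state_file_name(start_lsn: int) -> str:
    # Hypothetical naming scheme; the proposal only says that the LSN of the
    # start record is part of the file name.
    return f"backup_state.{start_lsn:016X}"


@dataclass
class WalRecord:
    lsn: int                                 # position of the record in the WAL
    kind: str                                # "BACKUP_START", "BACKUP_END", ...
    backup_start_lsn: Optional[int] = None   # assumed payload of BACKUP_END


@dataclass
class Replayer:
    # Redo LSN remembered from backup_label; None if there was no backup_label.
    label_redo_lsn: Optional[int] = None
    # Stand-in for the state files present in the data directory.
    state_files: Set[str] = field(default_factory=set)

    def replay(self, rec: WalRecord) -> None:
        if rec.kind == "BACKUP_START":
            # Step 2: create a state file named after the start record's LSN.
            self.state_files.add(state_file_name(rec.lsn))
        elif rec.kind == "BACKUP_END":
            # Step 4: backups that started before backup_label's redo LSN
            # are ignored, since their start record is never replayed.
            if (self.label_redo_lsn is not None
                    and rec.backup_start_lsn < self.label_redo_lsn):
                return
            if state_file_name(rec.backup_start_lsn) not in self.state_files:
                raise MissingBackupStateFile(
                    f"no state file for backup started at {rec.backup_start_lsn}"
                )
```

A Replayer that sees the start record before the end record passes the check;
one that only sees the end record raises, unless the backup began before the
remembered redo LSN.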
Because the backup state file on the primary is only created during
pg_backup_stop(), a copy of the data directory taken between pg_backup_start()
and pg_backup_stop() does *not* contain the corresponding "backup state
file". Because of this, an omitted backup_label is detected if recovery does
not start early enough - recovery won't encounter the XLOG_BACKUP_START record
and thus would not create the state file, leading to an error in 4).
It is not a problem that the primary does not create the state file before
pg_backup_stop() - if the primary crashes before pg_backup_stop(), there is no
XLOG_BACKUP_END and thus no error will be raised. It's a bit odd that the
sequence differs between normal processing and recovery, but I think that's
nothing a good comment couldn't explain.
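
The last two paragraphs can be walked through with another small Python toy
model (again not PostgreSQL code). The Primary class, recover(), state_file(),
the integer LSNs and the ordering of "create state file, then emit
XLOG_BACKUP_END" inside pg_backup_stop() are all assumptions for illustration;
the proposal itself does not pin those details down.

```python
def state_file(start_lsn):
    # Hypothetical file name format that embeds the start record's LSN.
    return f"backup_state.{start_lsn:016X}"


class Primary:
    def __init__(self):
        self.wal = []            # (lsn, kind, backup_start_lsn) tuples
        self.data_dir = set()    # state files present on disk
        self.next_lsn = 1
        self.backup_start_lsn = None

    def _emit(self, kind, backup_start_lsn=None):
        lsn = self.next_lsn
        self.next_lsn += 1
        self.wal.append((lsn, kind, backup_start_lsn))
        return lsn

    def checkpoint(self):
        return self._emit("CHECKPOINT")

    def pg_backup_start(self):
        redo = self.checkpoint()
        # Step 1: start record right after the checkpoint; no state file yet.
        self.backup_start_lsn = self._emit("BACKUP_START")
        return redo              # redo LSN that backup_label would contain

    def pg_backup_stop(self):
        # Step 3: the primary creates the state file only now.
        self.data_dir.add(state_file(self.backup_start_lsn))
        self._emit("BACKUP_END", self.backup_start_lsn)


def recover(copied_files, wal, start_lsn, label_redo_lsn=None):
    """Replay wal from start_lsn; raise if a backup end has no state file."""
    files = set(copied_files)
    for lsn, kind, backup_start in wal:
        if lsn < start_lsn:
            continue
        if kind == "BACKUP_START":
            files.add(state_file(lsn))                      # step 2
        elif kind == "BACKUP_END":
            if label_redo_lsn is not None and backup_start < label_redo_lsn:
                continue                                    # older backup
            if state_file(backup_start) not in files:       # step 4
                raise RuntimeError("backup state file missing")
    return files


p = Primary()
redo = p.pg_backup_start()
later = p.checkpoint()        # checkpoint while the base backup is running
copy = set(p.data_dir)        # data directory copied mid-backup: no state file
p.pg_backup_stop()

# With backup_label, recovery starts at redo and replays the start record.
recover(copy, p.wal, redo, label_redo_lsn=redo)

# Without backup_label, recovery starts at the later checkpoint and errors out.
try:
    recover(copy, p.wal, later)
    detected = False
except RuntimeError:
    detected = True

# Crash recovery on the primary itself finds the file from pg_backup_stop().
recover(p.data_dir, p.wal, later)

# A primary that crashes before pg_backup_stop() never emits the end record.
crashed = Primary()
crashed_redo = crashed.pg_backup_start()
recover(crashed.data_dir, crashed.wal, crashed_redo)
```

Only the copy restored without backup_label trips the check; the other three
recoveries complete, which is the asymmetry between normal processing and
recovery described above.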
I haven't worked out the details, but I think we might be able to extend this to
catch errors even if there is no checkpoint during the base backup, by
emitting the WAL record *before* the RequestCheckpoint(), and creating the
corresponding state file during backup_label processing at the start of
recovery. That'd probably make the logic for when we can remove the backup
state files a bit more complicated, but I think we could deal with that.
Comments? Swear words?
Greetings,
Andres Freund