On 2023-07-16 6:27 p.m., Michael Paquier wrote:
>
> Deleting a backup_label from a fresh base backup can easily lead to data
> corruption, as the startup process would pick up the LSN to start
> recovery from the control file rather than from the backup_label file.
> This would happen if a checkpoint updates the redo LSN in the control
> file while a backup happens and the control file is copied after the
> checkpoint, for instance. If one wishes to deploy a new primary from
> a base backup, recovery.signal is the way to go, making sure that the
> new primary is bumped into a new timeline once recovery finishes, on
> top of making sure that the startup process starts recovery from a
> position where the cluster would be able to achieve a consistent
> state.

Thanks a lot for sharing this information.

>
> How would you rewrite that? I am not sure how many details we want to
> put here in terms of differences between recovery.signal and
> standby.signal, still we surely should mention these are the two
> possible choices.

Honestly, I am not convinced that backup_label needs to be mentioned
here as well. However, I can share how I tested the patch and what I
observed.

To assess the impact of the patch, I ran the following commands both
before and after applying it:

    pg_basebackup -h localhost -p 5432 -U david -D pg_backup1
    pg_ctl -D pg_backup1 -l /tmp/logfile start

Before the patch, the base backup started as an independent primary
server without any issues.
After applying the patch, however, I observed the following behavior
when starting from the base backup:

1) Simply starting the server from the base backup fails with:

    FATAL: could not find recovery.signal or standby.signal when recovering
    with backup_label
    HINT: If you are restoring from a backup, touch
    "/media/david/disk1/pg_backup1/recovery.signal" or
    "/media/david/disk1/pg_backup1/standby.signal" and add required recovery
    options.

2) After touching a recovery.signal file and starting the server again,
the following error is raised:

    FATAL: must specify restore_command when standby mode is not enabled

3) After touching a standby.signal file, the server starts successfully,
but it runs in standby mode, whereas the intent was for it to run as a
primary server.
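
To summarize the three cases above, here is a minimal shell sketch of the
outcome I would expect for a given data directory under the patched
behavior. It is only a model of what is reported in this mail, not the
server's actual startup code. The function name classify_backup_dir and
its optional second argument (the restore_command value that would be
configured) are made up for illustration. The one detail not tested above
is that standby.signal takes precedence when both signal files exist,
which follows the documented behavior.

```shell
#!/bin/sh
# Hypothetical helper: predicts which outcome a data directory would hit
# under the patched behavior described above. It only inspects files and
# never starts PostgreSQL.
classify_backup_dir() {
    datadir=$1
    restore_command=${2:-}   # value that would be configured, if any

    if [ ! -f "$datadir/backup_label" ]; then
        # The new check only concerns directories holding a backup_label.
        echo "no backup_label: check does not apply"
    elif [ -f "$datadir/standby.signal" ]; then
        # Case 3: standby mode wins, even if recovery.signal also exists.
        echo "standby.signal: starts in standby mode"
    elif [ -f "$datadir/recovery.signal" ]; then
        if [ -z "$restore_command" ]; then
            # Case 2: recovery.signal alone is not enough.
            echo "recovery.signal without restore_command: FATAL"
        else
            echo "recovery.signal with restore_command: archive recovery"
        fi
    else
        # Case 1: plain start from a fresh base backup.
        echo "backup_label without signal file: FATAL"
    fi
}
```

For example, classify_backup_dir pg_backup1 right after the pg_basebackup
command above would report the first FATAL case.
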
Best regards,
David