On 7/1/20 5:44 PM, Magnus Hagander wrote:
> On Wed, Jul 1, 2020 at 11:08 PM David Steele <david@pgmasters.net
> <mailto:david@pgmasters.net>> wrote:
>
> But yeah, it would be possible to kill somebody else's session with
> some
> finagling. Still, worst case would be an error'd backup rather than a
> corrupt one.
>
> What about the case of:
> Session A - start backup
> Session B - stop backup (but A is still running of course)
> Session C - start backup
> Session A - stop backup
>
> At this point, session A can still stop the backup because there is one
> running -- but there has been time in between the two when no backup was
> running. That could lead to Session A getting a corrupt backup, I think
> -- unless we pass some unique identifier back in pg_stop_backup that
> matches it up. (And if we do pass that up, then session B running
> pg_stop_backup() would fail, thus leaving the backup started by A still
> running.)

This is fine because the min start LSN would have been advanced after B
stopped. When A tries to stop, the min start LSN will be later than its
start LSN, so it will error.
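
As a rough illustration of that check, here is a toy Python model (not
backend code). The class, its fields, and the integer LSNs are all
assumptions made up for this sketch; the only idea taken from the
discussion is that stopping a backup advances a shared min start LSN, and
that a later stop with an older start LSN is rejected.

```python
class BackupError(Exception):
    """Raised when a stop request cannot be honored."""


class BackupState:
    """Hypothetical shared state; LSNs are modeled as plain integers."""

    def __init__(self):
        self.current_lsn = 0    # stand-in for the WAL insert position
        self.min_start_lsn = 0  # backups started before this can no longer stop
        self.running = 0        # number of backups believed to be in progress

    def start_backup(self):
        self.current_lsn += 1   # pretend some WAL was written
        self.running += 1
        return self.current_lsn

    def stop_backup(self, start_lsn):
        if self.running == 0:
            raise BackupError("no backup in progress")
        if start_lsn < self.min_start_lsn:
            raise BackupError("backup was stopped by another session")
        self.running -= 1
        self.current_lsn += 1
        # Advance the floor so that older start LSNs are rejected from now on.
        self.min_start_lsn = self.current_lsn
        return self.current_lsn


def replay_scenario():
    """Replay the A/B/C sequence from the quoted text."""
    state = BackupState()
    a_start = state.start_backup()   # Session A - start backup
    state.stop_backup(a_start)       # Session B - stop backup (A's, by finagling)
    c_start = state.start_backup()   # Session C - start backup
    try:
        state.stop_backup(a_start)   # Session A - stop backup
        a_result = "ok"
    except BackupError:
        a_result = "error"
    state.stop_backup(c_start)       # C is unaffected and stops normally
    return a_result, state.running
```

In this model A gets an error instead of silently stopping C's backup,
and C can still stop its own backup afterwards.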

It might be easier/better to just keep the one exclusive slot in shared
memory and store the backup label in it. We only allow one exclusive
backup now so it wouldn't be a loss in functionality.
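
A minimal sketch of what a single slot could look like, again as a toy
Python model rather than actual shared-memory code. The class and method
names are hypothetical; the sketch only shows one slot that holds the
backup label contents and refuses a second exclusive backup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExclusiveBackupSlot:
    """Hypothetical single slot; label is None while no backup is running."""

    label: Optional[str] = None

    def start(self, label_contents: str) -> None:
        if self.label is not None:
            raise RuntimeError("an exclusive backup is already in progress")
        # Keep the label in the slot instead of writing it to the data directory.
        self.label = label_contents

    def stop(self) -> str:
        if self.label is None:
            raise RuntimeError("no exclusive backup is in progress")
        label, self.label = self.label, None
        return label  # handed back to the caller that stops the backup
```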

None of this really solves the problem of what happens when the user
dumps the backup_label into the data directory. With traditional backup
software that's pretty much going to be the only choice. Is telling them
not to do it and washing our hands of it really enough?

In particular, I'm worried about the logic in postmaster.c that would be
removed if we no longer save the backup_label explicitly during an
exclusive backup. If backup_label is no longer removed on a clean
shutdown it seems we'll just make the situation worse.

Regards,
--
-David
david@pgmasters.net