Re: The danger of deleting backup_label - Mailing list pgsql-hackers

From David Steele
Subject Re: The danger of deleting backup_label
Date
Msg-id 17f049f5-2108-0902-7ce1-5253996e1215@pgmasters.net
Whole thread Raw
In response to Re: The danger of deleting backup_label  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: The danger of deleting backup_label
Re: The danger of deleting backup_label
List pgsql-hackers
On 10/18/23 08:39, Robert Haas wrote:
> On Tue, Oct 17, 2023 at 4:17 PM David Steele <david@pgmasters.net> wrote:
>> Given that the above can't be back patched, I'm thinking we don't need
>> backup_label at all going forward. We just write the values we need for
>> recovery into pg_control and return *that* from pg_backup_stop() and
>> tell the user to store it with their backup. We already have "These
>> files are vital to the backup working and must be written byte for byte
>> without modification, which may require opening the file in binary
>> mode." in the documentation so dealing with pg_control should not be a
>> problem. pg_control also has a CRC so we will know if it gets munged.
> 
> Yeah, I was thinking about this kind of idea, too. I think it might be
> a good idea, although I'm not completely certain about that, either.

<snip>

> First, anything that is stored in the backup_label but not the control
> file has to (a) move into the control file, 

I'd rather avoid this.

> (b) be stored someplace
> else, 

I don't think the additional fields *need* to be stored anywhere at all, 
at least not by us. We can provide them as output from pg_backup_stop() 
and the caller can do as they please. None of those fields are part of 
the restore process.

> or (c) be eliminated as a concept. 

I think we should keep them as above since I don't believe they hurt 
anything.

> We're likely to get
> complaints about (a), especially if the data in question is anything
> big. Any proposal to do (b) risks undermining the whole theory under
> which this is a good proposal, namely that removing backup_label gives
> us one less thing to worry about. So that brings us to (c).
> Personally, I would lose very little sleep if the LABEL field died and
> never came back, and I wouldn't miss START TIME and STOP TIME either,
> but somebody else might well feel differently. I don't think it's
> trivial to get rid of BACKUP METHOD, as there unfortunately seems to
> be code that depends on knowing the difference between BACKUP FROM:
> streamed and BACKUP FROM: pg_rewind. I suspect that BACKUP FROM:
> primary/standby might have the same issue, but I'm not sure. 

BACKUP FROM: primary/standby we can definitely handle, and BACKUP 
METHOD: pg_rewind just means backupEndRequired is false. That will need 
to be written by pg_rewind (instead of writing backup_label).

> STOP
> TIMELINE could be a problem too. I think that if somebody could do
> some rejiggering to eliminate some of the differences between the
> cases here, that could be really good general cleanup irrespective of
> what we decide about this proposal, and moving some things in to
> pg_control is probably reasonable too. For instance, it would seem
> crazy to me to argue that storing the backup end location in the
> control file is OK, but storing the backup end TLI there would not be
> OK. 

We have a place in pg_control for the end TLI (minRecoveryPointTLI). 
That's only valid for backup from standby since a primary backup can 
never change timelines.

> But the point here is that there's probably a good deal of careful
> thinking that would need to be done here about exactly where all of
> the stuff that currently exists in the backup_label file but not in
> pg_control needs to end up.

Agreed.

> Second, right now, the stuff that we return at the end of a backup is
> all text data. With this proposal, it becomes binary data. I entirely
> realize that people should only be doing these kinds of backups using
> automated tools that that those automated tools should be perfectly
> capable of handling binary data without garbling anything. But that's
> about as realistic as supposing that people won't instantly remove the
> backup_label file the moment it seems like it will solve some problem,
> even when the directions clearly state that this should only be done
> in some other situation that is not the one the user is facing. 

Well, we do specify that backup_label and tablespace_map should be saved 
byte for byte. But We've already seen users mess this up in the past and 
add \r characters that made the files unreadable.

If it can be done wrong, it will be done wrong by somebody.

> It
> just occurred to me that one thing we could do to improve the user
> experience here is offer some kind of command-line utility to assist
> with taking a low-level backup. This could be done even if we don't
> proceed with this proposal e.g.
> 
> pg_llbackup -d $CONNTR --backup-label=PATH --tablespace-map=PATH
> --copy-data-directory=SHELLCOMMAND
> 
> I don't know for sure how much that would help, but I wonder if it
> might actually help quite a bit, because right now people do things
> like use psql in a shell script to try to juggle a database connection
> and then in some other part of the shell script do the data copying.
> But it is quite easy to screw up the error handling or the psql
> session lifetime or something like that, and this would maybe provide
> a nicer interface. 

I think in most cases where this would be useful the user should just be 
using pg_basebackup. If the backup is trying to use snapshots, then 
backup_label needs to be stored outside the snapshot and we won't be 
able to easily help.

> There might be other problems, too. This is just what occurs to me off
> the top of my head. But I think it's an interesting angle to explore
> further.

There may definitely be other problems and I'm pretty sure there will 
be. My feeling is that they will be surmountable, but I won't know for 
sure until I finish the patch.

But I also feel it's a good idea to explore further. I'll work on the 
patch and should have something to share soon.

Regards,
-David



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Add support for AT LOCAL
Next
From: Noah Misch
Date:
Subject: Re: Add support for AT LOCAL