Re: The danger of deleting backup_label - Mailing list pgsql-hackers

From David Steele
Subject Re: The danger of deleting backup_label
Date
Msg-id 0f948866-7caf-0759-d53c-93c3e266ec3f@pgmasters.net
Whole thread Raw
In response to Re: The danger of deleting backup_label  (David Steele <david@pgmasters.net>)
Responses Re: The danger of deleting backup_label
Re: The danger of deleting backup_label
List pgsql-hackers
On 10/14/23 11:30, David Steele wrote:
> On 10/12/23 10:19, David Steele wrote:
>> On 10/11/23 18:10, Thomas Munro wrote:
>>
>>> As Stephen mentioned[1], we could perhaps also complain if both backup
>>> label and control file exist, and then hint that the user should
>>> remove the *control file* (not the backup label!).  I had originally
>>> suggested we would just overwrite the control file, but by explicitly
>>> complaining about it we would also bring the matter to tool/script
>>> authors' attention, ie that they shouldn't be backing that file up, or
>>> should be removing it in a later step if they copy everything.  He
>>> also mentions that there doesn't seem to be anything stopping us from
>>> back-patching changes to the backup label contents if we go this way.
>>> I don't have a strong opinion on that and we could leave the question
>>> for later.
>>
>> I'm worried about the possibility of back patching this unless the 
>> solution comes out to be simpler than I think and that rarely comes to 
>> pass. Surely throwing errors on something that is currently valid 
>> (i.e. backup_label and pg_control both present).
>>
>> But perhaps there is a simpler, acceptable solution we could back 
>> patch (transparent to all parties except Postgres) and then a more 
>> advanced solution we could go forward with.
>>
>> I guess I had better get busy on this.
> 
> Attached is a very POC attempt at something along these lines that could 
> be back patched. I stopped when I realized excluding pg_control from the 
> backup is a necessity to make this work (else we still could end up with 
> a torn copy of pg_control even if there is a good copy elsewhere). To 
> enumerate the back patch issues as I see them:

Given that the above can't be back patched, I'm thinking we don't need 
backup_label at all going forward. We just write the values we need for 
recovery into pg_control and return *that* from pg_backup_stop() and 
tell the user to store it with their backup. We already have "These 
files are vital to the backup working and must be written byte for byte 
without modification, which may require opening the file in binary 
mode." in the documentation so dealing with pg_control should not be a 
problem. pg_control also has a CRC so we will know if it gets munged.

It doesn't really matter where/how they store pg_control as long as it 
is written back into PGDATA before the cluster starts. If 
backupEndRequired, etc., are set appropriately then recovery will do the 
right thing when it starts, just as now if PG crashes after it has 
renamed backup_label but before recovery to consistency has completed.

We can still enforce the presence of recovery.signal by checking 
backupEndRequired if that's something we want but it seems like 
backupEndRequired would be enough. I'm fine either way.

Another thing we can do here is make backup from standby easier. The 
user won't need to worry about *when* pg_control is copied. We can just 
write the ideal min recovery point into pg_control.

Any informational data currently in backup_label can be returned as 
columns (like the start/end lsn is now).

This makes the patch much less invasive and while it definitely, 
absolutely cannot be back patched, it seems like a good way forward.

This is the direction I'm planning to work on patch-wise but I'd like to 
hear people's thoughts.

Regards,
-David



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_dump needs SELECT privileges on irrelevant extension table
Next
From: Stephen Frost
Date:
Subject: Re: [PoC/RFC] Multiple passwords, interval expirations