Re: Remove Deprecated Exclusive Backup Mode - Mailing list pgsql-hackers

From David Steele
Subject Re: Remove Deprecated Exclusive Backup Mode
Date
Msg-id 8378ce12-6cb0-6fb2-cd0d-0c89232472a0@pgmasters.net
In response to Re: Remove Deprecated Exclusive Backup Mode  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Remove Deprecated Exclusive Backup Mode  (Robert Haas <robertmhaas@gmail.com>)
Re: Remove Deprecated Exclusive Backup Mode  (Fujii Masao <masao.fujii@gmail.com>)
Re: Remove Deprecated Exclusive Backup Mode  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On 2/26/19 6:51 AM, Michael Paquier wrote:
> On Mon, Feb 25, 2019 at 08:17:27PM +0200, David Steele wrote:
>> Here's the really obvious bad thing: if users do not update their procedures
>> and we ignore backup_label.pending on startup then they will end up with a
>> corrupt database because it will not replay from the correct checkpoint.  If
>> we error on the presence of backup_label.pending then we are right back to
>> where we started.
> 
> Not really.  If we error on backup_label.pending, we can make the
> difference between a backend which has crashed in the middle of an
> exclusive backup without replaying anything and a backend which is
> started based on a base backup, so an operator can take some action to
> see what's wrong with the server.  If you issue an error, users can
> also see that their custom backup script is wrong because they forgot
> to rename the flag after taking a backup of the data folder(s).

The operator still has a decision to make, manually, just as they do 
now.  The wrong decision may mean a corrupt database.

Here's the scenario:

1) They do a restore but forget to rename backup_label.pending.
2) Postgres won't start, which is the same action we take now.
3) The user is not sure what to do, rename or delete?  They delete, and 
the cluster is corrupted.

Worse, they may have scripted the deletion of backup_label so that the 
cluster will restart after a crash.  This is the recommendation from our 
documentation, after all.  If that script runs after a restore instead of 
a crash, then the cluster will be corrupt -- silently.
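
To make the scripted case concrete, here is a minimal sketch in Python of 
what such a restart wrapper might look like.  The function names, the 
temporary directory, and the placeholder file contents are all 
hypothetical, invented for illustration; nothing here is taken from a 
real tool.  The point is only that the script sees the same thing (a 
backup_label file in the data directory) whether the server crashed 
during an exclusive backup or the directory was just restored from one.

```python
import tempfile
from pathlib import Path


def naive_restart_cleanup(data_dir: Path) -> bool:
    """Hypothetical wrapper step: remove backup_label so the server starts.

    Returns True if a backup_label file was found and deleted.
    Note that the function has no input describing *why* the file is
    there, because the data directory does not record that.
    """
    label = data_dir / "backup_label"
    if label.exists():
        label.unlink()
        return True
    return False


def simulate(origin: str) -> bool:
    """Build a throwaway data directory containing backup_label and run the
    cleanup. `origin` is only a label for the reader; the cleanup never
    sees it."""
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp)
        # Placeholder contents; a real backup_label holds the checkpoint
        # information needed to start replay from the right place.
        (data_dir / "backup_label").write_text("placeholder\n")
        deleted = naive_restart_cleanup(data_dir)
        # After cleanup the file is gone in both scenarios.
        assert not (data_dir / "backup_label").exists()
        return deleted


outcomes = {
    origin: simulate(origin)
    for origin in ("crash during exclusive backup", "restore from backup")
}
for origin, deleted in outcomes.items():
    print(f"{origin}: backup_label deleted = {deleted}")
```

Both cases take the same path through the script.  After a crash that is 
the intended result; after a restore it throws away the information 
needed to replay from the correct checkpoint, and nothing reports an 
error.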

-- 
-David
david@pgmasters.net

