Re: Remove Deprecated Exclusive Backup Mode - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Remove Deprecated Exclusive Backup Mode
Msg-id CAHGQGwHLuG0dc3TRkzJdVhJKUkR9=K3UMBBHvdNtvht=P+z7Eg@mail.gmail.com
In response to Re: Remove Deprecated Exclusive Backup Mode  (David Steele <david@pgmasters.net>)
Responses Re: Remove Deprecated Exclusive Backup Mode
Re: Remove Deprecated Exclusive Backup Mode
List pgsql-hackers
On Tue, Feb 26, 2019 at 3:17 AM David Steele <david@pgmasters.net> wrote:
>
> On 2/25/19 7:50 PM, Fujii Masao wrote:
> > On Mon, Feb 25, 2019 at 10:49 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> >>
> >> I'm not playing devil's advocate here to annoy you.  I see the problems
> >> with the exclusive backup, and I see how it can hurt people.
> >> I just think that removing exclusive backup without some kind of help
> >> like Andres sketched above will make people unhappy.
> >
> > +1
> >
> > Another idea is to improve an exclusive backup method so that it will never
> > cause such issue. What about changing an exclusive backup mode of
> > pg_start_backup() so that it creates something like backup_label.pending file
> > instead of backup_label? Then if the database cluster has backup_label.pending
> > file but not recovery.signal (this is the case where the database is recovered
> > just after the server crashes while an exclusive backup is in progress),
> > in this idea, the recovery using that database cluster always ignores
> > (or removes) backup_label.pending file and start replaying WAL from
> > the REDO location that pg_control file indicates. So this idea enables us to
> > work around the issue that an exclusive backup could cause.
>
> It's an interesting idea.
>
> > On the other hand, the downside of this idea is that the users need to change
> > the recovery procedure. When they want to do PITR using the backup having
> > backup_label.pending, they need to not only create recovery.signal but also
> > rename backup_label.pending to backup_label. Rename of backup_label file
> > is brand-new step for their recovery procedure, and changing the recovery
> > procedure might be painful for some users. But IMO it's less painful than
> > removing an exclusive backup API at all.
>
> Well, given that we have invalidated all prior recovery procedures in
> PG12 I'm not sure how big a deal that is.  Of course, it's too late make
> a change like this for PG12.
>
> > Thought?
>
> Here's the really obvious bad thing: if users do not update their
> procedures and we ignore backup_label.pending on startup then they will
> end up with a corrupt database because it will not replay from the
> correct checkpoint.  If we error on the presence of backup_label.pending
> then we are right back to where we started.

No. In this case, since both backup_label.pending and recovery.signal exist,
the server stops the recovery with a PANIC error before corrupting the
database, as I described in my previous post. The operator can then
rename backup_label.pending to backup_label and restart the recovery
safely.

So, let me clarify the situations:

(1) If backup_label and recovery.signal exist, the recovery starts safely.
       This is the normal case of recovery from the base backup.

(2) If backup_label.pending and recovery.signal exist, as described above,
       a PANIC error occurs at the start of recovery. This case can happen
       if the operator forgets to rename backup_label.pending, i.e., an
       operator mistake. So, after the PANIC, the operator needs to fix the
       mistake (i.e., rename backup_label.pending to backup_label) and
       restart the recovery.

(3) If backup_label.pending exists but recovery.signal doesn't, the server
       ignores (or removes) backup_label.pending and does the recovery
       starting from pg_control's REDO location. This case can happen if
       the server crashes while an exclusive backup is in progress.
       So, in this idea, a crash in the middle of a backup doesn't prevent
       the recovery from starting.
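
To make the three cases above concrete, here is a rough sketch of the
startup decision in Python. It is only an illustration of the idea, not
PostgreSQL code: the function name decide_startup_action, the
StartupAction enum, and the NOT_COVERED fallback are hypothetical names
made up for this example, and any file combination that the three cases
do not describe is deliberately left undecided.

```python
from enum import Enum
from pathlib import Path


class StartupAction(Enum):
    """Outcomes of the proposed startup check (names are hypothetical)."""
    RECOVER_FROM_BACKUP_LABEL = "case 1: normal recovery from a base backup"
    PANIC_RENAME_REQUIRED = "case 2: operator forgot to rename backup_label.pending"
    IGNORE_PENDING_USE_PG_CONTROL = "case 3: crash during an exclusive backup"
    NOT_COVERED = "combination not described by the three cases"


def decide_startup_action(data_dir: Path) -> StartupAction:
    """Classify a data directory according to cases (1)-(3).

    Only the presence of the three files is inspected; their contents are
    never read. This is an assumption-laden sketch of the proposal.
    """
    has_label = (data_dir / "backup_label").exists()
    has_pending = (data_dir / "backup_label.pending").exists()
    has_signal = (data_dir / "recovery.signal").exists()

    # Both label files present: the proposal does not say what to do.
    if has_label and has_pending:
        return StartupAction.NOT_COVERED

    # (1) backup_label + recovery.signal: recovery starts safely.
    if has_label and has_signal:
        return StartupAction.RECOVER_FROM_BACKUP_LABEL

    # (2) backup_label.pending + recovery.signal: stop with PANIC so the
    #     operator can rename the file and restart the recovery.
    if has_pending and has_signal:
        return StartupAction.PANIC_RENAME_REQUIRED

    # (3) backup_label.pending without recovery.signal: ignore (or remove)
    #     the pending file and replay from pg_control's REDO location.
    if has_pending:
        return StartupAction.IGNORE_PENDING_USE_PG_CONTROL

    # Anything else (e.g. neither label file) is outside this proposal.
    return StartupAction.NOT_COVERED
```

For example, a directory containing backup_label.pending and
recovery.signal is classified as case (2); once the operator renames
backup_label.pending to backup_label, the same directory is classified
as case (1).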

Regards,

-- 
Fujii Masao

