Re: backup_label during crash recovery: do we know how to solve it? - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: backup_label during crash recovery: do we know how to solve it?
Date
Msg-id 4F005D18.6080204@enterprisedb.com
In response to Re: backup_label during crash recovery: do we know how to solve it?  (Daniel Farina <daniel@heroku.com>)
Responses Re: backup_label during crash recovery: do we know how to solve it?  (Magnus Hagander <magnus@hagander.net>)
Re: backup_label during crash recovery: do we know how to solve it?  (Daniel Farina <daniel@heroku.com>)
List pgsql-hackers
On 30.12.2011 02:40, Daniel Farina wrote:
> How about this revised protocol (names and adjustments welcome), to
> enable a less-terrible approach?  Not only is that workaround
> incorrect (it has a small window where the system will not be able to
> restart), but it's pretty inconvenient.
>
> New concepts:
>
> pg_prepare_backup: readies postgres for backing up.  Saves the
> backup_label content in volatile memory.  The next start_backup will
> write that volatile information to disk, and the information within
> can be used to compute a "backup-key"
>
> "backup-key": a subset of the backup label, all it needs (as far as I
> know) might be the database-id and then the WAL position (timeline,
> seg, offset) the backup is starting at.
>
> Protocol:
>
> 1. select pg_prepare_backup();
> (Backup process remembers that backup-key is in progress, say, by
> writing it to /backup-keys/%k)
> 2. select pg_start_backup();
> (perform copying)
> 3. select pg_stop_backup();
> 4. backup process can optionally clear its state remembering the
> backup-key (rm /backup-keys/%k)
>
> A crash at each point would be resolved this way:
>
> Before step 1: Nothing has happened, so normal crash recovery.
>
> Before step 2: (same, as it doesn't involve a state transition in postgres)
>
> Before step 3: when the crash occurs and postgres starts up, postgres
> asks the external software if a backup was in progress, say via a
> "backup-in-progress command".  It is responsible for looking at
> /backup-keys/%k and saying "yes, it was". The database can then do
> normal crash recovery.  The backup can even be continuing through this
> time, I think.
>
> Before step 4: The archiver may leak the backup-key.  Because
> backup-keys using the information I defined earlier have an ordering,
> it should be possible to reap these if necessary at intervals.
>
> Fundamentally, the way this approach gets around the 'physical copy'
> conundrum is asking the archiver software to remember something well
> out of the way of the database directory on the system that is being
> backed up.

That's awfully complicated. If we're going to require co-operation from
the backup/archiving software, we might as well just change the
procedure so that backup_label is not stored in the data directory, but
returned by pg_start/stop_backup() (or by new versions of them, to
retain backwards compatibility), and the caller is responsible for
placing it in the backed up copy of the data directory. That would be a
lot simpler.
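
As a rough illustration of the simpler procedure suggested above, here is a minimal Python sketch. It makes several assumptions: start_backup_returning_label() is a hypothetical stand-in for a pg_start/stop_backup() variant that returns the label text, the returned text is not the real backup_label format, and the "data directory" is just a temporary folder, so no server is involved.

```python
import shutil
import tempfile
from pathlib import Path


def start_backup_returning_label(label: str) -> str:
    """Hypothetical stand-in for a pg_start/stop_backup() variant that
    returns the backup_label content instead of writing it to disk.
    The text below is illustrative, not the real file format."""
    return f"LABEL: {label}\n"


def take_backup(data_dir: Path, backup_dir: Path, label: str) -> None:
    """Copy data_dir to backup_dir and put backup_label only in the copy."""
    label_text = start_backup_returning_label(label)
    # The live data directory never contains backup_label, so a crash
    # while copying would be followed by ordinary crash recovery.
    shutil.copytree(data_dir, backup_dir)
    # The caller, not the server, places backup_label in the backed up copy.
    (backup_dir / "backup_label").write_text(label_text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp) / "data"
        data_dir.mkdir()
        (data_dir / "PG_VERSION").write_text("placeholder\n")
        backup_dir = Path(tmp) / "backup"
        take_backup(data_dir, backup_dir, "nightly")
        print((backup_dir / "backup_label").exists())
        print((data_dir / "backup_label").exists())
```

The point of the sketch is only where the file ends up: the copy carries backup_label, the running cluster's directory does not.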

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

