Thread: PITR Backup state validity checking

PITR Backup state validity checking

From: Simon Riggs

Problem Summary (from previous posts)
Archive recovery must stop AFTER the end of the backup that the
recovery used as its "starting point". If it stops earlier, incorrect
database states are likely.

In general, this is a small window for error, and procedures should
exist to return to the prior backup. Nonetheless, this check should be
made, i.e. stop time/point > backup end time.

Solution Design:
Before a backup is taken, write a file to data directory that identifies
which backup this is. When the backup is taken this file will be copied
with the backup, and later restored when the backup is restored.
When backup completes write a file to xlog directory that contains the
start backup identifier and the end time. When recovery occurs the
backup identifier can be used to find the end backup file and read this
to find the end backup time.

Additional aspects:
- Can't assume that the archive allows direct access, so anything
written to the log must be read in the same sequential order in which
it was written.
- Backup may be taken when postmaster is down, so solution must not rely
on postmaster being up.
- It *is* possible to do incremental backups, as long as the backup
checks each file's change date against files already archived (or on
write-once media). The previously backed-up files can thus be
considered part of *this* backup as well as the backup in which they
were originally copied. (So we still write start now and end shortly
afterwards.)
- We want to offer the user an interface now, so that later changes
will not require them to change their procedures again.

Implementation Options:
------------------------
User Interface
Two user interfaces have been suggested:
- Write a server function which can be called from anywhere...
- Write an external program

An external program will still work when the postmaster is down, so it
is the option described in further detail here...

- Call Sequence
It has been suggested that there should be an "API call" issued before
and after the backup. That requires the user to perform 3 steps in
sequence to get a correct backup: the start call, the backup itself,
and the end call.
- For full backups taken all at once, a single call is desirable,
ensuring that no API call was missed.

Implementation Design
----------------------
Implement an external program, called pg_backup. (I guess there's some
historical baggage there, but it may be time to leave that behind now...)

pg_backup will do:
1. If postmaster is up, issue a manual CHECKPOINT
2. Write a file called backup_start_<backupid>.info
where <backupid> is the time when backup starts
contains: systemid, time(now)
3. Remove all previous backup_start*.info files
4. Issue the user's backup_command via system(3)
5. Write a second file called backup_end_<backupid> to pg_xlog AND write
a backup_end_<backupid>.ready to archive_status. 
backup_end_<backupid> contains: systemid, time of backup end
backup_end_<backupid>.ready is empty
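
The sequence above can be sketched roughly as follows. This is only an
illustration, not the proposed implementation: the function name
run_backup, the system_id placeholder, and the one-field-per-line file
layout are all assumptions, step 1 (the CHECKPOINT) is left out, and
step 3 is performed before step 2 so that the freshly written start
file is not removed along with the old ones.

```python
import glob
import os
import subprocess
import time


def run_backup(data_dir, backup_command, system_id="unknown-systemid"):
    """Sketch of steps 2-5; step 1 (manual CHECKPOINT) is omitted."""
    # The backup id is the time at which the backup starts.
    backup_id = time.strftime("%Y%m%d%H%M%S")
    start_path = os.path.join(data_dir, f"backup_start_{backup_id}.info")

    # Step 3, done first: remove start files left by earlier backups.
    for old in glob.glob(os.path.join(data_dir, "backup_start*.info")):
        os.remove(old)

    # Step 2: record which backup this is (file layout is an assumption).
    with open(start_path, "w") as f:
        f.write(f"{system_id}\n{time.time()}\n")

    # Step 4: run the user's backup command; raise if it fails.
    subprocess.run(backup_command, check=True)

    # Step 5: write the end file to pg_xlog plus an empty .ready marker.
    xlog_dir = os.path.join(data_dir, "pg_xlog")
    status_dir = os.path.join(xlog_dir, "archive_status")
    os.makedirs(status_dir, exist_ok=True)
    end_name = f"backup_end_{backup_id}"
    with open(os.path.join(xlog_dir, end_name), "w") as f:
        f.write(f"{system_id}\n{time.time()}\n")
    open(os.path.join(status_dir, end_name + ".ready"), "w").close()
    return backup_id
```

Because the end file is only written after the backup command returns
successfully, a failed or interrupted backup leaves no backup_end file
behind, which is what lets recovery detect an incomplete backup.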

Other changes:
- Alter archiver to always archive backup_end* files first, so they are
written to archive in time sequence order.
- Alter recovery so that it requests backup_end_<backupid> first. We
then read the time in this file and compare it with our target end
time, if there is one. If there is one and it fails the > test (i.e.
the target is not later than the backup end time), we stop. If no
target time exists, we roll forward, though we can still fail the test
at our selected stopping point (an Xid).
- If recovery ends at an Xid, but when this is reached we are still
earlier than the backup end time, then we alter our target to being the
backup end time (inclusive) and continue to roll forward. A WARNING is
issued.
- If recovery ends before it has read the backup_end* file then we issue
a WARNING saying "recovery using incomplete backup", HINT:"you
will need to start recovery from the next earliest backup". (Later
change this to an ERROR and add an option to override and ignore it, for
when you're really up to your neck in it)
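
The recovery-side rules above could be expressed along these lines. This
is a rough sketch under assumptions: the function names and the returned
action strings are made up for illustration, times are plain comparable
numbers, and a missing backup_end file is modelled as None.

```python
def check_recovery_target(backup_end_time, target_time=None):
    """Decide what to do with a time-based recovery target.

    Returns (action, effective_target).
    """
    if backup_end_time is None:
        # No backup_end file was found: the backup never completed.
        return ("warn_incomplete_backup", target_time)
    if target_time is None:
        # No target time: roll forward to the end of the archived log.
        return ("roll_forward", None)
    if target_time > backup_end_time:
        # Passes the "stop time > backup end time" test.
        return ("roll_forward", target_time)
    # Target is not later than the backup end: refuse to recover to it.
    return ("stop", target_time)


def adjust_xid_stop(time_at_xid, backup_end_time):
    """Handle reaching a target Xid before the end of the backup."""
    if time_at_xid < backup_end_time:
        # Extend the target to the backup end time and warn the user.
        return ("warn_and_continue", backup_end_time)
    return ("stop_at_xid", time_at_xid)
```

For example, under these assumptions check_recovery_target(100, 100)
stops, because the target must be strictly later than the backup end
time.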

pg_backup -opts [BACKUP COMMAND]
opts:
-D    data directory (defaults to PGDATA)

usage examples:
pg_backup tar zcvhf /dev/rmt0 $PGDATA
uses PGDATA to identify data directory, then creates a tape archive on
the default tape device

pg_backup write_to_BAR_system p1 p2 p3

Not hugely happy with the above. I'm sure someone will come up with a
few streamlining comments, eh?

I'd certainly prefer a solution that involved writing WAL records to
indicate start and end, which seems cleaner and more integrated.
However, we need to be able to cater for cold/offline backups.

Best Regards, Simon Riggs




Re: PITR Backup state validity checking

From: Tom Lane

Simon Riggs <simon@2ndquadrant.com> writes:
> Solution Design:
> Before a backup is taken, write a file to data directory that identifies
> which backup this is. When the backup is taken this file will be copied
> with the backup, and later restored when the backup is restored.
> When backup completes write a file to xlog directory that contains the
> start backup identifier and the end time.

End WAL offset, please.  Let's not waste our time with imprecise thinking.
(If you want to throw in the time too, as an aid to the DBA, fine, but
the correctness check wants the WAL position.)
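
A check based on WAL position rather than time might look like the
sketch below. It assumes positions are written as two hexadecimal
numbers separated by a slash (log id / byte offset); that text format
and both function names are assumptions made for illustration.

```python
def parse_wal_position(text):
    """Parse a 'hi/lo' hexadecimal WAL position into a comparable tuple."""
    hi, lo = text.split("/")
    return (int(hi, 16), int(lo, 16))


def stop_point_is_safe(stop_position, backup_end_position):
    """True if recovery stops strictly after the end of the backup."""
    stop = parse_wal_position(stop_position)
    backup_end = parse_wal_position(backup_end_position)
    return stop > backup_end
```

Comparing parsed tuples rather than raw strings matters: as text,
"0/A000000" sorts after "0/10000000", but numerically it is the earlier
position.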

> - Backup may be taken when postmaster is down, so solution must not rely
> on postmaster being up.

Sure it can.  If postmaster is not up then the whole problem is
immaterial, as the WAL state isn't changing relative to the database
state.  (It might be a good idea to try to have some kind of interlock
that prevents someone from trying to start the postmaster while such
a backup is in progress.)

I'm not convinced that we need expend a whole lot of effort on the
point, though, as surely people will prefer to keep their postmasters
running while they take backups.

> - It *is* possible to do incremental backups, as long as the backup
> checks each file's change date against files already archived (or on
> write-once media).

Hmmm ... if you trust file change dates I suppose this would work,
but it feels shaky ...

> Implement an external program, called pg_backup.

I'd prefer to keep this inside the postmaster, as a separate program
offers a whole new set of failure modes (wrong version, wrong idea about
where PGDATA is, etc etc).

> - Alter archiver to always archive backup_end* files first, so they are
> written to archive in time sequence order.

We cannot use an archive technique that does not support requests for
arbitrary files, so your concern about write ordering seems quite
pointless.  These backup ID files will have roles similar to timeline ID
files, which already require random access.

> I'd certainly prefer a solution that involved writing WAL records to
> indicate start and end, which seems cleaner and more integrated.

But harder to use.  The DBA would find it much more convenient to have
those items of info out in easily readable text files.
        regards, tom lane