PITR Archive Recovery - Mailing list pgsql-patches
From | Simon Riggs |
---|---|
Subject | PITR Archive Recovery |
Date | |
Msg-id | 1088456336.3266.6047.camel@stromboli Whole thread Raw |
Responses |
Re: PITR Archive Recovery
|
List | pgsql-patches |
PITR Archive Recovery, 28 June 2004 What's in this patch? - All of what was in previous PITR Archival patch, including reworking of all the archiver startup/shutdown code to match that of pgstat - New code to perform Archive Recovery mode, which streams xlogs straight from archive to allow "infinite" recovery [This is a full, working patch for discussion and testing, with a few days left before 7.5dev freeze for changes and corrections] Archive Recovery Overview The rule is: when you startup with archive_mode on, take a backup. This patch provides an additional mode for transactional log recovery. The current mode provides Crash Recovery, allowing the database to restart and to read the last few online transaction logs to recover. The additional mode is known as Archive Recovery, enabled in postgresql.conf using archive_mode=true. The administrator also specifies and archive location and a program to perform the copying (discussed later). As database writes occur and transaction logs fill they are copied to the archive location by a new sub-process of the postmaster, called the archiver process. If the database crashes when in archive_mode, it will recover just as before. If a major system crash, or other reason to restore occurs, then the new Archive Recovery mode can be invoked by: - restoring the full physical backup - creating DataDir/recovery.conf - restart PostgreSQL PostgreSQL will recover using almost the same code path as before, except: all transaction logs will be restored for use directly from archive in a stream that prevents disk space from overflowing. Extended recovery is possible, over many transaction logs - far more than was previously possible. The essential aspect to recovery is maintaining a solid backup/log chain. This must consist of: i) a full backup of every file in PostgreSQL - miss none! ii) archived transaction logs from just before backup taken to whenever you want to recover to... If you "break the chain" i.e. the two sets of files don't match, or you've deleted a transaction log or lost a tape - then you will only be able to recover up to the point the chain broke. The longer the chain, the easier it will be to break, so backup regularly and work out a robust archiving policy. Extensibility Archive Recovery allows the administrator to specify a program or script to use when both archiving and restoring transaction log files. As a result, the administrator can choose to store transaction log files on any drive, any other system or integrating with any Backup Archive Recovery (BAR) software. The Archive Program is provided with 3 parameters, which can be placed anywhere within the supplied command string. Each parameter is represented by a "%s" character string. - The first parameter is replaced with the full path of the transaction log to be archived. - The second parameter is replaced with the value of the archive_dest parameter e.g. with these settings archive_mode = true archive_dest = '/mount/disk2/pgarchive' archive_program = 'cp %s %s' with transaction log 00000000000003A4 to archive with DataDir of web123 would execute the following command 'cp /usr/local/pgsql/web123/00000000000003A4 /disk2/pgarchive' To enter Archive Recovery the administrator must create a recovery.conf file. [Currently an empty file will do, though intended as the one place where all recovery options and parameters would be set] The Restore Program is provided with 2 parameters - The first parameter is replaced with the value of the archive_dest parameter - The second parameter is replaced with the value of the transaction log which recovery is requesting should now be restored - The third parameter is replaced with the full path of the filename which should be the target of the single file recovery operation. [In this patch, XlogArchRestoreProgram is hardcoded to "cp %s/%s %s", pending short discussion on how to specify this...] Possibilities Using the current features, it is possible to implement an Automated Standby Database. This is an active-passive High Availability option. In this mode, the main server sends archived log files to a second, standby server. The standby server is set up to be in "permanent recovery", by using a RestoreProgram that waits for each file to be shipped to it. The standby system receives each file, then recovers up to that point - so the standby system is always a few seconds from completing its startup should it be required. Patch Status & Current Caveats - A number of recovery scenarios have been tested and the patch is believed to be stable and ready for others to begin commentary and testing...please understand that there are many scenarios that work and many that do not...these last are not bugs - Recovery to a specific point in time is not yet implemented. The administrator has the following recovery options: --can recover the system fully, all the way to the end of logs --stop recovery by withholding log files at an appropriate place, thereby forcing the termination of recovery How it Works archive_debug is a hidden postgresql.conf setting which can be used to show more debug output for the archive recovery facilities. ...archive_debug = true [This currently generates an additional log file called recovery.log, showing the record headers and types of all restored xlog records.] Each PostgreSQL user has a corresponding backend process performing work for them. These backends write any transaction log data to the log as each operation occurs, then marks transactions as either committed or aborted. The log is split into log segment files. When each log file fills a notification file is created in archive_status, and at the same time a signal is sent to the archiver (via the postmaster) to begin archiving. The archiver executes the administrator's program or script using a system(3) call, archiving each file one at a time. The notification file provides a number of features. First, its simplicity ensures that we don't need a shared memory link between each backend and the archiver, which would then be prone to failure - which is exactly not what we want. Second, the archive_status file is written whether or not the postmaster is still up, so will still function correctly even when the worst has happened. Third, as a persistent record, it will allow archiving to restart at the same point later on (even after a backup/restore), allowing it to recover gracefully from administrator's immediate shutdown requests as well as postmaster or archive system failures. When recovery completes, archive_status files are written to ensure cleanup of all transaction logs leaves the database in a fit state for production. The recovery.conf file is removed to ensure that a subsequent crash doesn't accidentally begin recovery from archives... The archiver process starts only when recovery has fully completed. How to Fail All of the following ways to fail have already occurred in testing... 1. if you do all 3 of these, you will not have a backup/log chain: a) use the 'cp' command, or any other command that can write to either a file or a directory b) use archive_dest to specify a directory c) forget to create the directory, which the command in a) then interprets as a file and then all archived logs overwrite each other and break the chain. You will be unable to rollforward AT ALL. 2. if you do all 5 of these, you will not rollforward very far, if any. a) restore the database backup b) forget to specify a recovery.conf file c) startup and let database recover using only the xlogs that were present when the backup was taken d) then immediately after recovery run a job which causes xlog to be written, and then a file to fill e) use an archive command that allows overwriting of previously archived files This sequence causes the database to recover successfully, but not using your full archive chain. The current redo pointer points half way through your backup/log chain, so when the file is archived, it overwrites a file in the middle of the chain, thus breaking the chain. There isn't an option on cp that will prevent this... 3. turn archive_mode on and off... Turning archive_mode off means that when an xlog fills, it won't EVER get archived. This will break the backup/log chain. If you turn archive_mode back on, and an xlog filled, then it will not be picked up as having filled and you will be missing a link in the chain. Sometimes you'll get away with it and then you'll think it will work all the time. The rule is: when you startup with archive_mode on, take a backup. 4. Forget that xlogs are the same size, whether they are full or not...you can't tell (yet) by looking at one whether it has a record within it that will break the chain How to succeed - Test your recovery procedures before you need them - When you're in a recovery situation backup everything you can lay your hands on, to make sure you have a known position to return to while you ATTEMPT recovery - you may make a mistake and need to retry - be cool - we all make mistakes, just don't let errors multiply - disable listen_addresses in postgresql.conf when you recover, to give yourself some breathing space to check things, when you finally do get the database ready - recover with archive_debug = true to give me/others a chance to debug any problems or answer any questions you may have The rule is: when you startup with archive_mode on, take a backup. When enabling archive_mode for the first time, if you have run previously without enabling archive_mode, then when you will inevitably get messages saying "cannot find archive_status file". After you have taken a backup, you can manually create files in the archive_status directory of <xlog>.done, which will then allow the bgwriter to clean them up when it next takes a checkpoint. Additional Work - recovery.conf needs some minor work to specify recover options. It is envisaged that this would be a short bison grammar, very very similar to postgresql.conf - It seems possible to easily work around the requirement to take the backup while the database is open. It would be a good option to have both hot and cold backup options... - full documentation will also follow shortly Credits This work is the final push in a long series of patches and discussions about how to achieve archive recovery with PostgreSQL. The work of J.R.Nield and Patrick MacDonald has provided the detailed underpinnings for this recent work, which in turn rests upon Vadim Mikheev's original work on WAL and MVCC. Tom Lane has provided considerable technical assistance and quality review, whilst Bruce Momjian has provided many of the ideas and smoothed the way for much of the work. Thanks, all. Simon Riggs, simon@2ndquadrant.com
Attachment
pgsql-patches by date: