Thread: The backup API and general purpose backup software
This is inspired by the thread with the subject "Something else about Redo Logs disappearing", but since that thread is already quite long, since I have lost track what exactly "Peter"'s problem is and since his somewhat belligerent tone makes it unappealing to reread the whole thread I'll attempt a fresh start. To make a full backup with the "new" (non-exclusive) API, a software must do the following 1. open a connection to the database 2. invoke pg_start_backup('label', false, false) in the connection from step 1. 3. copy the contents of the data directory to the backup medium 4. invoke pg_stop_backup(false, true) in the connection from step 1. 5. copy the result from the previous step to the backup medium. (It is assumed that any archived WALs are also archived in a manner that they can be restored together with this backup. If this is not the case adding them to the backup would be step 6.) So far so good and writing a program which implements this should not pose a great difficulty. General purpose backup software often assumes that it can perform a backup in three steps: 1. Invoke a pre-backup script. 2. Copy files to the backup medium in such a way that they can be identified as a group and be restored together and without mixing them with data from another backup session. 3, Invoke a post-backup script. (I have to admit that is has been a long time since I've looked at any backup system in any detail. Doubtlessly many have a more complicated model, but I'm fairly confident that this is still the lowest common denominator.) Now we have two problems. The first is that the pre-backup script has to exit before the proper backup can begin, but it also has to open a connection which will stay open during the backup because it will be needed in the post-backup script again. This can be solved by starting a background process which keeps the connection open and providing some communication channel (perhaps a Unix pipe) for the post-backup script to communicate with this background process. A bit awkward, but no great deal. The second problem is that the post-backup script will only be called after the backup is finished. So the information returned by pg_stop_backup cannot be included in the backup which makes the backup useless. This is indeed serious, and I don't see a way around this in this simple model. But there is a workaround: Split the backup. I am assuming that the backup software uses a unique ID to identify each backup and passes that ID to the pre-backup and post-backup script. This ID is used as the label in the call to pg_start_backup(). It is also included in the information returned by pg_stop_backup(). So the post-backup script stores that information in a different place (maybe including the ID in the filename(s) to avoid conflicts and for redundancy) and then triggers a backup of that place (or alternatively that can be backed up independently). To restore a backup, you will then also need two steps: 1) Restore the summary information from the second backup. Inspect the backup_label to find the ID of the backup of the data. 2) Restore that backup (3) Put the backup label where it belongs, make sure the archived WALs are accessible and start the database hp -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp@hjp.at | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!"
Attachment
Peter, I don't understand that last step "5. copy the result from the previous step to the backup medium." It seems to be a duplication of "3. copy the contents of the data directory to the backup medium". On 6/21/20 8:28 AM, Peter J. Holzer wrote: > This is inspired by the thread with the subject "Something else about > Redo Logs disappearing", but since that thread is already quite long, > since I have lost track what exactly "Peter"'s problem is and since his > somewhat belligerent tone makes it unappealing to reread the whole > thread I'll attempt a fresh start. > > To make a full backup with the "new" (non-exclusive) API, a software > must do the following > > 1. open a connection to the database > > 2. invoke pg_start_backup('label', false, false) in the connection from > step 1. > > 3. copy the contents of the data directory to the backup medium > > 4. invoke pg_stop_backup(false, true) in the connection from step 1. > > 5. copy the result from the previous step to the backup medium. > > (It is assumed that any archived WALs are also archived in a manner that > they can be restored together with this backup. If this is not the case > adding them to the backup would be step 6.) > > So far so good and writing a program which implements this should not > pose a great difficulty. > > General purpose backup software often assumes that it can perform a > backup in three steps: > > 1. Invoke a pre-backup script. > > 2. Copy files to the backup medium in such a way that they can be > identified as a group and be restored together and without mixing > them with data from another backup session. > > 3, Invoke a post-backup script. > > (I have to admit that is has been a long time since I've looked at any > backup system in any detail. Doubtlessly many have a more complicated > model, but I'm fairly confident that this is still the lowest common > denominator.) > > Now we have two problems. > > The first is that the pre-backup script has to exit before the proper > backup can begin, but it also has to open a connection which will stay > open during the backup because it will be needed in the post-backup > script again. This can be solved by starting a background process which > keeps the connection open and providing some communication channel > (perhaps a Unix pipe) for the post-backup script to communicate with > this background process. A bit awkward, but no great deal. > > The second problem is that the post-backup script will only be called > after the backup is finished. So the information returned by pg_stop_backup > cannot be included in the backup which makes the backup useless. This is > indeed serious, and I don't see a way around this in this simple model. > > But there is a workaround: Split the backup. > > I am assuming that the backup software uses a unique ID to identify each > backup and passes that ID to the pre-backup and post-backup script. > This ID is used as the label in the call to pg_start_backup(). > It is also included in the information returned by pg_stop_backup(). So > the post-backup script stores that information in a different place > (maybe including the ID in the filename(s) to avoid conflicts and for > redundancy) and then triggers a backup of that place (or alternatively > that can be backed up independently). > > To restore a backup, you will then also need two steps: > > 1) Restore the summary information from the second backup. Inspect the > backup_label to find the ID of the backup of the data. > > 2) Restore that backup > > (3) Put the backup label where it belongs, make sure the archived WALs > are accessible and start the database > > hp > > > -- Angular momentum makes the world go 'round.
On 2020-06-21 10:32:16 -0500, Ron wrote: > On 6/21/20 8:28 AM, Peter J. Holzer wrote: > > To make a full backup with the "new" (non-exclusive) API, a software > > must do the following > > > > 1. open a connection to the database > > > > 2. invoke pg_start_backup('label', false, false) in the connection from > > step 1. > > > > 3. copy the contents of the data directory to the backup medium > > > > 4. invoke pg_stop_backup(false, true) in the connection from step 1. > > > > 5. copy the result from the previous step to the backup medium. > > I don't understand that last step "5. copy the result from the previous step > to the backup medium." It seems to be a duplication of "3. copy the > contents of the data directory to the backup medium". Step three copies all the files from the data directory. However, there is an additional piece of information you you need to recover the database, and that is only returned by the call to pg_stop_backup: postgres=# select * from pg_stop_backup(false, true); ╔═[ RECORD 1 ]═════════════════════════════════════════════════════════════════╗ ║ lsn │ A9/2F000130 ║ ║ labelfile │ START WAL LOCATION: A9/2F000028 (file 00000001000000A90000002F)↵║ ║ │ CHECKPOINT LOCATION: A9/2F000060 ↵║ ║ │ BACKUP METHOD: streamed ↵║ ║ │ BACKUP FROM: master ↵║ ║ │ START TIME: 2020-06-21 17:36:30 CEST ↵║ ║ │ LABEL: backup # 1234 ↵║ ║ │ START TIMELINE: 1 ↵║ ║ │ ║ ║ spcmapfile │ 17522 /home/postgres ↵║ ║ │ ║ ╚════════════╧═════════════════════════════════════════════════════════════════╝ Without this information the backup is useless, so it must be backed up, too. Obviously you can do that only after you got it. hp -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp@hjp.at | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!"
Attachment
On 6/21/20 10:45 AM, Peter J. Holzer wrote: > On 2020-06-21 10:32:16 -0500, Ron wrote: >> On 6/21/20 8:28 AM, Peter J. Holzer wrote: >>> To make a full backup with the "new" (non-exclusive) API, a software >>> must do the following >>> >>> 1. open a connection to the database >>> >>> 2. invoke pg_start_backup('label', false, false) in the connection from >>> step 1. >>> >>> 3. copy the contents of the data directory to the backup medium >>> >>> 4. invoke pg_stop_backup(false, true) in the connection from step 1. >>> >>> 5. copy the result from the previous step to the backup medium. >> I don't understand that last step "5. copy the result from the previous step >> to the backup medium." It seems to be a duplication of "3. copy the >> contents of the data directory to the backup medium". > Step three copies all the files from the data directory. However, there > is an additional piece of information you you need to recover the > database, and that is only returned by the call to pg_stop_backup: > > postgres=# select * from pg_stop_backup(false, true); > ╔═[ RECORD 1 ]═════════════════════════════════════════════════════════════════╗ > ║ lsn │ A9/2F000130 ║ > ║ labelfile │ START WAL LOCATION: A9/2F000028 (file 00000001000000A90000002F)↵║ > ║ │ CHECKPOINT LOCATION: A9/2F000060 ↵║ > ║ │ BACKUP METHOD: streamed ↵║ > ║ │ BACKUP FROM: master ↵║ > ║ │ START TIME: 2020-06-21 17:36:30 CEST ↵║ > ║ │ LABEL: backup # 1234 ↵║ > ║ │ START TIMELINE: 1 ↵║ > ║ │ ║ > ║ spcmapfile │ 17522 /home/postgres ↵║ > ║ │ ║ > ╚════════════╧═════════════════════════════════════════════════════════════════╝ > > Without this information the backup is useless, so it must be backed > up, too. Obviously you can do that only after you got it. Thanks. A clearer instruction for step 4, I'd have said something like "invoke pg_stop_backup(false, true)in the connection from step 1, piping the output to a file on the backup medium." -- Angular momentum makes the world go 'round.
On 2020-06-21 17:35:41 -0500, Ron wrote: > On 6/21/20 10:45 AM, Peter J. Holzer wrote: > > On 2020-06-21 10:32:16 -0500, Ron wrote: > > > On 6/21/20 8:28 AM, Peter J. Holzer wrote: > > > > To make a full backup with the "new" (non-exclusive) API, a software > > > > must do the following > > > > > > > > 1. open a connection to the database > > > > > > > > 2. invoke pg_start_backup('label', false, false) in the connection from > > > > step 1. > > > > > > > > 3. copy the contents of the data directory to the backup medium > > > > > > > > 4. invoke pg_stop_backup(false, true) in the connection from step 1. > > > > > > > > 5. copy the result from the previous step to the backup medium. > > > I don't understand that last step "5. copy the result from the previous step > > > to the backup medium." It seems to be a duplication of "3. copy the > > > contents of the data directory to the backup medium". > > Step three copies all the files from the data directory. However, there > > is an additional piece of information you you need to recover the > > database, and that is only returned by the call to pg_stop_backup: > > > > postgres=# select * from pg_stop_backup(false, true); > > ╔═[ RECORD 1 ]═════════════════════════════════════════════════════════════════╗ [...] > > ╚════════════╧═════════════════════════════════════════════════════════════════╝ > > > > Without this information the backup is useless, so it must be backed > > up, too. Obviously you can do that only after you got it. > > Thanks. A clearer instruction for step 4, I'd have said something like > "invoke pg_stop_backup(false, true)in the connection from step 1, piping the > output to a file on the backup medium." I could have phrased it that way but I wanted to set the stage for the second half of the posting, where I was contrasting that to the work flow of a typical general purpose backup program. There "piping the output to a file on the backup medium" isn't possible because the backup is already finished. So it made sense for me to present invoking pg_stop_backup and writing the result of pg_stop_backup to the backup medium as two distinct steps. I think this also makes the solution much more natural (some might even say obvious) than if those had been presented as a single step. hp -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp@hjp.at | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!"