Thread: When it's really necessary to enable WAL archiving in case of low level backups?
When it's really necessary to enable WAL archiving in case of low level backups?
From
Thorsten Schöning
Date:
Hi all,

we are currently implementing low level backups using pg_start_backup and pg_stop_backup. While the roadmap already contains moving to pg_basebackup or even barman, I would like to better understand some aspects of what's in use currently. The first thing one is advised to do in case of low level backups is the following:

> 1. Ensure that WAL archiving is enabled and working.

https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-BASE-BACKUP

"archive_command" can be an arbitrary shell command copying WAL segments anywhere, and the docs make clear that between "pg_start_backup" and "pg_stop_backup" arbitrary file system tools can be used as well to copy things around.

So, when "archive_command" is enabled and working before "pg_start_backup" is executed, one has to deal with the I/O load of copying the base files and creating the WAL archives at the same time. But those latter archives are only needed before "pg_stop_backup" gets executed, and depending on the given arguments even afterwards in theory. Having them before that point is already enough, which would save I/O.

So, looking at consistency of the backup, would it be OK to actually only archive WALs when copying base files has finished? Or is there a reason the docs put that as the first piece of advice which I obviously didn't understand yet?

From my understanding, without actually archiving WAL segments, the WAL would simply grow until archiving gets enabled, without any influence on whether the backup is consistent or not. The only point is that one must not forget to enable WAL archiving at all and must keep those segments created during the backup that "pg_stop_backup" tells about.

Thanks!

Best regards,
Thorsten Schöning

--
Thorsten Schöning E-Mail: Thorsten.Schoening@AM-SoFT.de
AM-SoFT IT-Systeme http://www.AM-SoFT.de/
Phone.............05151- 9468- 55
Fax...............05151- 9468- 88
Mobile.............0178-8 9468- 04
AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Managing Director: Andreas Muchow
Re: When it's really necessary to enable WAL archiving in case of low level backups?
From
Jeff Janes
Date:
On Sun, Jul 7, 2019 at 12:15 PM Thorsten Schöning <tschoening@am-soft.de> wrote:
> From my understanding, without actually archiving WAL segments, the
> WAL would simply grow until archiving gets enabled, without any
> influence on if the backup is consistent or not.
This is true if you set archive_command to the empty string, or to a command that returns a failure code (like the Linux command 'false'). But if you set archive_mode to off, or if you use an archive_command that always returns success despite not archiving (like the Linux command 'true'), then the WAL would not be retained but would be lost.
So you would have to go through a complex cycle of having it return success without archiving, then just before the backup change it to return failure, then change it to actually archive and return success, then once all needed WAL is archived change once more to return success without archiving.
That is a lot of complexity. What does it get you? The normal way has the archiving happen at the same time as the base backup, and I guess that the resulting network traffic could delay the base backup itself, which then means you have to archive the WAL that was generated during that delay. That is something, but it doesn't seem like much to justify the complexity.
Cheers,
Jeff
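The exit-status behaviour described in this reply can be illustrated with a toy Python model. This is only a sketch and not PostgreSQL code: the class `ToyWalArchiver`, the helper `simulate`, and the command labels "false", "true" and "copy" are hypothetical names made up for the illustration. The model assumes that a completed segment is kept until the archive command reports success for it, and that a failed segment is retried before any later one is offered.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyWalArchiver:
    """Toy model (not PostgreSQL code) of how archive_command's exit status affects WAL retention."""
    archive_command: Callable[[str], int]               # returns an exit status, 0 means success
    pending: List[str] = field(default_factory=list)    # completed segments still kept by the server
    released: List[str] = field(default_factory=list)   # segments the server is free to remove or recycle

    def segment_completed(self, name: str) -> None:
        self.pending.append(name)

    def run_archiver(self) -> None:
        # Offer the oldest pending segment first; stop at the first failure
        # so the same segment is retried on the next pass.
        while self.pending:
            if self.archive_command(self.pending[0]) != 0:
                break
            self.released.append(self.pending.pop(0))


def simulate(phases):
    """Run (command, segments) phases against one archiver; return (archive, pending, released)."""
    archive: List[str] = []

    def copy_to_archive(segment: str) -> int:
        archive.append(segment)
        return 0

    commands = {
        "false": lambda segment: 1,   # always fails: WAL accumulates on the server
        "true": lambda segment: 0,    # claims success without archiving: WAL is lost
        "copy": copy_to_archive,      # really archives, then reports success
    }
    archiver = ToyWalArchiver(commands["false"])
    for command, segments in phases:
        archiver.archive_command = commands[command]
        for name in segments:
            archiver.segment_completed(name)
        archiver.run_archiver()
    return archive, archiver.pending, archiver.released
```

Calling `simulate([("false", ["A", "B"]), ("copy", ["C"])])` ends with all three segments in the archive, because the failing command only delayed archiving. Calling `simulate([("true", ["A", "B"]), ("copy", ["C"])])` ends with only "C" archived, since "A" and "B" were released without ever being copied.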
Re: When it's really necessary to enable WAL archiving in case of low level backups?
From
Thorsten Schöning
Date:
Hello Jeff Janes,

on Sunday, 7 July 2019 at 19:36 you wrote:

> This is true if you set archive_command to the empty string, or to a
> command that returns a failure code (like the linux command
> 'false').[...]

That's what I understood as well, so it seems I'm not that wrong.

> That is a lot of complexity. What does it get you?[...]

I'm just arguing with a coworker about how individual aspects of the backup work in theory and how we have implemented things. So thanks for your clarifications.

Best regards,
Thorsten Schöning
Re: When it's really necessary to enable WAL archiving in case of low level backups?
From
Stephen Frost
Date:
Greetings,

* Thorsten Schöning (tschoening@am-soft.de) wrote:
> we are currently implementing low level backups using pg_start_backup
> and pg_stop_backup. While the roadmap already contains moving to
> pg_basebackup or even barman, I would like to better understand some
> aspects of what's in use currently. The first thing one is advised to
> do in case of low level backups is the following:

I suppose first off, I wouldn't recommend trying to write your own low-level backup tool, especially not as some kind of temporary solution, as it sounds like you're suggesting doing here. What's the issue with using one of the existing tools (of which there are quite a few: pg_basebackup, barman, pgbackrest, wal-g, and more)?

> > 1. Ensure that WAL archiving is enabled and working.
>
> https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-BASE-BACKUP
>
> "archive_command" can be an arbitrary shell command copying WAL
> segments anywhere and between "pg_start_backup" and "pg_stop_backup"
> the docs make clear that arbitrary file system tools can be used as
> well to copy things around.

While it is possible to do so, I don't recommend it. As soon as the archive_command returns, PG is free to remove/overwrite the original WAL file, and therefore whatever is in archive_command really needs to guarantee that the WAL file was durably written out (at least fsync'd locally, which something like 'cp' won't do for you, or ideally sent to a remote system and fsync'd there).

> So, when "archive_command" is enabled and working before
> "pg_start_backup" is executed, one has to deal with the I/O-load of
> copying the base files and creating the WAL archives at the same time.

The system has to deal with the load (be it I/O or CPU...) of creating the WAL archives and storing them durably during the entire operation of the cluster, if you want point-in-time-recovery. During a base backup, you have the additional load from copying the data files as well, yes.

> But those latter archives are only needed before "pg_stop_backup" gets
> executed, depending on the given arguments even afterwards in theory.
> But before is enough already to save I/O.

I'm not following what you're talking about here. The archives generated during the backup are required when the backup is restored, as they're required to get the database back into a consistent state.

> So, looking at consistency of the backup, would it be OK to actually
> only archive WALs when copying base files has finished? Or is there a
> reason the docs put that at the first advise I obviously didn't
> understand yet?

Typically, point-in-time-recovery (PITR) is a desired part of doing database backups and therefore you need to be archiving every WAL file created. If what you're asking is whether you can just have the WAL files saved on the primary during the copying of the data files and then grab them afterwards, then yes, you can do that (and it's actually exactly what pg_basebackup does in some modes). I haven't heard of that being a requirement before. It was done in pg_basebackup for implementation reasons, as I understand it, and not really because it was a much needed feature.

> From my understanding, without actually archving WAL segments, the
> WAL would simply grow until archiving gets enabled, without any
> influence on if the backup is consistent or not. The only point is
> that one must not forget to enable WAL archiving at all and keep
> those segments created during backup "pg_stop_backup" tells about.

The WAL files required to make a backup consistent are *all* of those generated between the pg_start_backup and the pg_stop_backup, just to be clear.

Thanks,
Stephen
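The durability point raised in this reply (a plain 'cp' does not fsync) could look roughly like the following Python sketch of a local archive step. It is an illustration under stated assumptions, not a recommended replacement for the existing tools: the function names `archive_wal_segment` and `main`, the script path and the archive directory in the comments are all hypothetical, and fsync on a directory handle is assumed to work as it does on Linux and other POSIX systems.

```python
import os
import shutil


def archive_wal_segment(src_path: str, file_name: str, archive_dir: str) -> int:
    """Copy one WAL segment into archive_dir durably; return 0 only once it is on disk."""
    dest = os.path.join(archive_dir, file_name)
    if os.path.exists(dest):
        # Refuse to overwrite an already archived segment and report failure.
        return 1

    tmp = dest + ".tmp"
    with open(src_path, "rb") as src, open(tmp, "wb") as out:
        shutil.copyfileobj(src, out)
        out.flush()
        os.fsync(out.fileno())        # force the file contents to stable storage

    os.rename(tmp, dest)              # publish under the final name only after the fsync

    dir_fd = os.open(archive_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)              # make the new directory entry durable as well (POSIX assumption)
    finally:
        os.close(dir_fd)
    return 0


def main(argv) -> int:
    # Hypothetical wiring, not taken from the thread:
    #   archive_command = 'python3 /path/to/archive_wal.py %p %f /mnt/wal_archive'
    # with the script ending in: sys.exit(main(sys.argv))
    if len(argv) != 4:
        return 2                      # wrong usage must count as failure, never as success
    return archive_wal_segment(argv[1], argv[2], argv[3])
```

Any exception raised during the copy would end the script with a non-zero status, so the server would keep the segment and retry later. A local copy like this still leaves the archive on the same machine; sending the file to a remote system and fsyncing it there, as suggested above, is outside this sketch.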