Thread: When it's really necessary to enable WAL archiving in case of low level backups?

When it's really necessary to enable WAL archiving in case of low level backups?

From

Thorsten Schöning

Date:

07 July 2019, 16:14:49

Hi all,

we are currently implementing low level backups using pg_start_backup
and pg_stop_backup. While the roadmap already contains moving to
pg_basebackup or even barman, I would like to better understand some
aspects of what's in use currently. The first thing one is advised to
do in case of low level backups is the following:

> 1. Ensure that WAL archiving is enabled and working.

https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-BASE-BACKUP

"archive_command" can be an arbitrary shell command copying WAL
segments anywhere and between "pg_start_backup" and "pg_stop_backup"
the docs make clear that arbitrary file system tools can be used as
well to copy things around.

So, when "archive_command" is enabled and working before
"pg_start_backup" is executed, one has to deal with the I/O-load of
copying the base files and creating the WAL archives at the same time.
But those latter archives are only needed before "pg_stop_backup" gets
executed, depending on the given arguments even afterwards in theory.
But before is enough already to save I/O.

So, looking at consistency of the backup, would it be OK to actually
only archive WALs when copying base files has finished? Or is there a
reason, which I obviously haven't understood yet, why the docs put
that as the first piece of advice?

From my understanding, without actually archiving WAL segments, the
WAL would simply grow until archiving gets enabled, without any
influence on whether the backup is consistent or not. The only point is
that one must not forget to enable WAL archiving at all and must keep
those segments created during the backup that "pg_stop_backup" reports.

Thanks!

Best regards,

Thorsten Schöning

--
Thorsten Schöning       E-Mail: Thorsten.Schoening@AM-SoFT.de
AM-SoFT IT-Systeme      http://www.AM-SoFT.de/

Telefon...........05151-  9468- 55
Fax...............05151-  9468- 88
Mobil..............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow

Re: When it's really necessary to enable WAL archiving in case of low level backups?

From

Jeff Janes

Date:

07 July 2019, 17:36:39

On Sun, Jul 7, 2019 at 12:15 PM Thorsten Schöning <tschoening@am-soft.de> wrote:

> From my understanding, without actually archiving WAL segments, the
> WAL would simply grow until archiving gets enabled, without any
> influence on if the backup is consistent or not.

This is true if you set archive_command to the empty string, or to a command that returns a failure code (like the Linux command 'false'). But if you set archive_mode to off, or if you use an archive_command that always returns success despite not archiving (like the Linux command 'true'), then the WAL would not be retained and would be lost.
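
The rules above can be restated as a small Python sketch. This is only a toy model of the retention behaviour described in this message, not PostgreSQL code; the function name and parameters are made up for illustration, and other settings that keep WAL around (such as replication slots) are ignored.

```python
def segment_waits_for_archiving(archive_mode_on, archive_command, exit_code):
    """Toy model: is a finished WAL segment kept until it has been archived?

    archive_mode_on  -- True if archive_mode is on
    archive_command  -- the configured command string ("" means not set)
    exit_code        -- exit status the command would return for this segment
    """
    if not archive_mode_on:
        # With archive_mode off nothing waits for the archiver,
        # so the segment can be removed or recycled.
        return False
    if archive_command == "":
        # Empty command with archive_mode on: segments accumulate.
        return True
    # A non-zero exit status means "not archived yet, try again later";
    # zero tells the server the segment is safe to remove or recycle.
    return exit_code != 0


print(segment_waits_for_archiving(True, "", 0))        # empty command
print(segment_waits_for_archiving(True, "false", 1))   # always fails
print(segment_waits_for_archiving(True, "true", 0))    # always "succeeds"
print(segment_waits_for_archiving(False, "false", 1))  # archive_mode off
```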

So you would have to go through a complex cycle of having it return success without archiving, then just before the backup change it to return failure, then change it to actually archive and return success, then once all needed WAL is archived change once more to return success without archiving.

That is a lot of complexity. What does it get you? The normal way has the archive happen at the same time as the base backup, and I guess that that network traffic could delay the basebackup itself, which then means you have to archive the WAL that was generated during that delay. That is something, but it doesn't seem like much to justify the complexity.

Cheers,

Jeff

Re: When it's really necessary to enable WAL archiving in case of low level backups?

From

Thorsten Schöning

Date:

08 July 2019, 07:41:46

Hello Jeff Janes,
on Sunday, 7 July 2019 at 19:36 you wrote:

> This is true if you set archive_command to the empty string, or to a
> command that returns a failure code (like the linux command
> 'false').[...]

That's what I understood as well, seems I'm not that wrong.

> That is a lot of complexity.  What does it get you?[...]

I'm just arguing with a coworker about how individual aspects of the
backup work in theory and how we have implemented things. So thanks
for your clarifications.

Best regards,

Thorsten Schöning


Re: When it's really necessary to enable WAL archiving in case of low level backups?

From

Stephen Frost

Date:

08 July 2019, 14:49:11

Greetings,

* Thorsten Schöning (tschoening@am-soft.de) wrote:
> we are currently implementing low level backups using pg_start_backup
> and pg_stop_backup. While the roadmap already contains moving to
> pg_basebackup or even barman, I would like to better understand some
> aspects of what's in use currently. The first thing one is advised to
> do in case of low level backups is the following:

I suppose first off, I wouldn't recommend trying to write your own
low-level backup tool, especially not as some kind of temporary
solution, as it sounds like you're suggesting doing here.

What's the issue with using one of the existing tools (of which there
are quite a few: pg_basebackup, barman, pgbackrest, wal-g, and more)?

> > 1. Ensure that WAL archiving is enabled and working.
>
> https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-BASE-BACKUP
>
> "archive_command" can be an arbitrary shell command copying WAL
> segments anywhere and between "pg_start_backup" and "pg_stop_backup"
> the docs make clear that arbitrary file system tools can be used as
> well to copy things around.

While it is possible to do so, I don't recommend it. As soon as the
archive_command returns, PG is free to remove/overwrite the original WAL
file, and therefore whatever is in archive_command really needs to
guarantee that the WAL file was durably written out (at least fsync'd
locally, which something like 'cp' won't do for you, or ideally sent to
a remote system and fsync'd there).
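
To illustrate what "durably written out" could mean for a local archive,
here is a minimal Python sketch of a helper that an archive_command might
call. It is an assumption-laden illustration, not a recommended tool: the
function names are hypothetical, it assumes a POSIX system where a
directory can be opened and fsync'd, and it only covers the "fsync'd
locally" case, not shipping to a remote system.

```python
import os
import shutil
import sys


def archive_segment(src, dest):
    """Copy one WAL segment to dest and fsync both file and directory."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    if os.path.exists(dest):
        # Never overwrite an already archived segment.
        raise FileExistsError(dest)
    tmp = dest + ".tmp"
    with open(src, "rb") as fin, open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()
        os.fsync(fout.fileno())      # file contents reach stable storage
    os.rename(tmp, dest)             # final name appears only when complete
    dir_fd = os.open(dest_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)             # make the rename itself durable
    finally:
        os.close(dir_fd)


def main(argv):
    """Return 0 only if the segment was archived, non-zero otherwise."""
    if len(argv) != 3:
        print("usage: archive_wal SRC DEST", file=sys.stderr)
        return 2
    try:
        archive_segment(argv[1], argv[2])
    except OSError as exc:
        print("archiving failed: %s" % exc, file=sys.stderr)
        return 1
    return 0
```

A script built around this would end with sys.exit(main(sys.argv)) and be
configured with the source path (%p) and a destination path ending in the
file name (%f); the non-zero exit status on any error is what makes the
server keep the segment and retry.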

> So, when "archive_command" is enabled and working before
> "pg_start_backup" is executed, one has to deal with the I/O-load of
> copying the base files and creating the WAL archives at the same time.

The system has to deal with the load (be it I/O or CPU...) of creating
the WAL archives and storing them durably during the entire operation of
the cluster, if you want point-in-time-recovery.  During a base backup,
you have the additional load from copying the data files as well, yes.

> But those latter archives are only needed before "pg_stop_backup" gets
> executed, depending on the given arguments even afterwards in theory.
> But before is enough already to save I/O.

I'm not following what you're talking about here.  The archives
generated during the backup are required when the backup is restored,
as they're needed to get the database back into a consistent state.

> So, looking at consistency of the backup, would it be OK to actually
> only archive WALs when copying base files has finished? Or is there a
> reason the docs put that at the first advise I obviously didn't
> understand yet?

Typically, point-in-time-recovery (PITR) is a desired part of doing
database backups and therefore you need to be archiving every WAL file
created.

If what you're asking is whether you can just have the WAL files saved
on the primary during the copying of the data files and then grab them
afterwards, then, yes, you can do that (and it's actually exactly what
pg_basebackup does in some modes).  I haven't heard of that being a
requirement before. It was done in pg_basebackup for implementation
reasons, as I understand it, and not really because it was a much needed
feature.
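
As a sketch of the pg_basebackup variant mentioned above: to my knowledge
the "fetch" WAL method is the mode that collects the WAL files at the end
of the backup rather than streaming them alongside it. The Python below
only builds the command line; the target directory, host and user are
hypothetical placeholders, and the server must retain enough WAL for the
duration of the copy for this mode to succeed.

```python
import shlex


def basebackup_fetch_argv(target_dir, host=None, user=None):
    """Build a pg_basebackup command that fetches WAL after the data copy."""
    argv = ["pg_basebackup", "-D", target_dir, "-X", "fetch"]
    if host:
        argv += ["-h", host]
    if user:
        argv += ["-U", user]
    return argv


# Placeholder values, for illustration only.
cmd = basebackup_fetch_argv("/backup/base", host="db1", user="replicator")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```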

> From my understanding, without actually archiving WAL segments, the
> WAL would simply grow until archiving gets enabled, without any
> influence on if the backup is consistent or not. The only point is
> that one must not forget to enable WAL archiving at all and keep
> those segments created during backup "pg_stop_backup" tells about.

The WAL files required to make a backup consistent are *all* of those
generated between the pg_start_backup and the pg_stop_backup, just to be
clear.
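
To make "all of those" concrete, the following Python sketch lists the
WAL segment file names spanned by a start and a stop LSN. It is an
illustration under stated assumptions: the default 16 MB segment size, no
timeline switch during the backup, and hypothetical function names. The
backup history file written at the end of the backup remains the
authoritative record of the first and last required segment.

```python
def parse_lsn(text):
    """Convert an LSN such as '0/2000028' into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_segment_names(start_lsn, stop_lsn, timeline=1,
                      segment_size=16 * 1024 * 1024):
    """List the WAL segment file names from start_lsn through stop_lsn."""
    start = parse_lsn(start_lsn)
    stop = parse_lsn(stop_lsn)
    if stop < start:
        raise ValueError("stop LSN lies before start LSN")
    segments_per_xlogid = 0x100000000 // segment_size
    first = start // segment_size
    # A stop LSN exactly on a segment boundary belongs to the previous
    # segment, so look at the byte just before it.
    last = max(first, (stop - 1) // segment_size)
    return [
        "%08X%08X%08X" % (timeline,
                          seg // segments_per_xlogid,
                          seg % segments_per_xlogid)
        for seg in range(first, last + 1)
    ]


# Example LSNs are made up for illustration.
for name in wal_segment_names("0/2000028", "0/4000100"):
    print(name)
```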

Thanks,

Stephen
