Thread: [GENERAL] Backup strategy using 'wal_keep_segments'

[GENERAL] Backup strategy using 'wal_keep_segments'

From
Rhhh Lin
Date:

Hi,


Version 9.4...

Per the PG docs, to facilitate continuous WAL archiving and PITR recovery... 
"To enable WAL archiving, set the wal_level configuration parameter to archive (or hot_standby), archive_mode to on, and specify the shell command to use in the archive_command configuration parameter."

This instruction is fine and I have a solid understanding of the implementation of these parameters, what they do and the why behind them, i.e. in order to maintain a continuous chain of database changes in the WAL stream.

My question however is this...
A colleague recently suggested that instead of implementing an 'archive_command' to push archivable WALs to a secondary location (for further backup to tape for example), we could instead persist the WAL files in their current location by setting the "wal_keep_segments" parameter to an extreme value e.g. 1000 and have the 'archive_command' do nothing.

So, something like...
wal_keep_segments=1000
archive_command='cd .'

And then periodically copy the archived WAL_files from the pg_xlog directory out to tape, removing as we go?

Is this approach feasible? Assuming obviously, we have sufficient disk space to facilitate 1000 WAL files etc.
But from a point-in-time recovery, and backup perspective - are we missing anything if we were to adopt this non-standard approach?

Regards

Re: [GENERAL] Backup strategy using 'wal_keep_segments'

From
Michael Paquier
Date:
On Mon, Oct 23, 2017 at 5:57 AM, Rhhh Lin <ruanlinehan@hotmail.com> wrote:
> Is this approach feasible? Assuming obviously, we have sufficient disk space
> to facilitate 1000 WAL files etc.

You expose yourself to race conditions with such methods if a
checkpoint has the bad idea to recycle past segments that your logic
is copying. So I would advise to not do that. Instead of using the
archive command, you should also consider using pg_receivexlog
combined with a replication slot. This brings way more control with
the error handling.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Backup strategy using 'wal_keep_segments'

From
Rhhh Lin
Date:

Thanks very much for your reply Michael.

I note that it looks like pgbarman employs pg_receivexlog; I will check it out. 


Regards,

Ruan


From: Michael Paquier <michael.paquier@gmail.com>
Sent: 22 October 2017 22:17:01
To: Rhhh Lin
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Backup strategy using 'wal_keep_segments'
 
On Mon, Oct 23, 2017 at 5:57 AM, Rhhh Lin <ruanlinehan@hotmail.com> wrote:
> Is this approach feasible? Assuming obviously, we have sufficient disk space
> to facilitate 1000 WAL files etc.

You expose yourself to race conditions with such methods if a
checkpoint has the bad idea to recycle past segments that your logic
is copying. So I would advise to not do that. Instead of using the
archive command, you should also consider using pg_receivexlog
combined with a replication slot. This brings way more control with
the error handling.
--
Michael

Re: [GENERAL] Backup strategy using 'wal_keep_segments'

From
Stephen Frost
Date:
Greetings,

* Rhhh Lin (ruanlinehan@hotmail.com) wrote:
> A colleague recently suggested that instead of implementing an 'archive_command' to push archivable WALs to a
secondarylocation (for further backup to tape for example), we could instead persist the WAL files in their current
locationby setting the "wal_keep_segments" parameter to an extreme value e.g. 1000 and have the 'archive_command' do
nothing.

Michael's points are good and I wouldn't recommend using this archive
command either, but what isn't clear to me is what you're actaully
trying to solve by using such a method..?  You haven't said anywhere
what's wrong with archive_command (I know that there certainly are some
things wrong with it, of course, but there are solutions to a number of
those issues that isn't a hack like this ...).

Thanks!

Stephen

Re: [GENERAL] Backup strategy using 'wal_keep_segments'

From
Rhhh Lin
Date:

Hi Stephen,


Thanks for replying.

I would actually be an advocate for using a proper archive_command in order to facilitate a proper (Per the documentation) PITR and backup strategy.


However, a colleague had suggested such a creative approach (Possibly, less administrative overhead, ease of maintenance? I'm not sure) and I struggled to find any evidence online in blogs/white-papers/documentation that this was a feasible approach. That said, I couldn't find any info rejecting it as a method either, which led me to post here. Essentially, this was a difference of opinion on approach, and I was looking to gather information in order to make an informed opposing argument. 


My only thought now would be how could I decipher, within the sequence chain of WAL files, up to which file has the "archival" progressed to. i.e. which files are not susceptible to being called upon again for restartpoints/checkpoints. That is, where is my absolute point (or file) of archival using something along the lines of 'pg_current_xlog_location'. 


Regards,

Ruan


From: Stephen Frost <sfrost@snowman.net>
Sent: 30 October 2017 16:41:11
To: Rhhh Lin
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Backup strategy using 'wal_keep_segments'
 
Greetings,

* Rhhh Lin (ruanlinehan@hotmail.com) wrote:
> A colleague recently suggested that instead of implementing an 'archive_command' to push archivable WALs to a secondary location (for further backup to tape for example), we could instead persist the WAL files in their current location by setting the "wal_keep_segments" parameter to an extreme value e.g. 1000 and have the 'archive_command' do nothing.

Michael's points are good and I wouldn't recommend using this archive
command either, but what isn't clear to me is what you're actaully
trying to solve by using such a method..?  You haven't said anywhere
what's wrong with archive_command (I know that there certainly are some
things wrong with it, of course, but there are solutions to a number of
those issues that isn't a hack like this ...).

Thanks!

Stephen

Re: [GENERAL] Backup strategy using 'wal_keep_segments'

From
Michael Paquier
Date:
On Tue, Oct 31, 2017 at 9:53 AM, Rhhh Lin <ruanlinehan@hotmail.com> wrote:
> I would actually be an advocate for using a proper archive_command in order
> to facilitate a proper (Per the documentation) PITR and backup strategy.

You should avoid using your own fancy archive command. There are
things that WAL-E for this purpose, and even if you do make sure that
the segment data is flushed to disk before sending back 0 as exit code
to the client.

> However, a colleague had suggested such a creative approach (Possibly, less
> administrative overhead, ease of maintenance? I'm not sure) and I struggled
> to find any evidence online in blogs/white-papers/documentation that this
> was a feasible approach. That said, I couldn't find any info rejecting it as
> a method either, which led me to post here. Essentially, this was a
> difference of opinion on approach, and I was looking to gather information
> in order to make an informed opposing argument.

Backup methods should not be fancy, just reliable.

> My only thought now would be how could I decipher, within the sequence chain
> of WAL files, up to which file has the "archival" progressed to. i.e. which
> files are not susceptible to being called upon again for
> restartpoints/checkpoints. That is, where is my absolute point (or file) of
> archival using something along the lines of 'pg_current_xlog_location'.

There is the system view called pg_stat_archiver for that. If you are
using something like pg_receivexlog, pg_stat_replication is the way to
go to monitor the archiving progress.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Backup strategy using 'wal_keep_segments'

From
Stephen Frost
Date:
Greetings,

* Rhhh Lin (ruanlinehan@hotmail.com) wrote:
> I would actually be an advocate for using a proper archive_command in order to facilitate a proper (Per the
documentation)PITR and backup strategy.
 

Glad to hear it.

> However, a colleague had suggested such a creative approach (Possibly, less administrative overhead, ease of
maintenance?I'm not sure) and I struggled to find any evidence online in blogs/white-papers/documentation that this was
afeasible approach. That said, I couldn't find any info rejecting it as a method either, which led me to post here.
Essentially,this was a difference of opinion on approach, and I was looking to gather information in order to make an
informedopposing argument.
 

I'd strongly suggest considering tried-and-true approachs instead of
coming up with something novel, particularly when it comes to backups
and recovery.

Thanks!

Stephen