Thread: How to perform PITR when all of the logs won't fit on the drive

How to perform PITR when all of the logs won't fit on the drive

From
"Tony Sullivan"
Date:
Hello,

I have a situation where something was deleted from a database that
shouldn't have been so I am having to take a base backup and perform a
point-in-time-recovery. The problem I have is that the decompressed WAL
files will not fit on the drive of the machine I am trying to do the
restore on.

I am wondering if I can arrange the WAL files by date and copy them to the
directory where they belong and then copy another batch when those are
restored or if I will need to find some other way of performing the
recovery.





Re: How to perform PITR when all of the logs won't fit on the drive

From
Alan Hodgson
Date:
On Thu, 2018-03-01 at 17:28 -0500, Tony Sullivan wrote:
Hello, I have a situation where something was deleted from a database that shouldn't have been so I am having to take a base backup and perform a point-in-time-recovery. The problem I have is that the decompressed WAL files will not fit on the drive of the machine I am trying to do the restore on.

Your restore_command can be a complex command or shell script that can transfer WAL files as needed from a network source.

Re: How to perform PITR when all of the logs won't fit on the drive

From
Jeff Janes
Date:
On Thu, Mar 1, 2018 at 2:28 PM, Tony Sullivan <tony@exquisiteimages.com> wrote:
Hello,

I have a situation where something was deleted from a database that
shouldn't have been so I am having to take a base backup and perform a
point-in-time-recovery. The problem I have is that the decompressed WAL
files will not fit on the drive of the machine I am trying to do the
restore on.

I am wondering if I can arrange the WAL files by date and copy them to the
directory where they belong and then copy another batch when those are
restored or if I will need to find some other way of performing the
recovery.


You could do that if your timing is right.  You need to either make sure the next batch shows up before the first file in that batch is requested, or have you restore command wait and retry rather than throw an error when it asks for a file that does not exist.  But your restore command can copy them from a network drive, or remotely with scp or rsync, and also decompress them on the fly.  That seems simpler.     

Another problem you might run into is that the restored WAL records are retained in pg_xlog for two restart points before being removed, and until fairly recent versions of PostgreSQL restart points were governed only by checkpoint_timeout, and not by WAL volume.  So if your system restores far faster than it took to generate the WAL in the first place, this could lead to massive amounts of WAL kept in pg_xlog running you out of disk space.  So you should lower checkpoint_timeout for recovery to be much less than it was in production.

Cheers,

Jeff