Thread: PITR Recovery Question

PITR Recovery Question

From
"Gnanakumar"
Date:
Hi,

My production server is running PostgreSQL v8.2.3 on CentOS release 5.2
(Final).

I've setup PITR in my production server.  For some reason, after setting up
PITR, we're not able to manage and maintain it.  Because of this our WAL
archive drive become full (100% use) approximately after 1 month.

PITR SETUP DETAILS
We've 2 drives.  Primary drive (pgsql/data/ directory resides) is 400 GB and
secondary drive (WAL archive) is 30 GB.  All WAL archives are written to
secondary drive.

Base backup taken on: Aug03, 2009
WAL archive drive become full (100% use) on: Sep05, 2009

Because this WAL archive drive has become full, all WAL archive segments to
be archived are accumulated into pg_xlog/ directory itself.  Eventually, 9
months (as of today from Sep05, 2009) of WAL archives are residing in
pg_xlog/ directory.

My question is, in case if I would like to perform recovery process as it is
in this situation, will this work out?  That is, I'm seeing/finding out
whether recovery process would perform successfully anywhere between the
date range Aug03, 2009 (my base backup date) and as of today - Jun03, 2009.
Reason I'm asking this is still all my WAL archives are residing in pg_xlog/
directory.

Experts advice/idea/suggestion on this appreciated.

Regards,
Gnanam



Re: PITR Recovery Question

From
Florian Pflug
Date:
Hi,

I'll try to answer your questions below, but in the future please post questions concerning the usage and
administrationof postgres to pgsql-general or pgsql-admin. This list focus is the development of new features and
bugfixes.

On Jun 3, 2010, at 15:37 , Gnanakumar wrote:
> PITR SETUP DETAILS
> We've 2 drives.  Primary drive (pgsql/data/ directory resides) is 400 GB and
> secondary drive (WAL archive) is 30 GB.  All WAL archives are written to
> secondary drive.
>
> Base backup taken on: Aug03, 2009
> WAL archive drive become full (100% use) on: Sep05, 2009
>
> Because this WAL archive drive has become full, all WAL archive segments to
> be archived are accumulated into pg_xlog/ directory itself.  Eventually, 9
> months (as of today from Sep05, 2009) of WAL archives are residing in
> pg_xlog/ directory.

This is by design. WAL logs are only removed from pg_xlog once they have been archived successfully. Since your
archive_commandfails due to the disk being full, they remain in pg_xlog. Once you enlarge the filesystem holding the
WALarchive they should be copied and subsequently removed from pg_xlog. 

Note that you'd usually take a new base backup once in a while to limit the number of WAL segments you need to retain.
Youcan take a base backup while postgres is running by issuing pg_start_backup() before you start the copy and
pg_stop_backup()after it finished. Apart from creating additional IO load, doing so won't interfere with normal query
executionin any way. 

> My question is, in case if I would like to perform recovery process as it is
> in this situation, will this work out?  That is, I'm seeing/finding out
> whether recovery process would perform successfully anywhere between the
> date range Aug03, 2009 (my base backup date) and as of today - Jun03, 2009.
> Reason I'm asking this is still all my WAL archives are residing in pg_xlog/
> directory.

For PITR, you'll obviously need the WAL segment starting from the time your base backup started up until the point you
wantto recover to. If some of those WAL segments still reside in pg_xlog, you'll either need to teach your
restore_commandto fetch them from there. Note that you cannot recover "in reverse". To recover up to a certain point in
timeyou always need to start from a base backup taken *before* that time. 

best regards,
Florian Pflug




Re: PITR Recovery Question

From
"Gnanakumar"
Date:
Hi Florian,

Sure.  I'll post my future questions either in pgsql-general or pgsql-admin.

> If some of those WAL segments still reside in pg_xlog, you'll either need
to teach your restore_command to fetch them from there. Note that you cannot
recover "in reverse".

My pg_xlog/ and walarchive/ directory locations are
"/usr/local/pgsql/data/pg_xlog" and "/mnt/pitr/walarchive" respectively.

If my normal restore command is: restore_command='cp
"/mnt/pitr/walarchive/%f "%p"', how should I instruct restore command to
fetch?  Should I just replace this with something like restore_command='cp
/usr/local/pgsql/data/pg_xlog/%f "%p"'.  Also you have mentioned that we
cannot recover "in reverse", what I understand from this is that even though
if I replace the restore command pointing to pg_xlog/ directory, this will
not work out in this situation?  Is my understanding right?

-----Original Message-----
From: Florian Pflug [mailto:fgp@phlo.org] 
Sent: Thursday, June 03, 2010 8:50 PM
To: gnanam@zoniac.com
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] PITR Recovery Question

Hi,

I'll try to answer your questions below, but in the future please post
questions concerning the usage and administration of postgres to
pgsql-general or pgsql-admin. This list focus is the development of new
features and bugfixes.

On Jun 3, 2010, at 15:37 , Gnanakumar wrote:
> PITR SETUP DETAILS
> We've 2 drives.  Primary drive (pgsql/data/ directory resides) is 400 GB
and
> secondary drive (WAL archive) is 30 GB.  All WAL archives are written to
> secondary drive.
> 
> Base backup taken on: Aug03, 2009
> WAL archive drive become full (100% use) on: Sep05, 2009
> 
> Because this WAL archive drive has become full, all WAL archive segments
to
> be archived are accumulated into pg_xlog/ directory itself.  Eventually, 9
> months (as of today from Sep05, 2009) of WAL archives are residing in
> pg_xlog/ directory.

This is by design. WAL logs are only removed from pg_xlog once they have
been archived successfully. Since your archive_command fails due to the disk
being full, they remain in pg_xlog. Once you enlarge the filesystem holding
the WAL archive they should be copied and subsequently removed from pg_xlog.

Note that you'd usually take a new base backup once in a while to limit the
number of WAL segments you need to retain. You can take a base backup while
postgres is running by issuing pg_start_backup() before you start the copy
and pg_stop_backup() after it finished. Apart from creating additional IO
load, doing so won't interfere with normal query execution in any way.

> My question is, in case if I would like to perform recovery process as it
is
> in this situation, will this work out?  That is, I'm seeing/finding out
> whether recovery process would perform successfully anywhere between the
> date range Aug03, 2009 (my base backup date) and as of today - Jun03,
2009.
> Reason I'm asking this is still all my WAL archives are residing in
pg_xlog/
> directory.

For PITR, you'll obviously need the WAL segment starting from the time your
base backup started up until the point you want to recover to. If some of
those WAL segments still reside in pg_xlog, you'll either need to teach your
restore_command to fetch them from there. Note that you cannot recover "in
reverse". To recover up to a certain point in time you always need to start
from a base backup taken *before* that time.

best regards,
Florian Pflug





Re: PITR Recovery Question

From
Florian Pflug
Date:
On Jun 4, 2010, at 7:05 , Gnanakumar wrote:
>> If some of those WAL segments still reside in pg_xlog, you'll either need
> to teach your restore_command to fetch them from there. Note that you cannot
> recover "in reverse".
>
> My pg_xlog/ and walarchive/ directory locations are
> "/usr/local/pgsql/data/pg_xlog" and "/mnt/pitr/walarchive" respectively.
>
> If my normal restore command is: restore_command='cp
> "/mnt/pitr/walarchive/%f "%p"', how should I instruct restore command to
> fetch?  Should I just replace this with something like restore_command='cp
> /usr/local/pgsql/data/pg_xlog/%f "%p"'.  Also you have mentioned that we
> cannot recover "in reverse", what I understand from this is that even though
> if I replace the restore command pointing to pg_xlog/ directory, this will
> not work out in this situation?  Is my understanding right?

If you point it at a cluster's own pg_xlog directory, it won't work.

You might want to re-ead the section on the recovery process in the PTITR documentation, at
http://www.postgresql.org/docs/8.4/interactive/continuous-archiving.html#BACKUP-PITR-RECOVERY

If you have further questions, please take this discussion to pgsql-general.

best regards,
Florian Pflug