Thread: PITR Recovery Question
Hi, My production server is running PostgreSQL v8.2.3 on CentOS release 5.2 (Final). I've setup PITR in my production server. For some reason, after setting up PITR, we're not able to manage and maintain it. Because of this our WAL archive drive become full (100% use) approximately after 1 month. PITR SETUP DETAILS We've 2 drives. Primary drive (pgsql/data/ directory resides) is 400 GB and secondary drive (WAL archive) is 30 GB. All WAL archives are written to secondary drive. Base backup taken on: Aug03, 2009 WAL archive drive become full (100% use) on: Sep05, 2009 Because this WAL archive drive has become full, all WAL archive segments to be archived are accumulated into pg_xlog/ directory itself. Eventually, 9 months (as of today from Sep05, 2009) of WAL archives are residing in pg_xlog/ directory. My question is, in case if I would like to perform recovery process as it is in this situation, will this work out? That is, I'm seeing/finding out whether recovery process would perform successfully anywhere between the date range Aug03, 2009 (my base backup date) and as of today - Jun03, 2009. Reason I'm asking this is still all my WAL archives are residing in pg_xlog/ directory. Experts advice/idea/suggestion on this appreciated. Regards, Gnanam
Hi, I'll try to answer your questions below, but in the future please post questions concerning the usage and administrationof postgres to pgsql-general or pgsql-admin. This list focus is the development of new features and bugfixes. On Jun 3, 2010, at 15:37 , Gnanakumar wrote: > PITR SETUP DETAILS > We've 2 drives. Primary drive (pgsql/data/ directory resides) is 400 GB and > secondary drive (WAL archive) is 30 GB. All WAL archives are written to > secondary drive. > > Base backup taken on: Aug03, 2009 > WAL archive drive become full (100% use) on: Sep05, 2009 > > Because this WAL archive drive has become full, all WAL archive segments to > be archived are accumulated into pg_xlog/ directory itself. Eventually, 9 > months (as of today from Sep05, 2009) of WAL archives are residing in > pg_xlog/ directory. This is by design. WAL logs are only removed from pg_xlog once they have been archived successfully. Since your archive_commandfails due to the disk being full, they remain in pg_xlog. Once you enlarge the filesystem holding the WALarchive they should be copied and subsequently removed from pg_xlog. Note that you'd usually take a new base backup once in a while to limit the number of WAL segments you need to retain. Youcan take a base backup while postgres is running by issuing pg_start_backup() before you start the copy and pg_stop_backup()after it finished. Apart from creating additional IO load, doing so won't interfere with normal query executionin any way. > My question is, in case if I would like to perform recovery process as it is > in this situation, will this work out? That is, I'm seeing/finding out > whether recovery process would perform successfully anywhere between the > date range Aug03, 2009 (my base backup date) and as of today - Jun03, 2009. > Reason I'm asking this is still all my WAL archives are residing in pg_xlog/ > directory. For PITR, you'll obviously need the WAL segment starting from the time your base backup started up until the point you wantto recover to. If some of those WAL segments still reside in pg_xlog, you'll either need to teach your restore_commandto fetch them from there. Note that you cannot recover "in reverse". To recover up to a certain point in timeyou always need to start from a base backup taken *before* that time. best regards, Florian Pflug
Hi Florian, Sure. I'll post my future questions either in pgsql-general or pgsql-admin. > If some of those WAL segments still reside in pg_xlog, you'll either need to teach your restore_command to fetch them from there. Note that you cannot recover "in reverse". My pg_xlog/ and walarchive/ directory locations are "/usr/local/pgsql/data/pg_xlog" and "/mnt/pitr/walarchive" respectively. If my normal restore command is: restore_command='cp "/mnt/pitr/walarchive/%f "%p"', how should I instruct restore command to fetch? Should I just replace this with something like restore_command='cp /usr/local/pgsql/data/pg_xlog/%f "%p"'. Also you have mentioned that we cannot recover "in reverse", what I understand from this is that even though if I replace the restore command pointing to pg_xlog/ directory, this will not work out in this situation? Is my understanding right? -----Original Message----- From: Florian Pflug [mailto:fgp@phlo.org] Sent: Thursday, June 03, 2010 8:50 PM To: gnanam@zoniac.com Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] PITR Recovery Question Hi, I'll try to answer your questions below, but in the future please post questions concerning the usage and administration of postgres to pgsql-general or pgsql-admin. This list focus is the development of new features and bugfixes. On Jun 3, 2010, at 15:37 , Gnanakumar wrote: > PITR SETUP DETAILS > We've 2 drives. Primary drive (pgsql/data/ directory resides) is 400 GB and > secondary drive (WAL archive) is 30 GB. All WAL archives are written to > secondary drive. > > Base backup taken on: Aug03, 2009 > WAL archive drive become full (100% use) on: Sep05, 2009 > > Because this WAL archive drive has become full, all WAL archive segments to > be archived are accumulated into pg_xlog/ directory itself. Eventually, 9 > months (as of today from Sep05, 2009) of WAL archives are residing in > pg_xlog/ directory. This is by design. WAL logs are only removed from pg_xlog once they have been archived successfully. Since your archive_command fails due to the disk being full, they remain in pg_xlog. Once you enlarge the filesystem holding the WAL archive they should be copied and subsequently removed from pg_xlog. Note that you'd usually take a new base backup once in a while to limit the number of WAL segments you need to retain. You can take a base backup while postgres is running by issuing pg_start_backup() before you start the copy and pg_stop_backup() after it finished. Apart from creating additional IO load, doing so won't interfere with normal query execution in any way. > My question is, in case if I would like to perform recovery process as it is > in this situation, will this work out? That is, I'm seeing/finding out > whether recovery process would perform successfully anywhere between the > date range Aug03, 2009 (my base backup date) and as of today - Jun03, 2009. > Reason I'm asking this is still all my WAL archives are residing in pg_xlog/ > directory. For PITR, you'll obviously need the WAL segment starting from the time your base backup started up until the point you want to recover to. If some of those WAL segments still reside in pg_xlog, you'll either need to teach your restore_command to fetch them from there. Note that you cannot recover "in reverse". To recover up to a certain point in time you always need to start from a base backup taken *before* that time. best regards, Florian Pflug
On Jun 4, 2010, at 7:05 , Gnanakumar wrote: >> If some of those WAL segments still reside in pg_xlog, you'll either need > to teach your restore_command to fetch them from there. Note that you cannot > recover "in reverse". > > My pg_xlog/ and walarchive/ directory locations are > "/usr/local/pgsql/data/pg_xlog" and "/mnt/pitr/walarchive" respectively. > > If my normal restore command is: restore_command='cp > "/mnt/pitr/walarchive/%f "%p"', how should I instruct restore command to > fetch? Should I just replace this with something like restore_command='cp > /usr/local/pgsql/data/pg_xlog/%f "%p"'. Also you have mentioned that we > cannot recover "in reverse", what I understand from this is that even though > if I replace the restore command pointing to pg_xlog/ directory, this will > not work out in this situation? Is my understanding right? If you point it at a cluster's own pg_xlog directory, it won't work. You might want to re-ead the section on the recovery process in the PTITR documentation, at http://www.postgresql.org/docs/8.4/interactive/continuous-archiving.html#BACKUP-PITR-RECOVERY If you have further questions, please take this discussion to pgsql-general. best regards, Florian Pflug