Thread: PITR
All,
I am testing PITR…. I am looking for recovery.conf parameters where you can recovery the WALs available in the restore_command but do not complete recovery. I want to be able to connect to the database and check database in read only and if I am not there yet, I will feed more WALs in the archive directory and resume recovery. I would like to prevent multiple base restorations. I want to roll forward with WALs but check in between. Also, I do not want to set up replication standby.
Is it possible? If so, could you tell me what are the relevant recovery.conf params?
Thanks,
Murthy
All,
I am testing PITR…. I am looking for recovery.conf parameters where you can recovery the WALs available in the restore_command but do not complete recovery. I want to be able to connect to the database and check database in read only and if I am not there yet, I will feed more WALs in the archive directory and resume recovery. I would like to prevent multiple base restorations. I want to roll forward with WALs but check in between. Also, I do not want to set up replication standby.
Is it possible? If so, could you tell me what are the relevant recovery.conf params?
Thanks,
Murthy
look at parameter "pause_at_recovery_target" and "recovery_target_time" with these parameters in the recovery.conf you could reach a point in the timeline to recover the database and then see if it's all ok and then finish the recovery with pg_xlog_replay_resume. If you want instead continue the recovery you have to stop the backend and change the parameter of recovery_target_time in the recovery.conf and restart the postmaster.
Mat,
Thank you for your quick response….
The documentation says for pause_at_recovery_target:
This setting has no effect if hot_standby is not enabled, or if no recovery target is set.
In my case hot_standby is not enabled.
Thanks,
Murthy
From: desmodemone [mailto:desmodemone@gmail.com]
Sent: Saturday, February 22, 2014 10:34 AM
To: Murthy Nunna
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] PITR
2014-02-22 17:06 GMT+01:00 Murthy Nunna <mnunna@fnal.gov>:
All,
I am testing PITR…. I am looking for recovery.conf parameters where you can recovery the WALs available in the restore_command but do not complete recovery. I want to be able to connect to the database and check database in read only and if I am not there yet, I will feed more WALs in the archive directory and resume recovery. I would like to prevent multiple base restorations. I want to roll forward with WALs but check in between. Also, I do not want to set up replication standby.
Is it possible? If so, could you tell me what are the relevant recovery.conf params?
Thanks,
Murthy
Hi Murthy,
look at parameter "pause_at_recovery_target" and "recovery_target_time" with these parameters in the recovery.conf you could reach a point in the timeline to recover the database and then see if it's all ok and then finish the recovery with pg_xlog_replay_resume. If you want instead continue the recovery you have to stop the backend and change the parameter of recovery_target_time in the recovery.conf and restart the postmaster.
Mat DBA
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Le 22/02/2014 18:03, Murthy Nunna a écrit : > Mat, > > > > Thank you for your quick response…. > > > > The documentation says for pause_at_recovery_target: > > > > This setting has no effect if hot_standby > <http://www.postgresql.org/docs/9.2/static/runtime-config-replication.html#GUC-HOT-STANDBY> > is not enabled, or if no recovery target is set. > > > > In my case hot_standby is not enabled. > > Hi, If you want to connect to your database in read only to check the recovery point, you have to enable it. It doesn't have any impact when the cluster is not in recovery mode. Regards. > > Thanks, > > Murthy > > > > > > > > > > *From:*desmodemone [mailto:desmodemone@gmail.com] *Sent:* Saturday, > February 22, 2014 10:34 AM *To:* Murthy Nunna *Cc:* > pgsql-admin@postgresql.org *Subject:* Re: [ADMIN] PITR > > > > > > > > 2014-02-22 17:06 GMT+01:00 Murthy Nunna <mnunna@fnal.gov > <mailto:mnunna@fnal.gov>>: > > All, > > > > I am testing PITR…. I am looking for recovery.conf parameters where > you can recovery the WALs available in the restore_command but do > not complete recovery. I want to be able to connect to the database > and check database in read only and if I am not there yet, I will > feed more WALs in the archive directory and resume recovery. I > would like to prevent multiple base restorations. I want to roll > forward with WALs but check in between. Also, I do not want to set > up replication standby. > > > > Is it possible? If so, could you tell me what are the relevant > recovery.conf params? > > > > Thanks, > > Murthy > > > > Hi Murthy, look at parameter "pause_at_recovery_target" and > "recovery_target_time" with these parameters in the recovery.conf > you could reach a point in the timeline to recover the database and > then see if it's all ok and then finish the recovery with > pg_xlog_replay_resume. If you want instead continue the recovery > you have to stop the backend and change the parameter of > recovery_target_time in the recovery.conf and restart the > postmaster. > > Mat DBA > > > - -- Julien Rouhaud http://dalibo.com - http://dalibo.org -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJTCN8GAAoJELGaJ8vfEpOqynAH/1Vu2gUwdDww26qbbDreXRid kpWGUEI2YXBw4e6D3SFiFDG77aPwFF7aGDXs/3Rjck3erLaYrtj/70ZFEwXJ2d4K W2hDS8KFt8cx6YsNMI2epL/FnDYZKudU2Qcceixlzf2gGSDeexyy/ZLdTOQMgXZF D4ktuEOdpIxUDipbWe7af2TVzaiTqHEu64RtcmtgPWIlwJHdtShEYFo6Go0BryY9 ToU+8DU45x4tuH6msk94D/7/NQJnkMpPNztxAXXZsDp4Xwtrq3nA20187y69NVu8 uWSpJVFzfPZIV0Cl6vupyw2k6G9ObvTBWcpSPrixVjgcpiHh+15j49z+OTfSB1I= =FR3m -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Le 22/02/2014 18:03, Murthy Nunna a écrit :> Mat,> <http://www.postgresql.org/docs/9.2/static/runtime-config-replication.html#GUC-HOT-STANDBY>
>
>
>
> Thank you for your quick response….
>
>
>
> The documentation says for pause_at_recovery_target:
>
>
>
> This setting has no effect if hot_standby> is not enabled, or if no recovery target is set.Hi,
>
>
>
> In my case hot_standby is not enabled.
>
>
If you want to connect to your database in read only to check the
recovery point, you have to enable it. It doesn't have any impact when
the cluster is not in recovery mode.
Regards.
>
> Thanks,
>
> Murthy
>
>
>
>
>
>
>
>
>
> *From:*desmodemone [mailto:desmodemone@gmail.com] *Sent:* Saturday,
> February 22, 2014 10:34 AM *To:* Murthy Nunna *Cc:*
> pgsql-admin@postgresql.org *Subject:* Re: [ADMIN] PITR
> <mailto:mnunna@fnal.gov>>:>- --
> All,
>
>
>
> I am testing PITR…. I am looking for recovery.conf parameters where
> you can recovery the WALs available in the restore_command but do
> not complete recovery. I want to be able to connect to the database
> and check database in read only and if I am not there yet, I will
> feed more WALs in the archive directory and resume recovery. I
> would like to prevent multiple base restorations. I want to roll
> forward with WALs but check in between. Also, I do not want to set
> up replication standby.
>
>
>
> Is it possible? If so, could you tell me what are the relevant
> recovery.conf params?
>
>
>
> Thanks,
>
> Murthy
>
>
>
> Hi Murthy, look at parameter "pause_at_recovery_target" and
> "recovery_target_time" with these parameters in the recovery.conf
> you could reach a point in the timeline to recover the database and
> then see if it's all ok and then finish the recovery with
> pg_xlog_replay_resume. If you want instead continue the recovery
> you have to stop the backend and change the parameter of
> recovery_target_time in the recovery.conf and restart the
> postmaster.
>
> Mat DBA
>
>
>
Julien Rouhaud
http://dalibo.com - http://dalibo.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iQEcBAEBAgAGBQJTCN8GAAoJELGaJ8vfEpOqynAH/1Vu2gUwdDww26qbbDreXRid
kpWGUEI2YXBw4e6D3SFiFDG77aPwFF7aGDXs/3Rjck3erLaYrtj/70ZFEwXJ2d4K
W2hDS8KFt8cx6YsNMI2epL/FnDYZKudU2Qcceixlzf2gGSDeexyy/ZLdTOQMgXZF
D4ktuEOdpIxUDipbWe7af2TVzaiTqHEu64RtcmtgPWIlwJHdtShEYFo6Go0BryY9
ToU+8DU45x4tuH6msk94D/7/NQJnkMpPNztxAXXZsDp4Xwtrq3nA20187y69NVu8
uWSpJVFzfPZIV0Cl6vupyw2k6G9ObvTBWcpSPrixVjgcpiHh+15j49z+OTfSB1I=
=FR3m
-----END PGP SIGNATURE-----
--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Hi Mat,
Thank you for the pointers on pause_at_recovery_target and recovery_target_time. It worked but I encountered an unexpected situation.
I wanted to test three recovery times by checking data at each point and then proceed to the next. It worked as expected first 2 recovery times but the last one did not give me an opportunity to check data. It simply completed recovery and switched timeline. This means I cannot rollforward anymore unless I restore the database again. Do you think I did something wrong?
Thanks,
Murthy
Following are my recovery.conf settings:
pause_at_recovery_target = true
#recovery_target_time = '2014-02-22 19:15:00'
#recovery_target_time = '2014-02-22 19:44:00'
recovery_target_time = '2014-02-22 19:50:00'
restore_command = 'cp /pgdata/backups/xlogs/minerva_ecl_test/%f %p'
Following is the pg_log of my 3rd recovery:
,2014-02-22 22:04:16 CSTLOG: database system was shut down in recovery at 2014-02-22 22:02:42 CST
,2014-02-22 22:04:16 CSTLOG: restored log file "00000004.history" from archive
,2014-02-22 22:04:16 CSTLOG: starting point-in-time recovery to 2014-02-22 19:50:00-06
,2014-02-22 22:04:16 CSTLOG: restored log file "000000040000000800000004" from archive
,2014-02-22 22:04:16 CSTLOG: redo starts at 8/4000020
,2014-02-22 22:04:16 CSTLOG: restored log file "000000040000000800000005" from archive
,2014-02-22 22:04:16 CSTLOG: consistent recovery state reached at 8/5F51EA8
,2014-02-22 22:04:16 CSTLOG: database system is ready to accept read only connections
,2014-02-22 22:04:16 CSTLOG: restored log file "000000040000000800000006" from archive
,2014-02-22 22:04:16 CSTLOG: restored log file "000000040000000800000007" from archive
,2014-02-22 22:04:16 CSTLOG: restored log file "000000040000000800000008" from archive
,2014-02-22 22:04:16 CSTLOG: restored log file "000000040000000800000009" from archive
,2014-02-22 22:04:16 CSTLOG: restored log file "00000004000000080000000A" from archive
,2014-02-22 22:04:16 CSTLOG: restored log file "00000004000000080000000B" from archive
,2014-02-22 22:04:16 CSTLOG: restored log file "00000004000000080000000C" from archive
cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/00000004000000080000000D': No such file or directory
,2014-02-22 22:04:16 CSTLOG: unexpected pageaddr 7/BC000000 in log file 8, segment 13, offset 0
,2014-02-22 22:04:16 CSTLOG: redo done at 8/C0000B8
,2014-02-22 22:04:16 CSTLOG: last completed transaction was at log time 2014-02-22 19:48:29.205898-06
,2014-02-22 22:04:16 CSTLOG: restored log file "00000004000000080000000C" from archive
cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/00000005.history': No such file or directory
cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/00000006.history': No such file or directory
cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/00000007.history': No such file or directory
cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/00000008.history': No such file or directory
cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/00000009.history': No such file or directory
cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/0000000A.history': No such file or directory
,2014-02-22 22:04:16 CSTLOG: selected new timeline ID: 10
,2014-02-22 22:04:16 CSTLOG: restored log file "00000004.history" from archive
,2014-02-22 22:04:17 CSTLOG: archive recovery complete
,2014-02-22 22:04:17 CSTLOG: database system is ready to accept connections
,2014-02-22 22:04:17 CSTLOG: autovacuum launcher started
From: desmodemone [mailto:desmodemone@gmail.com]
Sent: Saturday, February 22, 2014 10:34 AM
To: Murthy Nunna
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] PITR
2014-02-22 17:06 GMT+01:00 Murthy Nunna <mnunna@fnal.gov>:
All,
I am testing PITR…. I am looking for recovery.conf parameters where you can recovery the WALs available in the restore_command but do not complete recovery. I want to be able to connect to the database and check database in read only and if I am not there yet, I will feed more WALs in the archive directory and resume recovery. I would like to prevent multiple base restorations. I want to roll forward with WALs but check in between. Also, I do not want to set up replication standby.
Is it possible? If so, could you tell me what are the relevant recovery.conf params?
Thanks,
Murthy
Hi Murthy,
look at parameter "pause_at_recovery_target" and "recovery_target_time" with these parameters in the recovery.conf you could reach a point in the timeline to recover the database and then see if it's all ok and then finish the recovery with pg_xlog_replay_resume. If you want instead continue the recovery you have to stop the backend and change the parameter of recovery_target_time in the recovery.conf and restart the postmaster.
Mat DBA
Hi Mat,
Thank you for the pointers on pause_at_recovery_target and recovery_target_time. It worked but I encountered an unexpected situation.
I wanted to test three recovery times by checking data at each point and then proceed to the next. It worked as expected first 2 recovery times but the last one did not give me an opportunity to check data. It simply completed recovery and switched timeline. This means I cannot rollforward anymore unless I restore the database again. Do you think I did something wrong?
,2014-02-22 22:04:16 CSTLOG: restored log file "00000004000000080000000C" from archive
cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/00000004000000080000000D': No such file or directory
Hi Mat,
Thank you for the pointers on pause_at_recovery_target and recovery_target_time. It worked but I encountered an unexpected situation.
I wanted to test three recovery times by checking data at each point and then proceed to the next. It worked as expected first 2 recovery times but the last one did not give me an opportunity to check data. It simply completed recovery and switched timeline. This means I cannot rollforward anymore unless I restore the database again. Do you think I did something wrong?
You did right, for the first two time targets you have some wal files pending hence pause is allowed. Whereas for the last time target there are no archive files to pause so it has just opened the database as completion of PITR.I am able to reproduce this on my local as well.This is from my logs:2014-02-06 18:54:54 PST-11634---[] LOG: restored log file "0000000100000001000000B8" from archive2014-02-06 18:54:55 PST-11634---[] LOG: restored log file "0000000100000001000000B9" from archive2014-02-06 18:54:55 PST-11634---[] LOG: consistent recovery state reached at 1/B90029C82014-02-06 18:54:55 PST-11632---[] LOG: database system is ready to accept read only connectionscp: cannot stat `/opt/PostgreSQL/9.3/archives93/0000000100000001000000BA': No such file or directoryB8,B9 applied and looking for next file BA which is not there and no point of pausing without any files, hence it has opened the database. But my guess is, what it has done is right, there no more files to pause and allow you to query, its like completion of PITR.Same in your case, there may not be any files in your archives directory. You can check for '00000004000000080000000D' file in your archive directory.,2014-02-22 22:04:16 CSTLOG: restored log file "00000004000000080000000C" from archive
cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/00000004000000080000000D': No such file or directory
Raghavendra is right, the recovery applied successful your archived wal segment to the database and ended the loop in the xlog.c when it not found anymore records in the wal segments. Moreover the backend is telling you that the recovery phase ended at "2014-02-22 19:48:29.205898-06" while your target timeline was "2014-02-22 19:50:00" and it's interesting.
Raghavendra,
Thanks for testing and confirming the behavior of “pause” setting.
While I understand your explanation, I feel I am still missing something. IMHO, when I say pause using “pause” setting, no matter what, I expect the recovery to wait for manual intervention. I myself can come up with number of reasons for doing so… e.g I may be purposely “hiding” some WALs somewhere else, or maybe I have several thousands of WALs that I want to parallelize the process of applying some logs while I recall some from tapes.
Let me know what you think.
Mat, This is a test database so I purposely lowered checkpoint/archive timeouts. Thank-you, I’ll follow your advice for production systems to conserve space.
Thanks,
Murthy
From: desmodemone [mailto:desmodemone@gmail.com]
Sent: Sunday, February 23, 2014 4:13 AM
To: Raghavendra
Cc: Murthy Nunna; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] PITR
2014-02-23 7:08 GMT+01:00 Raghavendra <raghavendra.rao@enterprisedb.com>:
On Sun, Feb 23, 2014 at 9:57 AM, Murthy Nunna <mnunna@fnal.gov> wrote:
Hi Mat,
Thank you for the pointers on pause_at_recovery_target and recovery_target_time. It worked but I encountered an unexpected situation.
I wanted to test three recovery times by checking data at each point and then proceed to the next. It worked as expected first 2 recovery times but the last one did not give me an opportunity to check data. It simply completed recovery and switched timeline. This means I cannot rollforward anymore unless I restore the database again. Do you think I did something wrong?
You did right, for the first two time targets you have some wal files pending hence pause is allowed. Whereas for the last time target there are no archive files to pause so it has just opened the database as completion of PITR.
I am able to reproduce this on my local as well.
This is from my logs:
2014-02-06 18:54:54 PST-11634---[] LOG: restored log file "0000000100000001000000B8" from archive
2014-02-06 18:54:55 PST-11634---[] LOG: restored log file "0000000100000001000000B9" from archive
2014-02-06 18:54:55 PST-11634---[] LOG: consistent recovery state reached at 1/B90029C8
2014-02-06 18:54:55 PST-11632---[] LOG: database system is ready to accept read only connections
cp: cannot stat `/opt/PostgreSQL/9.3/archives93/0000000100000001000000BA': No such file or directory
B8,B9 applied and looking for next file BA which is not there and no point of pausing without any files, hence it has opened the database. But my guess is, what it has done is right, there no more files to pause and allow you to query, its like completion of PITR.
Same in your case, there may not be any files in your archives directory. You can check for '00000004000000080000000D' file in your archive directory.
,2014-02-22 22:04:16 CSTLOG: restored log file "00000004000000080000000C" from archive
cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/00000004000000080000000D': No such file or directory
Raghavendra is right, the recovery applied successful your archived wal segment to the database and ended the loop in the xlog.c when it not found anymore records in the wal segments. Moreover the backend is telling you that the recovery phase ended at "2014-02-22 19:48:29.205898-06" while your target timeline was "2014-02-22 19:50:00" and it's interesting.
Did you create transactions and you committed them at that time or after ?
Normally, when you have to do a restore, probably you had a crash and your filesystem is not ok, so you use a new filesystem on another storage. So your last transactions are "lost", because those transactions was not still archived, infact those transactions were in the last wal segment of the crashed filesystem. If that file is not OK, you could not use it for complete the recovery.
If you have a RPO policy , you have to look at parameter archive_timeout , so you are sure your wal segments will be archived, and if you do that and your database does not create so much transactions , think about to compress them or use data deduplication at storage level or filesystem level , or you will have a lot of wasted space ( the wal segment even if is not full will write 16Mb of <data>+<00000..000>)
Have a nice day
Mat Dba
While I understand your explanation, I feel I am still missing something. IMHO, when I say pause using “pause” setting, no matter what, I expect the recovery to wait for manual intervention.
Raghavendra,
Thanks for testing and confirming the behavior of “pause” setting.
While I understand your explanation, I feel I am still missing something. IMHO, when I say pause using “pause” setting, no matter what, I expect the recovery to wait for manual intervention.
I myself can come up with number of reasons for doing so… e.g I may be purposely “hiding” some WALs somewhere else, or maybe I have several thousands of WALs that I want to parallelize the process of applying some logs while I recall some from tapes.
Let me know what you think.
Hi Raghavendra,
I used standby_mode=on and it worked. I can put checkpoints (not database checkpoint J) in between and still be in recovery state. This is what I wanted.
Thanks for your help!
Murthy
From: Raghavendra [mailto:raghavendra.rao@enterprisedb.com]
Sent: Monday, February 24, 2014 6:31 AM
To: Murthy Nunna
Cc: desmodemone; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] PITR
On Sun, Feb 23, 2014 at 8:50 PM, Murthy Nunna <mnunna@fnal.gov> wrote:
Raghavendra,
Thanks for testing and confirming the behavior of “pause” setting.
While I understand your explanation, I feel I am still missing something. IMHO, when I say pause using “pause” setting, no matter what, I expect the recovery to wait for manual intervention.
I very much agree with your point that it has to pause when you ask for it, however, as per design (some other might comment on this well) am guessing it will open the database if no wals are there though you intentionally hide them.
You can use (HOT STANDBY) standby_mode=on which does the same thing, it just waits for the WAL files but it won't open the database until you pass the trigger file. In hot standby, it apply the existing wals fed and wait for coming wals and it won't come out of recovery. This you can try with below link.
I myself can come up with number of reasons for doing so… e.g I may be purposely “hiding” some WALs somewhere else, or maybe I have several thousands of WALs that I want to parallelize the process of applying some logs while I recall some from tapes.
Let me know what you think.
Agreed it might be possible of not having wals at the moment and waiting for them to copy, however, I prefer in that case to use hot_standby. Pause just works in case if it sees some pending file in Arch.. location.
My explanation might not reach to your expectation, but I am sure few other's here might share their inputs.
--Raghav
All,
My database is in continuous recovery but it is not a traditional standby setup…. I am manually copying WALs (several thousands) from the source server to the server I am recovering. This means recovery is being done while WALs are being copied. It could so happen that a particular WAL could be open for recovery while it is still being copied (and not closed). Is this ok? I am asking because, I am seeing lots of LOG messages “unexpected pageaddr” as below.
I can do this differently if this method is not supported. I don’t want to end up with a corrupted database.
Thanks in advance for your advice.
2014-02-25 23:24:27 CST []LOG: unexpected pageaddr 15D1/6E000000 in log file 5587, segment 22, offset 0
cp: cannot stat `/data1/pg_archlogs/ifb_prd/00000003000015D300000016': No such file or directory
2014-02-25 23:24:31 CST []LOG: restored log file "00000003000015D300000016" from archive
2014-02-25 23:24:31 CST []LOG: restored log file "00000003000015D300000017" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D300000018" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D300000019" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D30000001A" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D30000001B" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D30000001C" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D30000001D" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D30000001E" from archive
2014-02-25 23:24:33 CST []LOG: unexpected pageaddr 15D1/77000000 in log file 5587, segment 31, offset 0
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D30000001F" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000020" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000021" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000022" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000023" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000024" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000025" from archive
2014-02-25 23:24:37 CST []LOG: unexpected pageaddr 15D1/7E000000 in log file 5587, segment 38, offset 0
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D300000026" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D300000027" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D300000028" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D300000029" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D30000002A" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D30000002B" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D30000002C" from archive
2014-02-25 23:24:42 CST []LOG: restored log file "00000003000015D30000002D" from archive
2014-02-25 23:24:42 CST []LOG: restored log file "00000003000015D30000002E" from archive
2014-02-25 23:24:42 CST []LOG: unexpected pageaddr 15D1/26000000 in log file 5587, segment 47, offset 0
From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Murthy Nunna
Sent: Monday, February 24, 2014 12:14 PM
To: Raghavendra
Cc: desmodemone; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] PITR
Hi Raghavendra,
I used standby_mode=on and it worked. I can put checkpoints (not database checkpoint J) in between and still be in recovery state. This is what I wanted.
Thanks for your help!
Murthy
From: Raghavendra [mailto:raghavendra.rao@enterprisedb.com]
Sent: Monday, February 24, 2014 6:31 AM
To: Murthy Nunna
Cc: desmodemone; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] PITR
On Sun, Feb 23, 2014 at 8:50 PM, Murthy Nunna <mnunna@fnal.gov> wrote:
Raghavendra,
Thanks for testing and confirming the behavior of “pause” setting.
While I understand your explanation, I feel I am still missing something. IMHO, when I say pause using “pause” setting, no matter what, I expect the recovery to wait for manual intervention.
I very much agree with your point that it has to pause when you ask for it, however, as per design (some other might comment on this well) am guessing it will open the database if no wals are there though you intentionally hide them.
You can use (HOT STANDBY) standby_mode=on which does the same thing, it just waits for the WAL files but it won't open the database until you pass the trigger file. In hot standby, it apply the existing wals fed and wait for coming wals and it won't come out of recovery. This you can try with below link.
I myself can come up with number of reasons for doing so… e.g I may be purposely “hiding” some WALs somewhere else, or maybe I have several thousands of WALs that I want to parallelize the process of applying some logs while I recall some from tapes.
Let me know what you think.
Agreed it might be possible of not having wals at the moment and waiting for them to copy, however, I prefer in that case to use hot_standby. Pause just works in case if it sees some pending file in Arch.. location.
My explanation might not reach to your expectation, but I am sure few other's here might share their inputs.
--Raghav
Could someone please comment on this? Thanks.
From: Murthy Nunna
Sent: Tuesday, February 25, 2014 11:33 PM
To: Murthy Nunna; Raghavendra
Cc: desmodemone; pgsql-admin@postgresql.org
Subject: RE: [ADMIN] PITR
All,
My database is in continuous recovery but it is not a traditional standby setup…. I am manually copying WALs (several thousands) from the source server to the server I am recovering. This means recovery is being done while WALs are being copied. It could so happen that a particular WAL could be open for recovery while it is still being copied (and not closed). Is this ok? I am asking because, I am seeing lots of LOG messages “unexpected pageaddr” as below.
I can do this differently if this method is not supported. I don’t want to end up with a corrupted database.
Thanks in advance for your advice.
2014-02-25 23:24:27 CST []LOG: unexpected pageaddr 15D1/6E000000 in log file 5587, segment 22, offset 0
cp: cannot stat `/data1/pg_archlogs/ifb_prd/00000003000015D300000016': No such file or directory
2014-02-25 23:24:31 CST []LOG: restored log file "00000003000015D300000016" from archive
2014-02-25 23:24:31 CST []LOG: restored log file "00000003000015D300000017" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D300000018" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D300000019" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D30000001A" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D30000001B" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D30000001C" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D30000001D" from archive
2014-02-25 23:24:32 CST []LOG: restored log file "00000003000015D30000001E" from archive
2014-02-25 23:24:33 CST []LOG: unexpected pageaddr 15D1/77000000 in log file 5587, segment 31, offset 0
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D30000001F" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000020" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000021" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000022" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000023" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000024" from archive
2014-02-25 23:24:36 CST []LOG: restored log file "00000003000015D300000025" from archive
2014-02-25 23:24:37 CST []LOG: unexpected pageaddr 15D1/7E000000 in log file 5587, segment 38, offset 0
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D300000026" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D300000027" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D300000028" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D300000029" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D30000002A" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D30000002B" from archive
2014-02-25 23:24:41 CST []LOG: restored log file "00000003000015D30000002C" from archive
2014-02-25 23:24:42 CST []LOG: restored log file "00000003000015D30000002D" from archive
2014-02-25 23:24:42 CST []LOG: restored log file "00000003000015D30000002E" from archive
2014-02-25 23:24:42 CST []LOG: unexpected pageaddr 15D1/26000000 in log file 5587, segment 47, offset 0
From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Murthy Nunna
Sent: Monday, February 24, 2014 12:14 PM
To: Raghavendra
Cc: desmodemone; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] PITR
Hi Raghavendra,
I used standby_mode=on and it worked. I can put checkpoints (not database checkpoint J) in between and still be in recovery state. This is what I wanted.
Thanks for your help!
Murthy
From: Raghavendra [mailto:raghavendra.rao@enterprisedb.com]
Sent: Monday, February 24, 2014 6:31 AM
To: Murthy Nunna
Cc: desmodemone; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] PITR
On Sun, Feb 23, 2014 at 8:50 PM, Murthy Nunna <mnunna@fnal.gov> wrote:
Raghavendra,
Thanks for testing and confirming the behavior of “pause” setting.
While I understand your explanation, I feel I am still missing something. IMHO, when I say pause using “pause” setting, no matter what, I expect the recovery to wait for manual intervention.
I very much agree with your point that it has to pause when you ask for it, however, as per design (some other might comment on this well) am guessing it will open the database if no wals are there though you intentionally hide them.
You can use (HOT STANDBY) standby_mode=on which does the same thing, it just waits for the WAL files but it won't open the database until you pass the trigger file. In hot standby, it apply the existing wals fed and wait for coming wals and it won't come out of recovery. This you can try with below link.
I myself can come up with number of reasons for doing so… e.g I may be purposely “hiding” some WALs somewhere else, or maybe I have several thousands of WALs that I want to parallelize the process of applying some logs while I recall some from tapes.
Let me know what you think.
Agreed it might be possible of not having wals at the moment and waiting for them to copy, however, I prefer in that case to use hot_standby. Pause just works in case if it sees some pending file in Arch.. location.
My explanation might not reach to your expectation, but I am sure few other's here might share their inputs.
--Raghav
Could someone please comment on this? Thanks.