Thread: pg_standby
pg_standby and test framework, in separate .tar files

pg_standby
----------

pg_standby is a production-ready program that can be used to create a Warm Standby server with PostgreSQL. The program is designed to be a wait-for restore_command, required to turn a normal archive recovery into a Warm Standby. Within the restore_command of the recovery.conf you could configure pg_standby in the following way:

    restore_command = 'pg_standby archiveDir %f %p'

$ pg_standby
pg_standby allows Warm Standby servers to be configured
Usage:
  pg_standby [OPTION]... [ARCHIVELOCATION] [NEXTWALFILE] [XLOGFILEPATH]
  note space between [ARCHIVELOCATION] and [NEXTWALFILE]
with main intended use via restore_command in the recovery.conf
  restore_command = 'pg_standby [OPTION]... [ARCHIVELOCATION] %f %p'
e.g. restore_command = 'pg_standby -m /mnt/server/archiverdir %f %p'
Options:
  -d                generate lots of debugging output (testing only)
  -m                moves file rather than copying from archive
  -t [TRIGGERFILE]  defines a trigger file to initiate failover (no default)
  -s [SLEEPTIME]    number of seconds to wait between file checks (default=5)
  -w [MAXWAITTIME]  max number of seconds to wait for a file (0 disables)(default=600)

pg_standby runs standalone and as a restore_command. Tested and working successfully in both modes.

No signal handling - do we need some? Works successfully with shutdown of standby server and via trigger file.

test_warm_standby
-----------------

bash script to run two PostgreSQL servers, one Primary, one Standby - both running on same system. Servers use non-standard port numbers deliberately, to avoid conflicts with other systems.

Designed to be executed from /usr/local/pgsql, nothing too fancy

File contents:

$ tar tf pg_standby.tar
contrib/pg_standby/
contrib/pg_standby/Makefile
contrib/pg_standby/pg_standby.c
contrib/pg_standby/README.pg_standby

allows make, make install, make distclean
intended for submission to core as a contrib module

$ tar tf test_warm_standby.tar
test_warm_standby.primary.postgresql.conf
test_warm_standby.standby.postgresql.conf
test_warm_standby.standby.recovery.conf
test_warm_standby.start.sh
test_warm_standby.stop.sh

needs some discussion, code needs enhancement before commit
maybe implement config changes as edits rather than full scripts

All feedback welcome.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
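For readers unfamiliar with the "wait-for restore_command" idea, here is a minimal bash sketch of the polling behaviour that the -t, -s and -w options describe. It is not the pg_standby source: the function name wait_for_wal and its argument order are assumptions made for illustration, and it only copies (it has no equivalent of -m).

```shell
#!/bin/bash
# Sketch only, NOT the pg_standby source. wait_for_wal is a hypothetical name.
# Args: archive dir, WAL file name (%f), destination path (%p),
#       trigger file ("" = none), sleep seconds, max wait seconds (0 = no limit).
# Returns 0 once the file has been copied, 1 on trigger file or timeout.
wait_for_wal() {
    local archive=$1 wal=$2 dest=$3 trigger=$4
    local sleeptime=${5:-5} maxwait=${6:-600}
    local waited=0
    while true; do
        # The requested WAL file has arrived: restore it and report success.
        if [ -f "$archive/$wal" ]; then
            cp "$archive/$wal" "$dest"
            return $?
        fi
        # A trigger file asks us to stop waiting so that failover can proceed.
        if [ -n "$trigger" ] && [ -f "$trigger" ]; then
            return 1
        fi
        # Give up after maxwait seconds, unless the limit is disabled with 0.
        if [ "$maxwait" -gt 0 ] && [ "$waited" -ge "$maxwait" ]; then
            return 1
        fi
        sleep "$sleeptime"
        waited=$((waited + sleeptime))
    done
}
```

With a max wait of 0 and no trigger file, the loop has no exit other than the file arriving, which is the case the "0 disables" wording refers to.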
Attachment
On Thu, 2006-12-14 at 12:04 +0000, Simon Riggs wrote:
> pg_standby and test framework, in separate .tar files

New version (v2), following further testing.

Signal handling not included in this version.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Attachment
On Thu, 2006-12-28 at 19:26 +0000, Simon Riggs wrote:
> On Thu, 2006-12-14 at 12:04 +0000, Simon Riggs wrote:
> > pg_standby and test framework, in separate .tar files
>
> New version (v2), following further testing.
>
> Signal handling not included in this version.

Signal handling now added, tested and working correctly in version 3, attached.

pg_standby is an example program for a warm standby script as discussed on -hackers:
http://archives.postgresql.org/pgsql-hackers/2006-08/msg00407.php

Program looks complete and ready for review, to me.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Attachment
On 12/28/06, Simon Riggs <simon@2ndquadrant.com> wrote:
> Signal handling now added, tested and working correctly in version 3,
> attached.
>
> pg_standby is an example program for a warm standby script as discussed
> on -hackers:
> http://archives.postgresql.org/pgsql-hackers/2006-08/msg00407.php
>
> Program looks complete and ready for review, to me.

I double checked and re-ran all my tests and confirmed that pg_standby move (-m) mode is definitely busted in v3, in the sense that a restart of the standby will not resume recovery and requires a pg_resetxlog to become operational -- it needs one more WAL file back than the oldest one available.

I am currently working around this by rotating WAL files a couple of versions back in the shell script I am using to receive log files via netcat. Move mode is very desirable because it keeps the maintenance down for the standby system.

merlin
I confirm that I am seeing the exact same characteristic. Could you post your rotating script?
Thanks,
Doug
On Wed, 2007-01-17 at 10:05 -0500, Merlin Moncure wrote:
> I double checked and re-ran all my test and confirmed that pg_standby
> move (-m) mode is definitely busted in v3 in the sense that a restart
> of the standby will not resume recovery and requires a pg_resetxlog to
> become operational -- it needs one more WAL file back than the oldest
> one available.
>
> I am currently working around this by rotating WAL files a couple of
> versions back in the shell script I am using to receive log files via
> netcat. move mode is very desirable because it keeps the maintenance
> down for the standby system.
>
> merlin
On 1/17/07, Doug Knight <dknight@wsi.com> wrote:
> I confirm that I am seeing the exact same characteristic. Could you post
> your rotating script?

Note: this is still a work in progress. The crude but effective sleep 5 is due to be replaced with a lock/fifo, and catch_wal.sh needs to be rewritten a bit. truncate is a C one-liner I wrote which does an ftruncate.

*** primary: ***

    archive_command = '/home/postgres/send_wal.sh %p %f'

*** send_wal.sh: ***

    #!/bin/bash
    # $1 = path of the WAL file (%p), $2 = its file name (%f)
    echo "archiving: $2" >> ~/send_wal.log
    # send the file, then a placeholder line, then the file name as the last line
    cat $1 <(echo "placeholder") <(echo $2) | nc $STANDBY 1234 && sleep 5

*** secondary: ***

    restore_command = 'pg_standby -m -w0 -t/raid/pitr/kill /raid/pitr %f %p'

*** catch_wal.sh ***

    #!/bin/bash
    WALDIR=/raid/pitr
    rm -f $WALDIR/*.old
    rm -f $WALDIR/*.older
    > $WALDIR/tmp.older
    > $WALDIR/tmp.old
    while true; do
        tmpfile=`mktemp`
        nc -l 1234 > $tmpfile || { echo "FATAL: nc listen failed"; exit 1; }
        chown postgres:postgres $tmpfile
        # the file name arrives as the last line; cut the rest back to 16MB
        file_name=`tail -1 $tmpfile`
        ./truncate $tmpfile 16777216
        # age the previous copies: .old becomes .older, the new file becomes .old
        rm -f $WALDIR/*.older
        for i in `ls $WALDIR/*.old`; do mv $i $WALDIR/`basename $i .old`.older; done
        mv $tmpfile $WALDIR/$file_name.old
        cp --preserve=ownership $WALDIR/$file_name.old $WALDIR/$file_name
        echo "LOG: caught file: $file_name"
    done
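The transfer framing in the scripts above (segment bytes, then a placeholder line, then the file name as the last line) can be tried without netcat. The sketch below uses hypothetical helper names frame_wal and unframe_wal, and it swaps the custom ./truncate helper for head -c, so it is an illustration of the idea and not Merlin's code. The placeholder line presumably ensures the file name sits on a line of its own even when the segment does not end in a newline.

```shell
#!/bin/bash
# Sketch of the framing idea only; helper names are hypothetical.
# SEGSIZE defaults to the 16777216 bytes passed to truncate in catch_wal.sh.
SEGSIZE=${SEGSIZE:-16777216}

# Sender side: emit the segment, a placeholder line, then the file name.
frame_wal() {    # frame_wal <path> <name>
    cat "$1"
    echo "placeholder"
    echo "$2"
}

# Receiver side: the last line is the file name and the first SEGSIZE
# bytes are the segment. Prints the recovered file name.
unframe_wal() {  # unframe_wal <received file> <target dir>
    local name
    name=$(tail -n 1 "$1")
    head -c "$SEGSIZE" "$1" > "$2/$name"
    echo "$name"
}
```

On the sending side the output of frame_wal would be piped to nc, and on the receiving side unframe_wal would run on the temporary file that nc -l wrote.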
On 1/17/07, Merlin Moncure <mmoncure@gmail.com> wrote:
> note: this is still a work in progress, the crude but effective sleep
> 5 is due to be replaced with a lock/fifo and there catch_wal.sh needs
> to be rewritten a bit. truncate is a C one-liner I wrote which does a
> ftruncate.

this turned out not to fix the problem...working on it still!

merlin
On 1/17/07, Simon Riggs <simon@2ndquadrant.com> wrote:
> new v4
>
> Changes
> - removed -m command, design flaw in original spec, use -l instead
> - added -k N command to cleanup archive and leave max N files
> - fflush() points added to allow Windows debug
> - bug fix: when .history file present
> - bug fix: command line switch cleanup
> - readme updated

Works fantastic. Grazie... I guess my rotation would have worked with more files, but -k is much cleaner.

merlin
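To make the "leave max N files" idea concrete, here is a rough bash sketch. It is not how pg_standby implements -k: prune_archive is a hypothetical helper, it assumes file names sort in the order the files were written, and treating 0 as "no cleanup" is also an assumption.

```shell
#!/bin/bash
# Sketch only; prune_archive is a hypothetical helper, not pg_standby's -k logic.
# Keeps the last <keep> files in <archive dir>, judged by file name order.
prune_archive() {  # prune_archive <archive dir> <keep>
    local dir=$1 keep=$2 total excess f
    # Assumption: a keep value of 0 means "do not clean up at all".
    if [ "$keep" -le 0 ]; then
        return 0
    fi
    total=$(ls "$dir" | wc -l)
    excess=$((total - keep))
    if [ "$excess" -le 0 ]; then
        return 0
    fi
    # Remove the lowest-sorting names first; these are the oldest files
    # under the naming assumption stated above.
    ls "$dir" | LC_ALL=C sort | head -n "$excess" | while read -r f; do
        rm -f "$dir/$f"
    done
}
```

A call such as prune_archive /raid/pitr 3 would leave the three highest-sorting files in the archive directory used earlier in the thread.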
On Wed, 2007-01-17 at 16:15 +0000, Simon Riggs wrote:
> new v4
>
> Changes
> - removed -m command, design flaw in original spec, use -l instead
> - added -k N command to cleanup archive and leave max N files
> - fflush() points added to allow Windows debug
> - bug fix: when .history file present
> - bug fix: command line switch cleanup
> - readme updated

new v6 (v5 was Windows dev release)

Changes
- added -r option to specify maxretries
- -l option for Windows Vista (only) using mklink
- Windows examples and docs added to readme
- code restructured to allow more easy customization
- bug fix: -k 0 error fixed

- successful port report from Dave Page on Windows XP

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Attachment
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews and approves it.

---------------------------------------------------------------------------

Simon Riggs wrote:
> new v6 (v5 was Windows dev release)
>
> Changes
> - added -r option to specify maxretries
> - -l option for Windows Vista (only) using mklink
> - Windows examples and docs added to readme
> - code restructured to allow more easy customization
> - bug fix: -k 0 error fixed
>
> - successful port report from Dave Page on Windows XP
>
> [ Attachment, skipping... ]

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Hi Simon,
Quick question on the -w option; setting it to zero "disables", do you mean it waits until the file appears or a trigger file appears, or it just doesn't wait at all?
Doug Knight
WSI Inc
Andover, MA
On Mon, 2007-01-22 at 13:06 +0000, Simon Riggs wrote:
> new v6 (v5 was Windows dev release)
>
> Changes
> - added -r option to specify maxretries
> - -l option for Windows Vista (only) using mklink
> - Windows examples and docs added to readme
> - code restructured to allow more easy customization
> - bug fix: -k 0 error fixed
>
> - successful port report from Dave Page on Windows XP
On Thu, 2007-02-01 at 15:14 -0500, Doug Knight wrote:
> Quick question on the -w option; setting it to zero "disables", do you
> mean it waits until the file appears or a trigger file appears, or it
> just doesn't wait at all?

It means it waits forever, or until a trigger file appears - but a trigger file is optional, so it's possible to create an awkward situation.

I'm not happy with that default, but feedback from Merlin suggested production problems with people not understanding that. I'm happy to change to whatever consensus is, so if you think that's dumb, just shout.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Not at all, in fact I was planning on using the infinite wait, and using something like heartbeat to force creation of the trigger file in the event the primary dies. Thanks Simon!
Doug
On Fri, 2007-02-02 at 14:38 +0000, Simon Riggs wrote:
> On Thu, 2007-02-01 at 15:14 -0500, Doug Knight wrote:
> > Quick question on the -w option; setting it to zero "disables", do you
> > mean it waits until the file appears or a trigger file appears, or it
> > just doesn't wait at all?
>
> It means it waits forever, or until a trigger file appears - but a
> trigger file is optional, so its possible to create an awkward
> situation.
>
> I'm not happy with that default, but feedback from Merlin suggested
> production problems with people not understanding that. I'm happy to
> change to whatever consensus is, so if you think that's dumb, just
> shout.
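The heartbeat idea, creating the trigger file when the primary stops responding so that an infinite wait can end, might look roughly like the bash sketch below. Everything here is an assumption for illustration: watch_primary is a hypothetical helper, and the liveness check is whatever command the caller supplies.

```shell
#!/bin/bash
# Sketch only; watch_primary and its arguments are hypothetical.
# Runs <check command> every <interval> seconds and creates <trigger file>
# after <max failures> consecutive failures of that command.
watch_primary() {  # watch_primary <trigger file> <max failures> <interval> <check command...>
    local trigger=$1 maxfail=$2 interval=$3 failures=0
    shift 3
    while [ "$failures" -lt "$maxfail" ]; do
        if "$@"; then
            failures=0                   # primary answered, reset the count
        else
            failures=$((failures + 1))   # one more consecutive miss
        fi
        if [ "$failures" -lt "$maxfail" ]; then
            sleep "$interval"
        fi
    done
    # Enough consecutive misses: request failover via the trigger file.
    touch "$trigger"
}
```

For example, watch_primary /raid/pitr/kill 3 10 ping -c 1 primary.example.com would create the trigger file after three failed pings in a row; the host name is a placeholder, and a production setup would need a more trustworthy liveness check than a single ping.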
Patch applied. Thanks.

---------------------------------------------------------------------------

Simon Riggs wrote:
> new v6 (v5 was Windows dev release)
>
> Changes
> - added -r option to specify maxretries
> - -l option for Windows Vista (only) using mklink
> - Windows examples and docs added to readme
> - code restructured to allow more easy customization
> - bug fix: -k 0 error fixed
>
> - successful port report from Dave Page on Windows XP
>
> [ Attachment, skipping... ]

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
At the risk of starting trouble, is there some reason this was added to contrib and not put on pgfoundry?

On Thursday 08 February 2007 10:09, Bruce Momjian wrote:
> Patch applied. Thanks.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Robert Treat wrote:
> At the risk of starting trouble, is there some reason this was added to
> contrib and not put on pgfoundry ?

I thought the idea was that it was integral to using PITR, but might change so it was put in /contrib.

> On Thursday 08 February 2007 10:09, Bruce Momjian wrote:
> > Patch applied. Thanks.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +