Thread: restore_command ignored in recovery.conf on standby

restore_command ignored in recovery.conf on standby

From
Rene Romero Benavides
Date:
- Base backup taken with 9.2.6 (via pg_basebackup command)
- binaries updated to 9.2.8
- set up the base backup to replicate from the master and archives, and started
- the restore_command option is ignored, with the following message:

2014-04-13 21:07:21.386 CDT,,,22055,,534b42d7.5627,4,,2014-04-13 21:07:19 CDT,1/0,0,LOG,00000,"consistent recovery state reached at 1E6/F9FFE880",,,,,,,,"CheckRecoveryConsistency, xlog.c:7371",""
2014-04-13 21:07:21.387 CDT,,,22053,,534b42d6.5625,1,,2014-04-13 21:07:18 CDT,,0,LOG,00000,"database system is ready to accept read only connections",,,,,,,,"sigusr1_handler, postmaster.c:4261",""

# recovery.conf

standby_mode=on
restore_command='/bin/tar -xzf /db/wal_archives/%f.tar.gz -C %p'

where the /db/wal_archives/ looks like this:
00000001000001ED000000F7.tar.gz
00000001000001ED000000F8.tar.gz
00000001000001ED000000F9.tar.gz

As you can see, the archived WAL files (1ED/F7 onward) are far ahead of the location where the standby claims to have reached a consistent recovery state (1E6/F9FFE880).

I tested the restore_command replacing variables and it works. Any ideas on why it isn't being executed?


--
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/

Re: restore_command ignored in recovery.conf on standby

From
Stephen Frost
Date:
Rene,

* Rene Romero Benavides (rene.romero.b@gmail.com) wrote:
> restore_command='/bin/tar -xzf /db/wal_archives/%f.tar.gz -C %p'
[...]
> I tested the restore_command replacing variables and it works. Any ideas on
> why it isn't being executed?

Are you sure that it isn't being executed and just immediately returning
'1' (meaning 'false', i.e. done with recovery)?

The -C option to tar is supposed to be "change directory" according to
the tar that I've got, and %p is the complete file name that PG wants
the WAL file to be copied to. It's not a directory (it's something like
pg_xlog/RECOVERY_WAL).
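
To make the point above concrete, here is a minimal sketch that runs the same shape of command against a throwaway directory. The values of f and p are hypothetical stand-ins for what PostgreSQL would substitute for %f and %p, and the scratch layout is invented for the demonstration.

```shell
#!/bin/bash
# Hypothetical stand-ins for the values PostgreSQL substitutes at run time.
f='00000001000001ED000000F7'   # %f: name of the WAL file being requested
p='pg_xlog/RECOVERY_WAL'       # %p: destination file path, relative to the data directory

# Throwaway layout that imitates a data directory plus a tar.gz archive.
workdir=$(mktemp -d)
mkdir "$workdir/pg_xlog" "$workdir/archive"
printf 'dummy WAL contents\n' > "$workdir/archive/$f"
tar -czf "$workdir/archive/$f.tar.gz" -C "$workdir/archive" "$f"

# Same shape as the original restore_command: -C is handed a file path,
# not an existing directory, so tar cannot change into it.
(cd "$workdir" && tar -xzf "archive/$f.tar.gz" -C "$p" 2>/dev/null)
status=$?
echo "tar exit status: $status"
```

A non-zero status here is what the server would see from the restore_command, so the command is being run but reports failure every time.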

    Thanks,

        Stephen


Re: restore_command ignored in recovery.conf on standby

From
Rene Romero Benavides
Date:
Yep, I checked:

[postgres@uxmal standby_node]$ /bin/tar -xzf /db/wal_archives/00000001000001ED000000FB.tar.gz -C /db/standby_node/pg_xlog/
[postgres@uxmal standby_node]$ echo $?
0
[postgres@uxmal standby_node]$ ls /db/standby_node/pg_xlog/ | grep 00000001000001ED000000FB
00000001000001ED000000FB

I read somewhere that, for the extracted file to be placed at a custom location, you have to use the -C option.

I'll try rewriting the command and debug it. Thanks for your comment.



2014-04-13 21:39 GMT-05:00 Stephen Frost <sfrost@snowman.net>:
[...]

Re: restore_command ignored in recovery.conf on standby

From
Stephen Frost
Date:
* Rene Romero Benavides (rene.romero.b@gmail.com) wrote:
> Yep, I checked:
>
> [postgres@uxmal standby_node]$ /bin/tar -xzf
> /db/wal_archives/00000001000001ED000000FB.tar.gz -C
> /db/standby_node/pg_xlog/
> [postgres@uxmal standby_node]$ echo $?
> 0

Err, sure, but that isn't actually what is being passed via %p.  %p will
be something like 'pg_xlog/RECOVERY_WAL', as I said, which *won't* work
for your tar command, e.g.:

sfrost@tamriel:/home/sfrost> tar -xzf zz.tar.gz -C zz/zz
tar: zz/zz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
sfrost@tamriel:/home/sfrost> echo $?
2

> [postgres@uxmal standby_node]$ ls /db/standby_node/pg_xlog/ | grep
> 00000001000001ED000000FB
> 00000001000001ED000000FB

Noooo, PG tells you via %p the *specific* filename to use; do not just
overwrite files in pg_xlog willy-nilly with a tar command.

    Thanks,

        Stephen


Re: restore_command ignored in recovery.conf on standby

From
Rene Romero Benavides
Date:
What I did (I bet there's a better way) is this:
restore_command='/db/standby_node/scripts/wal_restore.sh %f %p'

# wal_restore.sh
#!/bin/bash
/bin/tar -xzf /db/wal_archives/$1.tar.gz -C /tmp
cp /tmp/$1 $2
rm /tmp/$1


My best regards to Stephen Frost.


2014-04-13 21:58 GMT-05:00 Stephen Frost <sfrost@snowman.net>:
[...]

Re: restore_command ignored in recovery.conf on standby

From
Stephen Frost
Date:
Rene,

* Rene Romero Benavides (rene.romero.b@gmail.com) wrote:
> What I did (I bet there's a better way) is this:
> restore_command='/db/standby_node/scripts/wal_restore.sh %f %p'
>
> # wal_restore.sh
> #!/bin/bash
> /bin/tar -xzf /db/wal_archives/$1.tar.gz -C /tmp
> cp /tmp/$1 $2
> rm /tmp/$1

You'll probably want to be more careful here: this script could exit
with 'success' (meaning zero) even if some of the above commands fail.
When writing reliable shell scripts, you really need to check the exit
status of each command.  Note that you can return a high value (>128,
iirc) from your shell script to indicate 'permanent' failure while
trying to do WAL recovery, and PG will give up and stop trying.
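
As a rough sketch of what checking each step could look like, the script below keeps the extract-then-copy approach but stops at the first failure. The function name restore_wal and the ARCHIVE_DIR variable are made up for illustration, and it assumes each archive holds a single member named exactly like the WAL file.

```shell
#!/bin/bash
# Sketch of a more defensive wal_restore.sh, called as: wal_restore.sh %f %p
# ARCHIVE_DIR is an assumed variable; the thread uses /db/wal_archives.
ARCHIVE_DIR="${ARCHIVE_DIR:-/db/wal_archives}"

restore_wal() {
    local wal_name="$1"
    local target="$2"
    local archive="$ARCHIVE_DIR/$wal_name.tar.gz"
    local scratch

    # A missing archive is reported as an ordinary failure.
    [ -f "$archive" ] || return 1

    # Private scratch directory instead of a fixed file name under /tmp.
    scratch=$(mktemp -d) || return 1

    # cp runs only if tar succeeded; success is reported only if both did.
    if tar -xzf "$archive" -C "$scratch" && cp "$scratch/$wal_name" "$target"; then
        rm -rf "$scratch"
        return 0
    fi

    rm -rf "$scratch"
    return 1
}

# Act as a script only when given the two arguments from restore_command.
if [ "$#" -eq 2 ]; then
    restore_wal "$1" "$2"
    exit $?
fi
```

The function returns non-zero as soon as tar or cp fails, so a leftover or half-written file can never be reported to the server as a successful restore.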

Is there any particular reason you're tar'ing up the WAL files in the
first place..?  It'd surely be easier if you simply gzip'd them and then
used something like 'zcat /path/to/wal/archive/%f.gz > %p'.
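
The gzip approach can be tried out without a server. The sketch below uses a scratch directory and made-up names (work, archive_dir, wal, target) in place of the real archive and of the %f and %p values, and it uses gunzip -c, which does the same job as zcat for .gz files.

```shell
#!/bin/bash
# Stand-ins for the real locations; every name below is hypothetical.
work=$(mktemp -d)
archive_dir="$work/archive"
mkdir -p "$archive_dir" "$work/pg_xlog"

wal='00000001000001ED000000F7'          # what %f would expand to
target="$work/pg_xlog/RECOVERY_WAL"     # what %p would expand to

# Archiving side: one gzip file per WAL segment, no tar wrapper.
printf 'pretend WAL segment\n' > "$work/$wal"
gzip -c "$work/$wal" > "$archive_dir/$wal.gz"

# Restore side: decompress straight into the exact path that was asked for.
gunzip -c "$archive_dir/$wal.gz" > "$target"

# Confirm the restored file is byte-for-byte identical to the original.
cmp "$work/$wal" "$target" && echo "restored copy matches"
```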

The other option, if you really want to keep them tar'd, would be to use
tar's -O option, eg:

tar -O -zxf /db/wal_archives/%f.tar.gz %f > %p

There is also a --transform option that you could pass to tar to change
the filenames.
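
Here is a small sketch of the -O variant on scratch files, to show that the extracted data can be written under any file name. The directory layout and variable names are invented for the example, and the archive is assumed to contain one member named exactly like the WAL file.

```shell
#!/bin/bash
# Scratch layout; all names are hypothetical.
work=$(mktemp -d)
mkdir "$work/archive" "$work/pg_xlog"

wal='00000001000001ED000000F7'          # what %f would expand to
target="$work/pg_xlog/RECOVERY_WAL"     # what %p would expand to

# Build a tar.gz holding a single member named after the WAL file.
printf 'pretend WAL segment\n' > "$work/archive/$wal"
tar -czf "$work/archive/$wal.tar.gz" -C "$work/archive" "$wal"
rm "$work/archive/$wal"

# -O sends the member's contents to stdout, so the shell redirection
# decides the destination name rather than the name stored in the archive.
tar -O -xzf "$work/archive/$wal.tar.gz" "$wal" > "$target"

cat "$target"
```

Because nothing is extracted by name, no file called after the WAL segment ever appears in pg_xlog; only the path given after the redirection is written.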

> My best regards to Stephen Frost.

Thanks!

    Stephen


Re: restore_command ignored in recovery.conf on standby

From
Jeff Janes
Date:


On Apr 13, 2014 7:30 PM, "Rene Romero Benavides" <rene.romero.b@gmail.com> wrote:
>
> - Base backup taken with 9.2.6 (via pg_basebackup command)
> - binaries updated to 9.2.8
> - set up the base backup to replicate from the master and archives, and started
> - the restore_command option is ignored, with the following message:
>
> 2014-04-13 21:07:21.386 CDT,,,22055,,534b42d7.5627,4,,2014-04-13 21:07:19 CDT,1/0,0,LOG,00000,"consistent recovery state reached at 1E6/F9FFE880",,,,,,,,"CheckRecoveryConsistency, xlog.c:7371",""
> 2014-04-13 21:07:21.387 CDT,,,22053,,534b42d6.5625,1,,2014-04-13 21:07:18 CDT,,0,LOG,00000,"database system is ready to accept read only connections",,,,,,,,"sigusr1_handler, postmaster.c:4261",""
>

Are you sure there is actually a problem? "Ready to accept read-only connections" doesn't mean recovery has ended.
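
One way to check this from the shell, as a sketch: ask the standby whether it is still in recovery and how far it has replayed. The function name standby_status is made up, and the snippet assumes psql can reach the standby through the usual PG* environment variables. pg_last_xlog_replay_location() is the 9.x name of the function.

```shell
#!/bin/bash
# Prints two columns: whether the server is still in recovery (t/f)
# and the last WAL location that has been replayed so far.
# Connection details are assumed to come from PGHOST, PGPORT, PGUSER, etc.
standby_status() {
    psql -At -c "SELECT pg_is_in_recovery(), pg_last_xlog_replay_location();"
}

# Example use against a running standby: run it twice, a little apart,
# and compare the replay locations.
#   standby_status; sleep 10; standby_status
```

If the first column stays t and the replay location keeps advancing between calls, recovery is still running and WAL is being applied.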

Cheers,

Jeff