Re: Cannot change archive_command with a reload - Mailing list pgsql-admin

From CS DBA
Subject Re: Cannot change archive_command with a reload
Msg-id 52EE77BA.4080609@consistentstate.com
In response to Re: Cannot change archive_command with a reload  (CS DBA <cs_dba@consistentstate.com>)
List pgsql-admin
On 2/2/14, 9:26 AM, CS DBA wrote:
On 2/2/14, 9:19 AM, Raghavendra wrote:


On Sun, Feb 2, 2014 at 9:22 PM, CS DBA <cs_dba@consistentstate.com> wrote:
Hi all;

We have a cluster running with an archive_command that is failing.
I tried a reload, but the value does not change.


A reload with a new value won't work until you fix the failing archive_command.

E.g.:

ps -ef | grep arch
postgres 29743 29736  0 Jan30 ?        00:00:00 postgres: archiver process   failed on 00000001000000010000003D

and it might cause the same error you are experiencing:

postgres=# select set_config('archive_command','cp %p /opt/PostgreSQL/9.3/a93/%f',false);
ERROR:  parameter "archive_command" cannot be changed now

So, fix the archive_command first and then apply the new changes.

Tried a set_config and I get this error:

select set_config ('archive_command', 'cp %p /data/wal_tmp/%f && mv /data/wal_tmp/%f /data/wal/&f', 'false');
ERROR:  parameter "archive_command" cannot be changed now


If your archive_command points to "/data/wal_tmp", please ensure that the directory exists. If it is not present, the archiver process fails to copy the transaction logs. While the archiver process is in a failed state, any new change to the archive_command will fail to apply.

Also, I am surprised to see the "mv" command; the archive_command is meant to keep copies of the pg_xlog files, not to perform OS-level directory moves.

We want to copy the file to /data/wal_tmp, then mv it to /data/wal, so that our process that ships a copy to the standby servers never sees a partial file (since it watches /data/wal).
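
The copy-then-rename scheme described above can be sketched as a small shell function. This is only an illustration, not the script we actually run: the function name archive_wal and its four arguments are made up for the example, and the rename is only atomic if the staging and final directories are on the same filesystem.

```shell
# Hypothetical helper: archive_wal SRC NAME TMP_DIR FINAL_DIR
# Stages a WAL segment in TMP_DIR, then renames it into FINAL_DIR so that
# anything watching FINAL_DIR never sees a half-written file.
archive_wal() {
    src=$1
    name=$2
    tmp_dir=$3
    final_dir=$4

    # Refuse to overwrite a segment that has already been archived.
    test ! -f "$final_dir/$name" || return 1

    # Copy into the staging directory first; a partial file can only appear here.
    cp "$src" "$tmp_dir/$name" || return 1

    # Rename into place (atomic only when both directories share a filesystem).
    mv "$tmp_dir/$name" "$final_dir/$name"
}
```

If something like this were saved in a wrapper script, archive_command could pass %p and %f as the first two arguments, with /data/wal_tmp and /data/wal as the last two.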



---
Regards,
Raghavendra
EnterpriseDB Corporation




So, I found that the archive command has a typo (a &f instead of a %f at the end):
cp %p /data/wal_tmp/%f && mv /data/wal_tmp/%f /data/wal/&f
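
A typo like this is easy to miss by eye. As a rough sketch, a check along these lines could flag a lone & directly followed by p or f before the setting goes live. The function name check_archive_command is hypothetical, and the pattern only covers this one class of mistake.

```shell
# Hypothetical pre-flight check: returns 1 if the command string contains a
# lone "&p" or "&f" (a likely mistyped %p or %f), 0 otherwise.
# A doubled "&&" is left alone, as is "&" followed by a longer word.
check_archive_command() {
    if printf '%s\n' "$1" | grep -Eq '(^|[^&])&[pf]($|[^[:alnum:]_])'; then
        echo "suspicious '&' placeholder in: $1" >&2
        return 1
    fi
    return 0
}
```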


I don't see any way to make the command work as is, so I suspect the only option is a restart.

Question: as a growing stack of archive commands fails (see the log entry example below), is a restart going to take longer and longer?

Log entry:
sh: f: command not found
11631    2014-02-02 09:49:41 MST [2014-02-02 01:04:54 MST] [3820] LOG:  archive command failed with exit code 127
11631    2014-02-02 09:49:41 MST [2014-02-02 01:04:54 MST] [3821] DETAIL:  The failed archive command was: test ! -f /data/wal/00000001000006E50000001C && cp pg_xlog/00000001000006E50000001C /data/wal_tmp/00000001000006E50000001C && mv /data/wal_tmp/00000001000006E50000001C /data/wal/&f
11631    2014-02-02 09:49:41 MST [2014-02-02 01:04:54 MST] [3822] WARNING:  transaction log file "00000001000006E50000001C" could not be archived: too many failures
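
For what it's worth, the exit code 127 and the "f: command not found" line are consistent with how the shell parses a bare &: everything before it is run in the background, and the trailing f is treated as a separate command. A harmless way to see this, with true standing in for the real test/cp/mv steps (the helper name run_status is made up, and this assumes no command named f exists on the PATH):

```shell
# Hypothetical helper: run a command string through sh -c, the same way the
# server hands archive_command to the shell, and print the exit status the
# caller would see.
run_status() {
    sh -c "$1" 2>/dev/null
    echo "$?"
}

# "true" stands in for the real steps; the stray "&f" from the typo is kept.
# The list before "&" is backgrounded, so the reported status comes from "f".
run_status 'true && true /data/wal/&f'
```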



