It is a script, although I am not the author, just a lonely DBA with weird problems :)
But it does not appear to be messing with any WAL files in the xlog or xlog/archive_status dirs. I guess I am just confused about why it's complaining about a file that is not there, yet everything seems to be working as it should.
On Fri, Jan 9, 2015 at 12:50 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
"Mathis, Jason" <jmathis@enova.com> wrote:
> On Fri, Jan 9, 2015 at 11:58 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
>> "Mathis, Jason" <jmathis@enova.com> wrote:
>>
>>> I am getting some weird archiving failed messages in the logs
>>> but nothing is failing. I *think* its just exiting with a non
>>> zero code from the script.
>>
>> The archive command (script or not) must exit with 0 if
>> successful. A nonzero exit code indicates failure.
>
> But if it was failing then I would be having a stack of *.ready
> files right? In fact I should have a
> "0000000100002ED0000000CB.ready" file but I don't. It weird man,
> so weird.
Well, you didn't show this script. Is it perhaps directly messing with the *.ready files instead of leaving them to PostgreSQL to manage? If it's doing one thing wrong (i.e., using a nonzero exit code after copying to the archive directory) perhaps it's doing something else wrong (like removing the copied WAL file or messing with the *.ready files). If you mess with internals that you shouldn't be touching, you can expect weird.
Perhaps the script was initially only getting one of these things wrong and it was causing failures, so the author went further down the rabbit hole in an attempt to get it sorta working?
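For comparison, here is a minimal sketch of what a well-behaved archive script might look like. It is not the script discussed in this thread, which was never posted. The names archive_wal and main, and the idea of passing the archive directory as a second argument, are assumptions made for illustration. The sketch only copies the segment, returns 0 on success and nonzero on any failure, refuses to overwrite a file that is already archived, and never touches the *.ready files.

```python
import os
import shutil


def archive_wal(src_path, archive_dir):
    """Copy one WAL segment into archive_dir and return an exit code.

    0 means the segment is safely archived; anything else means failure,
    in which case PostgreSQL keeps the segment and retries later.
    """
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    if os.path.exists(dest):
        # Never overwrite an existing archive file; report failure instead.
        return 1
    tmp = dest + ".tmp"
    try:
        # Copy under a temporary name, then rename, so a partial copy
        # never appears under the final WAL file name.
        shutil.copyfile(src_path, tmp)
        os.rename(tmp, dest)
    except OSError:
        if os.path.exists(tmp):
            os.remove(tmp)
        return 1
    return 0


def main(argv):
    """argv is expected to be [script, wal_path, archive_dir]."""
    if len(argv) != 3:
        return 2
    return archive_wal(argv[1], argv[2])

# When saved as a script, finish with:
#   import sys; sys.exit(main(sys.argv))
```

Assuming it is saved under some path of your choosing, it could be wired up with something like archive_command = 'python3 /path/to/archive_wal.py %p /path/to/archive', where %p is the path PostgreSQL supplies for the segment to archive and both paths are placeholders. The point is that the exit code is the only signal PostgreSQL gets, so a copy that succeeds but exits nonzero is logged as a failed archive attempt.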