Hi Kevin,
> It is generally unsafe to delete any WAL files from pg_xlog. If
> they are there because your archive command has been failing, you
> need to turn off archiving or (probably more convenient) allow the
> archive script to return success until things clear. One trick
> would be to temporarily change your archive_command to 'true',
> delete all files from your archive, and then change the command
> back. Doing that without exposing yourself to a period where you
> have no backup might be tricky, though.
I'm trying to understand your point, but I couldn't follow this
particular step: "One trick would be to temporarily change your
archive_command to 'true', delete all files from your archive, and then
change the command back". Could you please clarify and explain this?
When you say *temporarily changing archive_command to 'true'*, do you mean
enabling or disabling WAL archiving? Per the documentation, "If this is an
empty string (the default), WAL archiving is disabled." And when you say
"change the command back", I understood it as *disabling* archiving. Is my
understanding correct?
> If the only problem with the archive command is that the archive fs
> is full, I would copy the contents of the archive directory to tape
> or whatever medium you have for long-term storage, delete the
> contents, and let archive succeed. The pg_xlog directory will
> eventually clear, and then I would get a fresh PITR base backup
> (following all the documented steps for doing so). You really want
> to see WAL files flowing to your archive location before you start
> the process of getting a new base backup.
Yes, I should probably proceed as you suggested above, that is, allow the
archive script to run successfully until things have completely
cleared.
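To watch the backlog drain and to confirm that the archive drive has room
again, I am thinking of something like the rough Python sketch below. It
assumes that segments still waiting to be archived show up as .ready files
under pg_xlog/archive_status; the function names and the paths in the usage
comment are my own placeholders, so please treat it as an illustration
rather than a tested tool.

```python
import os
import shutil


def pending_wal_segments(pg_xlog_dir):
    """Return the sorted names of WAL segments still waiting to be archived.

    Assumption: each segment awaiting archiving has a matching
    '<segment>.ready' file in the archive_status subdirectory.
    """
    status_dir = os.path.join(pg_xlog_dir, "archive_status")
    suffix = ".ready"
    return sorted(
        name[: -len(suffix)]
        for name in os.listdir(status_dir)
        if name.endswith(suffix)
    )


def archive_free_bytes(archive_dir):
    """Return the free space, in bytes, on the filesystem holding archive_dir."""
    return shutil.disk_usage(archive_dir).free


# Hypothetical usage; both paths are placeholders for my own setup:
#   backlog = pending_wal_segments("/path/to/data/pg_xlog")
#   print(len(backlog), "segments pending,",
#         archive_free_bytes("/path/to/wal_archive"), "bytes free")
```

If the pending count keeps dropping between runs, I would take that as the
sign that WAL files are flowing to the archive again before starting the
new base backup.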
> If there's some other reason that the archive command has been
> failing, what is it?
No other reason. It was failing only because my WAL archive drive was full.
Regards,
Gnanam