Actually, the affair had one good side: as usual, I checked my own
designs for flaws first, and indeed I found one.
If you copy the archive logs not directly to tape but to some disk
area for further processing, there is a risk of losing a log.
Suppose you do it as the docs suggest, with a command like
this:
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f' # Unix
That "cp" is usually not synchronous. So it is possible that the
command terminates successfully and reports exit code zero back to
Postgres, and Postgres will then consider that log safely
archived.
But the target of the copy may not yet have been written to disk. If
a power loss happens at that point, the log may end up missing, damaged
or incomplete, while the database may or may not consider it done
when restarting.
Therefore, mounting such a target filesystem in all-synchronous mode
might be a good idea. (UFS: mount with "-o sync", ZFS: "zfs set
sync=always" on the dataset)
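
An alternative to a fully synchronous mount might be to let the archive
step itself force the data to disk before it reports success. What
follows is only a rough sketch of that idea, not something taken from
the docs: the script name archive_wal.py and the function archive_wal()
are made up for illustration, and it assumes a Unix system where
fsync() on a directory descriptor is supported. It copies to a temporary
name, fsyncs the file, renames it into place, and then fsyncs the
directory so the rename itself is durable.

```python
#!/usr/bin/env python3
"""Hypothetical archive helper: copy a WAL segment and force it to disk."""
import os
import shutil
import sys


def archive_wal(src, dst):
    """Copy src to dst durably. Return 0 on success, 1 if dst exists."""
    if os.path.exists(dst):
        return 1  # same rule as 'test ! -f': never overwrite an archive file

    tmp = dst + ".tmp"  # assumed temporary name, not a Postgres convention
    with open(src, "rb") as fin, open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()             # push Python's buffer to the kernel
        os.fsync(fout.fileno())  # ask the kernel to write the data to disk

    os.rename(tmp, dst)  # atomic within one filesystem

    # fsync the containing directory so the new name survives a crash
    dirfd = os.open(os.path.dirname(os.path.abspath(dst)), os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
    return 0


# Only act as a command when arguments were actually passed.
if __name__ == "__main__" and len(sys.argv) > 1:
    if len(sys.argv) != 3:
        sys.exit("usage: archive_wal.py SRC DST")  # exits non-zero
    sys.exit(archive_wal(sys.argv[1], sys.argv[2]))
```

It could then be wired in roughly like this (the install path is an
assumption):
archive_command = '/usr/local/bin/archive_wal.py %p /mnt/server/archivedir/%f'
Any uncaught error (disk full, missing directory) makes the script exit
non-zero, so Postgres would retry the segment later instead of
considering it done.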
cheerio,
PMc