I have a two-server pair with a master and a hot standby. The hot standby is connected to the master via streaming replication, but I also archive WAL files from the master to the standby, just for safety.
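For context, the standby is configured roughly like this (hostnames and paths are made up; in PostgreSQL 12+ these settings moved into postgresql.conf alongside a standby.signal file):

    # recovery.conf on the standby
    standby_mode = 'on'
    primary_conninfo = 'host=master.example.com user=replicator'
    restore_command = 'cp /var/lib/postgresql/archive/%f %p'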
It occurs to me that this is a waste of bandwidth. I could set wal_keep_segments high and forget about archiving, but what happens if disaster strikes and the standby is down for too long?
I'm wondering if there's any way to write an archive_command script that:
1) Looks at the WAL file name to figure out where that segment falls in the WAL stream.
2) Connects to the standby to figure out how far *it* has received.
3) If the standby has already streamed everything in the to-be-archived segment, just throws the file away; otherwise (or if step 2 failed), archives it. A rough sketch of what I mean follows this list.
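Something along these lines is what I have in mind. It's only a sketch: hostnames and paths are invented, connection details are elided, and pg_last_wal_receive_lsn() is the PostgreSQL 10+ name (9.x calls it pg_last_xlog_receive_location(), and pg_walfile_name() was pg_xlogfile_name()):

    #!/bin/bash
    # Hypothetical archive_command wrapper -- untested sketch.
    # postgresql.conf: archive_command = '/usr/local/bin/archive_if_needed.sh %p %f'
    WALPATH="$1"                               # %p: path of the segment to archive
    WALFILE="$2"                               # %f: segment file name
    STANDBY="standby.example.com"              # assumed standby hostname
    ARCHIVE_DIR="/var/lib/postgresql/archive"  # assumed archive directory

    # Step 2: ask the standby for the last WAL location it has received.
    STANDBY_LSN=$(psql -h "$STANDBY" -At \
        -c "SELECT pg_last_wal_receive_lsn()" 2>/dev/null)

    if [ -n "$STANDBY_LSN" ]; then
        # Step 1: map that LSN to a segment name. pg_walfile_name() refuses
        # to run during recovery, so compute it locally on the master.
        STANDBY_FILE=$(psql -At -c "SELECT pg_walfile_name('$STANDBY_LSN')")
        # On a single timeline, segment names sort lexicographically in WAL
        # order, so a strictly greater name means the segment being archived
        # has already been streamed in full.
        if [[ "$STANDBY_FILE" > "$WALFILE" ]]; then
            # Step 3a: exit 0 tells PostgreSQL the segment is "archived",
            # so it gets recycled without ever being shipped.
            exit 0
        fi
    fi

    # Step 3b: standby unreachable or not caught up -- archive for real.
    rsync -a "$WALPATH" "$STANDBY:$ARCHIVE_DIR/$WALFILE"

The appealing property is that exit 0 on the skip path makes PostgreSQL recycle the segment as if it had been archived, so nothing crosses the wire as long as streaming keeps up.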
I'm not sure about the timing. It may be impossible for all of a segment's WAL records to have been streamed by the time the file is ready to be archived. Maybe the archive_command could cheat and delay for a few seconds to see if the hot standby catches up. But I'd really like to find some sort of adaptive way to decide whether or not a WAL file *really* needs to be archived over to the standby for safety.
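For the delay idea, I imagine the check from the sketch above could simply be wrapped in a short polling loop before falling back to a real archive (retry count and sleep interval are plucked out of thin air):

    # Hypothetical variant: poll the standby a few times before giving up.
    for try in 1 2 3; do
        STANDBY_LSN=$(psql -h "$STANDBY" -At \
            -c "SELECT pg_last_wal_receive_lsn()" 2>/dev/null)
        if [ -n "$STANDBY_LSN" ]; then
            STANDBY_FILE=$(psql -At -c "SELECT pg_walfile_name('$STANDBY_LSN')")
            # Segment fully streamed: report success without shipping anything.
            [[ "$STANDBY_FILE" > "$WALFILE" ]] && exit 0
        fi
        sleep 2
    done
    # Still behind (or unreachable) after ~6 seconds: fall through and archive.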