On Fri, Sep 19, 2014 at 1:07 AM, Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
We kept the WAL files and log files for further analysis. How can we help regarding this issue?
Commit c2f79ba has added as assumption that the WAL receiver should always enforce the create of .done files when WAL files are done being streamed (XLogWalRcvWrite and WalReceiverMain) or archived (KeepFileRestoredFromArchive). Then using this assumption 1bd42cd has changed a bit RemoveOldXlogFiles, removing a check looking if the node is in recovery. Now, based on the information given here yes it happens that there are still cases where .done file creation is not correctly done, leading to those extra files. Even by looking at the code, I am not directly seeing any code paths where an extra call to XLogArchiveForceDone would be needed on the WAL receiver side but... Something like the patch attached (which is clearly a band-aid) may help though as it would make files to be removed even if they are not marked as .done for a node in recovery. And this is consistent with the pre-1bd42cd.