I checked wal_keep_segments before I started the backup, and the xlog dir on the master contained logs from several hours ago, which was sufficient. Also, if this were the case, then copying the affected clog file from the master wouldn't solve the problem, as the required WALs would still be missing.
Just the config files for the replica; all other dirs were removed.
Alright, let's look at the other end. You are using -x, which is:
Using this option is equivalent of using -X with method fetch.
and
-X fetch is:
f fetch
The transaction log files are collected at the end of the backup. Therefore, it is necessary for the wal_keep_segments parameter to be set high enough that the log is not removed before the end of the backup. If the log has been rotated when it's time to transfer it, the backup will fail and be unusable.
So are the three servers that failed pulling from parents that are seeing heavy use and do not have a sufficiently large wal_keep_segments set?
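As a rough way to sanity-check that, here is a minimal sketch of the arithmetic, assuming the default 16 MB WAL segment size. The function names and the WAL rate and backup duration in the usage note are hypothetical; you would need to measure your own values on the parent server. It only estimates how many segments get written while the backup runs, and the server may well retain more than wal_keep_segments in practice.

```python
import math

# Assumption: default WAL segment size of 16 MB; custom builds can differ.
WAL_SEGMENT_MB = 16


def segments_needed(wal_mb_per_min, backup_minutes, segment_mb=WAL_SEGMENT_MB):
    """Estimate the WAL segments written while the backup runs, rounded up."""
    return math.ceil(wal_mb_per_min * backup_minutes / segment_mb)


def keep_is_enough(wal_keep_segments, wal_mb_per_min, backup_minutes):
    """True if wal_keep_segments covers the estimated WAL written during the backup."""
    return wal_keep_segments >= segments_needed(wal_mb_per_min, backup_minutes)
```

For example, with a hypothetical parent writing 32 MB of WAL per minute and a backup that takes 60 minutes, segments_needed(32, 60) comes to 120 segments, so a wal_keep_segments of 100 would fall short while 128 would cover it.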