> That makes sense. Each iteration of the restartpoint recycle loop has a 1/N
> chance of failing. Recovery adds >N files between restartpoints. Hence, the
> WAL directory grows without bound. Is that roughly the theory in mind?
Yes, though you have formulated it better than I had it in my mind.
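To make the arithmetic behind that theory concrete, here is a rough simulation. It is only a sketch under assumptions that are not confirmed anywhere in this thread: a failed iteration ends the recycle loop for that restartpoint, the failures are independent, and recovery adds a fixed number of files per restartpoint. The function and parameter names (`simulate`, `n`, `added_per_cycle`) are made up for illustration and do not correspond to anything in the PostgreSQL source.

```python
import random


def simulate(n, added_per_cycle, cycles, seed=0):
    """Toy model of the WAL backlog across restartpoints.

    Assumptions (hypothetical, for illustration only):
      - recovery adds `added_per_cycle` files between restartpoints;
      - each iteration of the recycle loop fails with probability 1/n;
      - the first failure ends the loop for that restartpoint.
    Returns the backlog size after each restartpoint.
    """
    rng = random.Random(seed)
    backlog = 0
    history = []
    for _ in range(cycles):
        backlog += added_per_cycle      # files added by recovery
        while backlog > 0:
            if rng.random() < 1.0 / n:  # this iteration fails
                break                   # assumed: loop stops here
            backlog -= 1                # one file recycled
        history.append(backlog)
    return history


if __name__ == "__main__":
    # More than N files added per restartpoint: the backlog keeps growing.
    growing = simulate(n=10, added_per_cycle=12, cycles=1000)
    print("added > N: final backlog =", growing[-1])

    # Far fewer than N files added per restartpoint: the backlog stays small.
    stable = simulate(n=100, added_per_cycle=5, cycles=1000)
    print("added << N: final backlog =", stable[-1])
```

Under these assumptions a restartpoint recycles about N-1 files on average before the first failure, so adding more than N files per restartpoint leaves a positive net gain every cycle, which is the unbounded growth described above.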
Let's see if Don can confirm that he is seeing the "could not link file" messages.
During my latest incident, there was only one occurrence:
could not link file "pg_wal/xlogtemp.18799" to "pg_wal/000000010000D45300000010": File exists
WAL restore/recovery seemed to continue just fine after that. And it would keep going until the pg_wal volume ran out of space unless I was manually rm'ing already-recovered WAL files on the side.