We have some problems with the xlog directory (size 5 GB). When we take a backup of one of our databases (about 80 GB of data) and recover it using our own script called from recovery.conf, Postgres starts consuming the WAL files fetched from our NAS at a rate of about 2 per second.
Postgres just does not clean up the xlog directory in time, so it fills up and the next required WAL file cannot be copied.
So my script pauses for 5 minutes, and sometimes the xlog directory is cleaned up a bit (from 100% to 84%) and the game starts again. There has also been some kind of lock where nothing happened until I restarted the cluster. On a restart the directory is cleaned up and about half of the 5 GB becomes available.
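For illustration, the behaviour of the script described above can be sketched roughly as follows. This is only a minimal sketch, not our actual script: the function names, the NAS path `/mnt/nas/wal_archive`, and the retry count are assumptions made up for the example. The only parts taken from the setup above are the 16 MB WAL file size, the 5-minute pause, and the fact that it is called from recovery.conf through `restore_command` with `%f` and `%p`.

```python
#!/usr/bin/env python3
"""Hypothetical restore_command helper: copy one WAL file, pausing if the target is full."""
import os
import shutil
import sys
import time

WAL_SEGMENT_BYTES = 16 * 1024 * 1024   # 16 MB WAL file size, as in the post
ARCHIVE_DIR = "/mnt/nas/wal_archive"   # assumed NAS mount point, not from the post


def free_bytes(path):
    """Return the free space of the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def restore_wal(archive_dir, wal_name, target_path,
                wait_seconds=300, max_waits=3, needed_bytes=WAL_SEGMENT_BYTES):
    """Copy archive_dir/wal_name to target_path and return a shell-style exit code."""
    source = os.path.join(archive_dir, wal_name)
    if not os.path.isfile(source):
        return 1  # non-zero exit status: the requested file is not available
    target_dir = os.path.dirname(os.path.abspath(target_path))
    for attempt in range(max_waits + 1):
        if free_bytes(target_dir) >= needed_bytes:
            shutil.copyfile(source, target_path)
            return 0
        if attempt < max_waits:
            time.sleep(wait_seconds)  # pause and hope old WAL files get removed
    return 1  # still no room after all pauses


if __name__ == "__main__" and len(sys.argv) == 3:
    # Assumed recovery.conf entry: restore_command = '/path/to/this_script.py %f %p'
    sys.exit(restore_wal(ARCHIVE_DIR, sys.argv[1], sys.argv[2]))
```

The pause only gives Postgres time to remove old WAL files; it does not by itself explain why so many of them are kept in the first place.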
checkpoint_segments is set to 64 and the WAL file size is 16 MB, so at about 300 WAL files the whole game is over (5 GB holds roughly 320 of them), even though there should never be more than 3 * 64 + xxx files, which should be under 200.
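To make the arithmetic explicit, here is a small sketch of the numbers involved. It assumes the upper bound given in the PostgreSQL documentation for normal operation, (2 + checkpoint_completion_target) * checkpoint_segments + 1, with checkpoint_completion_target at its maximum of 1.0. Whether that bound also holds during archive recovery is exactly what is unclear here. The function names are made up for the example.

```python
WAL_SEGMENT_MB = 16        # WAL file size from the post
XLOG_DIR_MB = 5 * 1024     # 5 GB xlog directory from the post
CHECKPOINT_SEGMENTS = 64   # setting from the post


def expected_max_wal_files(checkpoint_segments, checkpoint_completion_target=1.0):
    """Documented bound for normal operation; 1.0 is an assumed worst case."""
    return int((2 + checkpoint_completion_target) * checkpoint_segments) + 1


def files_that_fit(dir_mb, segment_mb):
    """Number of whole WAL files that fit into the xlog directory."""
    return dir_mb // segment_mb


expected = expected_max_wal_files(CHECKPOINT_SEGMENTS)
capacity = files_that_fit(XLOG_DIR_MB, WAL_SEGMENT_MB)

print(expected)                   # 193 files expected at most
print(expected * WAL_SEGMENT_MB)  # 3088 MB, about 3 GB
print(capacity)                   # 320 files until the directory is full
```

So under that assumption the directory should have room for about 127 more files than Postgres is expected to keep, and it still runs full.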
Why do we get far more than 300 WAL files, and what can we do about it? Even a 10 GB xlog directory is not enough to keep things in balance.