Yesterday I had an issue where Postgres could not write to an FSM file
and kept creating WAL files until it filled the partition where the WAL
files were located, at which point it panicked. I moved the WAL files to
the data partition, restarted Postgres, and after about an hour
everything was operational again. The system had created 43GB of WAL
files. At the earliest opportunity I would like to move the WAL files
to their own set of spindles (since having them on the data partition
is currently affecting performance), but I still have an excessive
number of WAL files, which leaves me very little space on the other
spindle.
Looking at pg_xlog, I see that Postgres is sequentially reusing all of
the available WAL files. According to the log file, it recycles
approximately 70-80 files at every checkpoint, but it never removes
any:
2013-01-08 16:51:47 GMT LOG: checkpoint starting: time
2013-01-08 17:03:41 GMT LOG: checkpoint complete: wrote 27167 buffers
(6.9%); 0 transaction log file(s) added, 0 removed, 70 recycled;
write=712.867 s, sync=1.534 s, total=714.489 s
2013-01-08 17:51:47 GMT LOG: checkpoint starting: time
2013-01-08 18:03:48 GMT LOG: checkpoint complete: wrote 27278 buffers
(6.9%); 0 transaction log file(s) added, 0 removed, 71 recycled;
write=719.655 s, sync=1.689 s, total=721.399 s
2013-01-08 18:51:47 GMT LOG: checkpoint starting: time
2013-01-08 19:10:33 GMT LOG: checkpoint complete: wrote 42856 buffers
(10.9%); 0 transaction log file(s) added, 0 removed, 69 recycled;
write=1124.766 s, sync=1.426 s, total=1126.232 s
2013-01-08 19:51:47 GMT LOG: checkpoint starting: time
2013-01-08 20:03:56 GMT LOG: checkpoint complete: wrote 27759 buffers
(7.1%); 0 transaction log file(s) added, 0 removed, 81 recycled;
write=727.997 s, sync=1.197 s, total=729.258 s
2013-01-08 20:51:47 GMT LOG: checkpoint starting: time
2013-01-08 21:03:02 GMT LOG: checkpoint complete: wrote 25799 buffers
(6.6%); 0 transaction log file(s) added, 0 removed, 70 recycled;
write=674.490 s, sync=1.191 s, total=675.757 s
2013-01-08 21:51:47 GMT LOG: checkpoint starting: time
2013-01-08 22:03:41 GMT LOG: checkpoint complete: wrote 27177 buffers
(6.9%); 0 transaction log file(s) added, 0 removed, 66 recycled;
write=712.874 s, sync=1.458 s, total=714.391 s
2013-01-08 22:51:47 GMT LOG: checkpoint starting: time
2013-01-08 23:02:48 GMT LOG: checkpoint complete: wrote 25292 buffers
(6.4%); 0 transaction log file(s) added, 0 removed, 66 recycled;
write=660.326 s, sync=1.397 s, total=661.902 s
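To make the pattern in these log entries easier to see, here is a minimal Python sketch that pulls the added/removed/recycled counts out of "checkpoint complete" lines. The function name summarize_checkpoints and the regular expression are my own illustration, not part of Postgres; the pattern assumes the 9.0 log wording shown above and tolerates the line wrapping introduced by the mail client.

```python
import re

# Matches the "checkpoint complete" summary as logged above.
# \s+ is used between fields so wrapped lines still match.
CHECKPOINT_RE = re.compile(
    r"checkpoint complete: wrote (\d+) buffers\s+\(([\d.]+)%\);\s+"
    r"(\d+) transaction log file\(s\) added,\s+(\d+) removed,\s+"
    r"(\d+) recycled;\s+"
    r"write=([\d.]+) s,\s+sync=([\d.]+) s,\s+total=([\d.]+) s"
)


def summarize_checkpoints(log_text):
    """Return one dict per completed checkpoint found in log_text."""
    rows = []
    for m in CHECKPOINT_RE.finditer(log_text):
        rows.append({
            "buffers": int(m.group(1)),
            "added": int(m.group(3)),
            "removed": int(m.group(4)),
            "recycled": int(m.group(5)),
            "total_s": float(m.group(8)),
        })
    return rows


# Two entries copied from the log excerpt above.
sample = """2013-01-08 17:03:41 GMT LOG: checkpoint complete: wrote 27167 buffers
(6.9%); 0 transaction log file(s) added, 0 removed, 70 recycled;
write=712.867 s, sync=1.534 s, total=714.489 s
2013-01-08 18:03:48 GMT LOG: checkpoint complete: wrote 27278 buffers
(6.9%); 0 transaction log file(s) added, 0 removed, 71 recycled;
write=719.655 s, sync=1.689 s, total=721.399 s
"""

rows = summarize_checkpoints(sample)
total_removed = sum(r["removed"] for r in rows)
total_recycled = sum(r["recycled"] for r in rows)
print(len(rows), total_removed, total_recycled)
```

Run against the full log, a removed total that stays at 0 while the recycled total keeps growing is the behavior I am describing.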
Can anything be done (even if it requires restarting the PostgreSQL
server) to make it stop cycling through all the other WAL files (there
are over 2000 of them)? I need to reduce the number of WAL files in use
so I can move pg_xlog to another spindle.
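For a sense of scale, here is a rough sketch of the arithmetic. It assumes the default 16MB WAL segment size (I have not verified that this build uses the default), and it uses the rule of thumb from the 9.0 documentation that the number of segments normally stays at or below (2 + checkpoint_completion_target) * checkpoint_segments + 1. The settings passed in the last line are hypothetical placeholders, not my actual configuration.

```python
SEGMENT_MB = 16  # assumption: default WAL segment size


def segments_for(size_gb, segment_mb=SEGMENT_MB):
    """Number of WAL segment files that fit in size_gb gigabytes."""
    return (size_gb * 1024) // segment_mb


def expected_max_segments(checkpoint_segments, completion_target):
    """Rule-of-thumb ceiling on WAL segments from the 9.0 docs."""
    return int((2 + completion_target) * checkpoint_segments + 1)


# 43GB of WAL, as reported above.
current = segments_for(43)

# Hypothetical settings, for illustration only.
ceiling = expected_max_segments(checkpoint_segments=128,
                                completion_target=0.5)

print(current, ceiling)
```

Under those assumptions 43GB is about 2752 segment files, which is consistent with the "over 2000" I see in pg_xlog and far above what the placeholder settings would normally keep around.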
Thanks in advance.
Postgres 9.0.4, FreeBSD 8.1/amd64