>> Fair enough; I'm so used to bumping wal_buffers up to 16MB nowadays that
>> I forget sometimes that people actually run with the default where this
>> becomes an important consideration.
>
> Do you have any testing in favor of 16mb vs. lower/higher?
From some tests I did some time ago on 8.4, with the xlog on separate
spindles (RAID 1, no battery-backed cache), using a workload that
generates lots of xlog (INSERT INTO ... SELECT).
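The workload was something along these lines (a sketch, not the exact
test; table names and sizes are made up for illustration):

    CREATE TABLE t (id int, payload text);
    -- bulk load: produces a large, steady stream of xlog
    INSERT INTO t
    SELECT i, repeat('x', 100) FROM generate_series(1, 10000000) AS i;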
When using a small wal_buffers, there was a problem when switching from
one xlog file to the next. Basically an fsync was issued while most of
the previous log segment was still unwritten, so postgres had to wait
for that fsync to finish. Of course, the default 64 kB of wal_buffers
fills up almost instantly, and then all writes wait for the end of this
fsync. This caused hiccups in the xlog traffic, and xlog throughput
wasn't nearly as high as the disks would allow. Sticking a stethoscope
on the xlog hard drives revealed a lot more random accesses than I
would have liked (a much simpler diagnostic than tracing the IOs, lol).
I set wal_writer_delay to a very low setting (I don't remember exactly
which, perhaps 1 ms) so the walwriter was in effect constantly flushing
the wal buffers to disk. I also set wal_sync_method to fdatasync
instead of fsync. Then I set wal_buffers to a rather high value, like
32-64 MB. Throughput and performance were a lot better, and the xlog
drives made a much more "linear-access" noise.
What happened is that, since wal_buffers was larger than what the
drives could write in one or two rotations, it could absorb the wal
traffic during the time postgres waits for fdatasync / wal segment
change, so the inserts did not have to wait. And lowering
wal_writer_delay made the walwriter write something on each disk
rotation, so that when a COMMIT or segment switch came, most of the
time the WAL was already synced and there was no wait.
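To put rough illustrative numbers on it (assumptions, not
measurements): a 10,000 RPM drive rotates in 6 ms, and at ~100 MB/s
sequential that's ~600 kB written per rotation, so 32 MB of wal_buffers
can absorb roughly 50 rotations' worth of xlog while a sync is in
progress; the old 64 kB default fills in about a tenth of a rotation.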
Just my 2 c ;)