"Tom Lane" <tgl@sss.pgh.pa.us> writes:
> Gregory Stark <gsstark@mit.edu> writes:
>> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>>> I'm still struggling to understand why and how bgwriter increases performance.
>>> Under what circumstances, what workload?
>>>
>>> The only benefit I can see is that it moves the write() of a page out of the
>>> critical path. But as long as the OS cache can absorb the write, it should be
>>> very cheap compared to doing real I/O.
>
>> Well you can't keep writing indefinitely faster than the i/o subsystem can
>> execute the writes. Eventually the kernel will block your write until a kernel
>> buffer becomes free. Ie, throttle your writes to the actual write bandwidth
>> available.
>
> Right. Also, a buffer write isn't "merely" a kernel call --- for
> instance, you might have to flush some more WAL before you can execute
> the write, and there are cases where you'd have to fsync the write
> yourself (ie, if you can't pass it off to the bgwriter). The more of
> that we can take out of foreground query paths, the better.
So it sounds like a good place to start benchmarking where bgwriter helps
might be a setup which starts with a very large table (say 1-10M rows) and has
each transaction delete a large number of random tuples (~1k rows), possibly
waiting briefly before committing to give the dirty pages a chance to be
needed by another backend before the commit flushes the WAL.
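Something like this, as a rough sketch (the table name, sizes, and sleep time
are all just placeholders to tune):

    -- one-time setup: a big table to delete from
    CREATE TABLE delete_test (id serial PRIMARY KEY, filler text);
    INSERT INTO delete_test (filler)
        SELECT repeat('x', 100) FROM generate_series(1, 10000000);

    -- each benchmark transaction: dirty ~1k random tuples' worth of pages,
    -- pause so the dirty buffers have a chance to be wanted by someone else,
    -- then commit (which is what flushes the WAL)
    BEGIN;
    DELETE FROM delete_test
        WHERE id IN (SELECT 1 + (random() * 9999999)::int
                       FROM generate_series(1, 1000));
    SELECT pg_sleep(0.1);
    COMMIT;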
That way each transaction dirties a large number of pages, and the next
transaction is likely to need a fresh buffer and find one which has been
dirtied but not yet had its WAL record flushed. Deletes also generate very
little WAL, so WAL won't be a bottleneck and you won't trigger checkpoints
by filling checkpoint_segments.
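For the comparison itself I suppose you'd run the same workload once with the
bgwriter effectively disabled and once with it working hard, something like
the following in postgresql.conf (GUC names per 8.2 -- they've been shuffled
between releases, and the values here are just guesses to tune):

    # bgwriter effectively off
    bgwriter_lru_maxpages = 0
    bgwriter_all_maxpages = 0

    # bgwriter working hard
    bgwriter_delay = 50            # milliseconds
    bgwriter_lru_percent = 10.0
    bgwriter_lru_maxpages = 1000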
--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com