On Wed, Feb 20, 2013 at 7:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Feb 19, 2013 at 5:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> I agree with Merlin and Joachim - if we have the call in one place, we
>> should have it in both.
>
> We might want to assess whether we even want to have it in one place.
> I've seen cases where the existing call hurts performance, because of
> WAL file recycling. If we don't flush the WAL file blocks out of
> cache, then they're still there when we recycle the WAL file and we
> can overwrite them without further I/O. But if we tell the OS to blow
> them away, then it has to reread them when we try to overwrite the old
> files, and so we stall waiting for the I/O.

Does the kernel really read a data block from disk into memory just to
immediately overwrite it? I would have thought it would optimize that
away, at least if the writes are sized and aligned to 512- or 1024-byte
blocks (which WAL writes should be). Well, stranger things than that
happen, I guess. (For example, on ext4, when a file with dirty pages
goes away because another file is renamed over the top of it, the
disappearing file automatically gets fsynced, or the equivalent.)
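
One way to check this would be a small test along the following lines. This is only a sketch under assumptions: I am guessing that the call under discussion is posix_fadvise() with POSIX_FADV_DONTNEED, the 8192-byte block size is a placeholder, and recycle_overwrite() is a made-up helper, not anything in the PostgreSQL tree. It fills a file, optionally asks the kernel to drop the cached pages, and then overwrites the file in place with whole, aligned blocks.

```python
import os
import tempfile

BLOCK = 8192  # assumed block size for illustration, not taken from the thread


def recycle_overwrite(path, nblocks, drop_cache):
    """Fill a file, optionally drop its cached pages, then overwrite it in place.

    Hypothetical helper; returns the number of bytes written by the overwrite.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Write the "old" contents and force them to disk so the pages are clean.
        for i in range(nblocks):
            os.pwrite(fd, b"\x00" * BLOCK, i * BLOCK)
        os.fsync(fd)

        # Assumption: this is the kind of hint the thread is about.
        # A length of 0 means "through the end of the file".
        if drop_cache and hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)

        # Overwrite every block with a whole, block-aligned write.
        written = 0
        for i in range(nblocks):
            written += os.pwrite(fd, b"\x01" * BLOCK, i * BLOCK)
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        segment = os.path.join(tmpdir, "segment")
        for drop in (False, True):
            print(drop, recycle_overwrite(segment, 16, drop))
```

Timing the two variants on a file much larger than this, or watching read traffic with something like iostat while they run, should show whether the kernel actually rereads the blocks before overwriting them. The sketch itself does not settle that either way.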
Cheers,
Jeff