Greg Stark wrote:
>Manfred Spraul <manfred@colorfullife.com> writes:
>
>>One problem for WAL is that O_DIRECT would disable the write cache -
>>each operation would block until the data arrived on disk, and that might block
>>other backends that try to access WALWriteLock.
>>Perhaps a dedicated backend that does the writeback could fix that.
>
>aio seems a better fit.
>
>>Has anyone tried to use posix_fadvise for the wal logs?
>>http://www.opengroup.org/onlinepubs/007904975/functions/posix_fadvise.html
>>
>>Linux supports posix_fadvise, it seems to be part of xopen2k.
>
>Odd, I don't see it anywhere in the kernel. I don't know what syscall it's
>using to do this tweaking.
>

At least in 2.6: see linux/mm/fadvise.c. The syscall is fadvise64 or
fadvise64_64.

>This is the only option that seems useful for postgres for both the WAL and
>vacuum (though in other threads it seems the problems with vacuum lie
>elsewhere):
>
> POSIX_FADV_DONTNEED attempts to free cached pages associated with the
> specified region. This is useful, for example, while streaming large
> files. A program may periodically request the kernel to free cached
> data that has already been used, so that more useful cached pages are
> not discarded instead.
>
> Pages that have not yet been written out will be unaffected, so if the
> application wishes to guarantee that pages will be released, it should
> call fsync or fdatasync first.
>

I agree. Either immediately after each flush syscall, or just before
closing a log file and switching to the next.
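
For the first variant, an untested sketch of what I mean. The helper name
flush_and_drop_cache is made up for illustration and is not an existing
postgres function; where it would be called from in the WAL code is left
open. Note that posix_fadvise returns the error number directly instead of
setting errno.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): flush fd to disk, then tell
 * the kernel that the cached pages of the file are no longer needed.
 * Returns 0 on success, or an error number on failure.
 */
int flush_and_drop_cache(int fd)
{
    int rc;

    assert(fd >= 0);

    /* DONTNEED leaves dirty pages alone, so write them out first. */
    if (fdatasync(fd) != 0)
        return errno;

    /* offset 0 with len 0 means "from the start to the end of the file". */
    rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    return rc;
}
```

The second variant would be the same call, made once just before close() on
the old log segment. The advice is only a hint, so the kernel is free to
keep some of the pages.
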
>Perhaps POSIX_FADV_RANDOM and POSIX_FADV_SEQUENTIAL could be useful in a
>backend before starting a sequential scan or index scan, but I kind of doubt
>it.
>

IIRC the recommendation is ~20% total memory for the postgres user space
buffers. That's quite a lot - it might be sufficient to protect that
cache from vacuum or sequential scans. AddBufferToFreeList already
contains a comment that this is the right place to try buffer
replacement strategies.
-- Manfred