Robert Haas <robertmhaas@gmail.com> writes:
> I think that the bottom line is that we're not likely to make massive
> changes to the way that we do block caching now. Even if some other
> scheme could work much better on Linux (and so far I'm unconvinced
> that any of the proposals made here would in fact work much better),
> we aim to be portable to Windows as well as other UNIX-like systems
> (BSD, Solaris, etc.). So using completely Linux-specific technology
> in an overhaul of our block cache seems to me to have no future.

Unfortunately, I have to agree with this. Even if there were a way to
merge our internal buffers with the kernel's, it would surely be far
too invasive to coexist with buffer management that'd still work on
more traditional platforms.

But we could add hint calls, or modify the I/O calls we use, and that
ought to be a reasonably localized change.

> And the idea of being able to do an 8kB atomic write with OS support
> so that we don't have to save full page images in our write-ahead log
> to cover the "torn page" scenario seems very intriguing indeed. If
> that worked well, it would be a *big* deal for us.

+1. That would be a significant win, and trivial to implement, since
we already have a way to switch off full-page images for people who
trust their filesystems to do atomic writes. It's just that safe
use of that switch isn't widely possible ...
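
The switch referred to here is presumably the full_page_writes setting
(an assumption; the message does not name it). As a configuration
sketch, turning it off would look like this:

```
# postgresql.conf
full_page_writes = off    # default is on; only safe if 8kB page writes cannot be torn
```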

regards, tom lane