Hi,
On 2019-06-18 12:33:30 -0400, Alvaro Herrera wrote:
> On 2019-Jun-18, Andres Freund wrote:
>
> > On 2019-06-17 19:45:41 -0400, Jeff Janes wrote:
> > > If not, I would set the value small (say, 8GB) and let the OS do the
> > > heavy lifting of deciding what to keep in cache.
> >
> > FWIW, in my opinion this is not a good idea in most cases. E.g. linux's
> > pagecache doesn't scale particularly gracefully to large amounts of
> > data, and it's decisions when to evict data aren't really better than
> > postgres'. And there's a significant potential for additional
> > unnecessary disk writes (because the kernel will flush dirty pagecache
> > buffers, and then we'll just re-issue many of those writes again).
>
> One additional tuning point in this area is that you need to tweak the
> Linux flusher so that it starts writing earlier than what it does by
> default (by setting dirty_background_bytes to some reasonable value --
> as I recall it normally runs on the dirty_background_ratio setting,
> which means it scales up when you add RAM, which I'm not sure is really
> sensible since you really care about the disk I/O write rate rather than
> anything about the memory). If you don't do that, the kernel can
> accumulate lots of memory to write before starting to write it, which is
> troublesome once it does.
I think that's less needed these days - by default postgres triggers the
backend to flush data when writes occur from bgwriter
(bgwriter_flush_after) or checkpointer (checkpointer_flush_after). And
IMO one should turn on the flushing by backends in most cases too
(e.g. backend_flush_after=2MB), unless it's a really latency/jitter
insensitive application, or storage is *REALLY* slow.
There's a few things we don't flush that we maybe should (file extension
writes, SLRUs), so it can still be sensible to tune
dirty_background_bytes. But that has the disadvantage of also affecting
temp file writes etc, which is usually not wanted.
Greetings,
Andres Freund