Re: BBU Cache vs. spindles - Mailing list pgsql-performance

From Tom Lane
Subject Re: BBU Cache vs. spindles
Msg-id 26348.1288301177@sss.pgh.pa.us
In response to Re: BBU Cache vs. spindles  (James Mansion <james@mansionfamily.plus.com>)
Responses Re: BBU Cache vs. spindles  (Robert Haas <robertmhaas@gmail.com>)
Re: BBU Cache vs. spindles  (James Mansion <james@mansionfamily.plus.com>)
List pgsql-performance
James Mansion <james@mansionfamily.plus.com> writes:
> Tom Lane wrote:
>> The other and probably worse problem is that there's no application
>> control over how soon changes to mmap'd pages get to disk.  An msync
>> will flush them out, but the kernel is free to write dirty pages sooner.
>> So if they're depending for consistency on writes not happening until
>> msync, it's broken by design.  (This is one of the big reasons we don't
>> use mmap'd space for Postgres disk buffers.)

> Well, I agree that it sucks for the reason you give - but you use
> write and that's *exactly* the same in terms of when it gets written,
> as when you update a byte on an mmap'd page.

Uh, no, it is not.  The difference is that we can update a byte in a
shared buffer, and know that it *isn't* getting written out before we
say so.  If the buffer were mmap'd then we'd have no control over that,
which makes it mighty hard to obey the WAL "write log before data"
paradigm.
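
To make the ordering concrete, here is a minimal sketch (not Postgres
code) of a page kept in private process memory. The names BufferedPage,
PAGE_SIZE and demo, and the 3-byte log record format, are all made up
for illustration. The point is only that update() touches nothing the
kernel can write out, so the log record can always be forced to disk
before the data page is handed over.

```python
import os
import tempfile

PAGE_SIZE = 8192  # assumed page size for this sketch


class BufferedPage:
    """A page held in private memory; the file changes only when we write it."""

    def __init__(self, data_fd, wal_fd, page_no):
        self.data_fd = data_fd
        self.wal_fd = wal_fd
        self.offset = page_no * PAGE_SIZE
        self.buf = bytearray(os.pread(data_fd, PAGE_SIZE, self.offset))
        self.pending = []  # log records not yet written out

    def update(self, pos, value):
        # Touches only process memory; the kernel has nothing to write back.
        self.buf[pos] = value
        # Hypothetical record format: 2-byte position, 1-byte new value.
        self.pending.append(bytes([pos >> 8, pos & 0xFF, value]))

    def flush(self):
        # 1. Write the log first and force it to stable storage.
        for rec in self.pending:
            os.write(self.wal_fd, rec)
        os.fsync(self.wal_fd)
        self.pending.clear()
        # 2. Only now hand the data page to the kernel.
        os.pwrite(self.data_fd, bytes(self.buf), self.offset)


def demo():
    """Return (data byte before flush, log size before, data byte after, log size after)."""
    with tempfile.TemporaryDirectory() as d:
        data_fd = os.open(os.path.join(d, "data"), os.O_RDWR | os.O_CREAT, 0o600)
        wal_fd = os.open(os.path.join(d, "wal"),
                         os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        try:
            os.pwrite(data_fd, bytes(PAGE_SIZE), 0)  # one zero-filled page
            page = BufferedPage(data_fd, wal_fd, 0)
            page.update(10, 0x7F)
            before = os.pread(data_fd, 1, 10)
            wal_before = os.fstat(wal_fd).st_size
            page.flush()
            after = os.pread(data_fd, 1, 10)
            wal_after = os.fstat(wal_fd).st_size
            return before, wal_before, after, wal_after
        finally:
            os.close(data_fd)
            os.close(wal_fd)
```

Between update() and flush() the data file is untouched, and inside
flush() the fsync on the log always precedes the pwrite of the page.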

It's true that we don't know whether write() causes an immediate or
delayed disk write, but we generally don't care that much.  What we do
care about is being able to ensure that a WAL write happens before the
data write, and with mmap we don't have control over that.
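
For contrast, a small sketch of the mmap case, again with a made-up
function name. It stores one byte through a MAP_SHARED mapping and reads
the file back through the descriptor, once before and once after msync
(mm.flush() in Python). It cannot show when the kernel actually writes
the page to disk; it only shows that a plain store dirties a
kernel-owned page, and that msync is the sole call available, which
forces the write rather than postponing it. On systems with a unified
page cache, such as Linux, the first result is typically already True,
though POSIX does not promise that.

```python
import mmap
import os
import tempfile


def mmap_store_visible_without_msync():
    """Store one byte through a shared mapping and read it back via the file.

    Returns (seen_before_msync, seen_after_msync).
    """
    with tempfile.TemporaryDirectory() as d:
        fd = os.open(os.path.join(d, "data"), os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.ftruncate(fd, mmap.PAGESIZE)  # one zero-filled page
            with mmap.mmap(fd, mmap.PAGESIZE, mmap.MAP_SHARED,
                           mmap.PROT_READ | mmap.PROT_WRITE) as mm:
                # A plain store: the page is now dirty and owned by the kernel.
                mm[0] = 0x7F
                seen_before = os.pread(fd, 1, 0) == b"\x7f"
                # msync(): forces write-back. There is no call that says
                # "do not write this page until the log is on disk".
                mm.flush()
                seen_after = os.pread(fd, 1, 0) == b"\x7f"
            return seen_before, seen_after
        finally:
            os.close(fd)
```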

            regards, tom lane
