Re: 2nd Level Buffer Cache - Mailing list pgsql-hackers

From Greg Stark
Subject Re: 2nd Level Buffer Cache
Msg-id AANLkTi=A-F4UCTwroykQqvcn0i6pa4uyvh-nGFf02ppO@mail.gmail.com
In response to Re: 2nd Level Buffer Cache  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: 2nd Level Buffer Cache
List pgsql-hackers
On Wed, Mar 23, 2011 at 8:00 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> It looks like the only way anything can ever get put on the free list
> right now is if a relation or database is dropped.  That doesn't seem
> too good.  I wonder if the background writer shouldn't be trying to
> maintain the free list.  That is, perhaps BgBufferSync() should notice
> when the number of free buffers drops below some threshold, and run
> the clock sweep enough to get it back up to that threshold.
>

I think this is just a terminology discrepancy. In PostgreSQL the free
list is only used for buffers that contain no useful data at all. The
only time there are buffers on the free list is at startup or after a
relation or database is dropped.

Most of the time blocks are read into buffers that already contain
other data. Candidate buffers to evict are buffers that have been used
least recently. That's what the clock sweep implements.
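
To make the distinction concrete, here is a minimal sketch of a buffer
pool with a free list and a clock sweep. It is a toy model, not
PostgreSQL's code: the names Buffer, ClockSweepPool, get_victim,
read_page and MAX_USAGE are all made up for illustration, and pinning,
locking and dirty-page write-back are left out.

```python
MAX_USAGE = 5  # hypothetical cap on the per-buffer usage counter


class Buffer:
    def __init__(self):
        self.page = None       # which page this buffer holds, if any
        self.usage_count = 0   # bumped on access, decayed by the sweep


class ClockSweepPool:
    def __init__(self, nbuffers):
        self.buffers = [Buffer() for _ in range(nbuffers)]
        # At startup every buffer is empty, so all of them are "free".
        self.free_list = list(range(nbuffers))
        self.hand = 0  # position of the clock hand

    def get_victim(self):
        # Empty buffers are handed out first.
        if self.free_list:
            return self.free_list.pop(0)
        # Otherwise sweep: decay usage counts until one reaches zero.
        while True:
            idx = self.hand
            self.hand = (self.hand + 1) % len(self.buffers)
            buf = self.buffers[idx]
            if buf.usage_count == 0:
                return idx
            buf.usage_count -= 1

    def read_page(self, page):
        # Cache hit: bump the usage count so the sweep passes it over.
        for buf in self.buffers:
            if buf.page == page:
                buf.usage_count = min(buf.usage_count + 1, MAX_USAGE)
                return buf
        # Cache miss: reuse a free buffer or evict the sweep's victim.
        buf = self.buffers[self.get_victim()]
        buf.page = page
        buf.usage_count = 1
        return buf


pool = ClockSweepPool(3)
for page in ["A", "B", "C", "A"]:
    pool.read_page(page)
pool.read_page("D")  # free list is empty by now, so the sweep picks a victim
pages = [b.page for b in pool.buffers]
```

In this model the free list is drained after the first three reads and
never refilled; every later miss is served by the sweep, which here
evicts "B" because "A" was touched twice.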

What the bgwriter is responsible for is looking at the buffers *ahead*
of the clock sweep and flushing the dirty ones to disk. They stay in
ram and don't go on the free list; all that changes is that they're
clean and therefore can be reused without having to do any i/o.
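
A rough sketch of that "clean ahead of the sweep" idea, under stated
assumptions: the function name bgwriter_clean_ahead and the parameters
lookahead and max_writes are hypothetical, the pool is reduced to a
list of dirty flags, and clearing a flag stands in for the actual
write. It is not how BgBufferSync() is implemented.

```python
def bgwriter_clean_ahead(dirty, hand, lookahead, max_writes):
    """Clean dirty buffers just ahead of the clock hand.

    dirty      -- list of booleans, one per buffer (True = needs a write)
    hand       -- current clock-sweep position
    lookahead  -- how many buffers ahead of the hand to inspect
    max_writes -- cap on writes per round

    Returns the indexes that were "written". Buffers stay in the pool;
    only their dirty flag changes.
    """
    written = []
    n = len(dirty)
    for step in range(min(lookahead, n)):
        if len(written) >= max_writes:
            break
        idx = (hand + step) % n
        if dirty[idx]:
            dirty[idx] = False  # stands in for flushing the page to disk
            written.append(idx)
    return written


dirty = [True, False, True, True, False, True]
written = bgwriter_clean_ahead(dirty, hand=4, lookahead=4, max_writes=2)
```

With the hand at 4, the round inspects buffers 4, 5 and 0, writes 5 and
0, and then stops at the write cap. Nothing is added to any free list;
a backend that later sweeps over those buffers simply finds them clean.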

I'm a bit skeptical that this works, because as soon as the bgwriter
saturates the i/o the os will throttle the rate at which it can write.
When that happens, even a few dozen milliseconds will be plenty to
allow the purely user-space processes consuming the buffers to catch
up instantly.

But Greg Smith has done a lot of work tuning the bgwriter so that it
is at least useful in some circumstances. I could well see it being
useful for systems where latency matters and the i/o is not saturated.

--
greg

