Re: our buffer replacement strategy is kind of lame - Mailing list pgsql-hackers

From Robert Haas
Subject Re: our buffer replacement strategy is kind of lame
Msg-id CA+Tgmoa8H3FhpXMa=-Ne=oaBYebSfLXaPmdExEYe1eDx=sb7hg@mail.gmail.com
In response to Re: our buffer replacement strategy is kind of lame  (Greg Stark <stark@mit.edu>)
Responses Re: our buffer replacement strategy is kind of lame
List pgsql-hackers
On Fri, Aug 12, 2011 at 10:51 PM, Greg Stark <stark@mit.edu> wrote:
> On Fri, Aug 12, 2011 at 5:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Only 96 of the 14286 buffers in sample_data are in shared_buffers,
>> despite the fact that we have 37,218 *completely unused* buffers lying
>> around.  That sucks, because it means that the sample query did a
>> whole lot of buffer eviction that was completely needless.  The ring
>> buffer is there to prevent trashing the shared buffer arena, but here
>> it would have been fine to trash the arena: there wasn't anything
>> there we would have minded losing (or, indeed, anything at all).
>
> I don't disagree with the general thrust of your point, but I just
> wanted to point out one case where not using free buffers even though
> they're available might make sense:
>
> If you execute a large batch delete or update or even just set lots of
> hint bits you'll dirty a lot of buffers. The ring buffer forces the
> query that is actually dirtying all these buffers to also do the i/o
> to write them out. Otherwise you leave them behind to slow down other
> queries. This was one of the problems with the old vacuum code which
> the ring buffer replaced. It left behind lots of dirtied buffers for
> other queries to do i/o on.

Interesting point.

After thinking about this some more, I'm coming around to the idea
that we need to distinguish between:

1. Ensuring a sufficient supply of evictable buffers, and
2. Evicting a buffer.

The second should obviously happen only on demand, but the first
should really be done as background work.  Currently, the clock
sweep serves both functions, and that's not good.  We shouldn't ever
let ourselves get to the point where there are no buffers at all with
reference count zero, so that the next guy who needs a buffer has to
spin the clock hand around until the reference counts get low enough.
Maintaining a sufficient supply of refcount-zero buffers should be
done as a background task; and possibly we ought to put them all in a
linked list so that the next guy who needs a buffer can just pop one
off.
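
To make the split concrete, here is a rough toy model in Python. It is
only a sketch of the idea, not PostgreSQL code: the BufferPool class,
the low_water target, MAX_COUNT, and every method name are made up for
illustration, and `count` stands in for the per-buffer counter that the
clock sweep decrements.  background_refill() is job 1 (keep a supply of
evictable buffers on a list) and allocate() is job 2 (take one, and
sweep only if the list is empty).

```python
from collections import deque

MAX_COUNT = 5  # hypothetical cap on the per-buffer counter


class BufferPool:
    """Toy model separating 'stock evictable buffers' from 'evict one'."""

    def __init__(self, nbuffers, low_water):
        self.count = [0] * nbuffers         # 0 means the buffer is evictable
        self.on_free = [False] * nbuffers   # True while queued on the free list
        self.free = deque()                 # buffers found evictable in advance
        self.low_water = low_water          # target free-list length
        self.hand = 0                       # clock hand position

    def touch(self, buf):
        """Record an access; a queued buffer that is touched becomes stale."""
        self.count[buf] = min(self.count[buf] + 1, MAX_COUNT)

    def _sweep_one(self):
        """Advance the hand one step; return an evictable buffer or None."""
        buf = self.hand
        self.hand = (self.hand + 1) % len(self.count)
        if self.on_free[buf]:
            return None                     # already queued, skip it
        if self.count[buf] == 0:
            return buf
        self.count[buf] -= 1                # age the buffer and move on
        return None

    def background_refill(self):
        """Job 1: run the sweep off the critical path until the list is stocked."""
        while len(self.free) < self.low_water:
            buf = self._sweep_one()
            if buf is not None:
                self.free.append(buf)
                self.on_free[buf] = True

    def allocate(self):
        """Job 2: return (buffer, clock steps the caller had to perform)."""
        while self.free:
            buf = self.free.popleft()
            self.on_free[buf] = False
            if self.count[buf] == 0:        # discard entries touched since queuing
                self.count[buf] = 1
                return buf, 0
        steps = 0
        while True:                         # fallback: sweep in the foreground
            steps += 1
            buf = self._sweep_one()
            if buf is not None:
                self.count[buf] = 1
                return buf, steps
```

With 8 buffers that have each been touched three times, a caller that
finds the list empty has to push the hand all the way around several
times before anything becomes evictable; if background_refill() has run
first, the same caller does no sweeping at all.  A real implementation
would also need locking and a policy for when the background task
wakes up, both of which this sketch ignores.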

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company