Re: Page replacement algorithm in buffer cache - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: Page replacement algorithm in buffer cache
Date
Msg-id CAHyXU0ygrf+Fz52sxYHkRUbZJR+CG4GcK5a14Rsb+zsVh2Q8gg@mail.gmail.com
In response to Re: Page replacement algorithm in buffer cache  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Page replacement algorithm in buffer cache
List pgsql-hackers
On Tue, Apr 2, 2013 at 9:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Apr 2, 2013 at 1:53 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> That seems pretty unlikely because of (a) the sheer luck of hitting that
>> page for the dropout (if your buffer count is N, the chances of losing
>> it would seem to be 1/N at most) and (b) highly used pages are much more
>> likely to be pinned and thus immune from eviction.  But my issue with
>> this whole line of analysis is that I've never been able to turn it
>> up in simulated testing.  Probably to do it you'd need very, very fast
>> storage.
>
> Well, if you have shared_buffers=8GB, that's a million buffers.  One
> in a million events happen pretty frequently on a heavily loaded
> server, which, on recent versions of PostgreSQL, can support several
> hundred thousand queries per second, each of which accesses multiple
> buffers.
>
> I've definitely seen evidence that poor choices of which CLOG buffer
> to evict can result in a noticeable system-wide stall while everyone
> waits for it to be read back in.  I don't have any similar evidence
> for shared buffers, but I wouldn't be very surprised if the same
> danger exists there, too.

That's a very fair point, although not being able to evict pinned
buffers is a highly mitigating factor.  Also, CLOG is a different beast
entirely -- it's much denser (2 bits per transaction!) than a tuple, so a
single page can hold a lot of high priority things.  But you could be
right anyway.
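
To put rough numbers on the 1/N argument and on the CLOG density, here is
a back-of-envelope sketch.  It assumes the default 8 kB block size; the
query rate and the accesses-per-query figure are made-up illustrations of
"several hundred thousand queries per second, each of which accesses
multiple buffers", not measurements.

```python
# Back-of-envelope numbers; the workload figures are illustrative assumptions.
BLCKSZ = 8192                          # default PostgreSQL block size, in bytes
shared_buffers_bytes = 8 * 1024**3     # shared_buffers=8GB, as in the quoted example

n_buffers = shared_buffers_bytes // BLCKSZ

# Assumed workload (hypothetical): 200,000 queries/s, 5 buffer accesses each.
queries_per_sec = 200_000
accesses_per_query = 5
accesses_per_sec = queries_per_sec * accesses_per_query

# If each access independently has a 1/N chance of hitting the bad case,
# the expected number of such events per second is accesses / N.
events_per_sec = accesses_per_sec / n_buffers

# CLOG density: 2 bits of status per transaction, so 4 transactions per byte.
clog_xacts_per_page = BLCKSZ * 8 // 2

print(n_buffers)                 # 1048576
print(round(events_per_sec, 2))  # 0.95
print(clog_xacts_per_page)       # 32768
```

Under those assumptions a "one in a million" event turns up about once a
second, and a single CLOG page covers the status of 32768 transactions,
which is why evicting the wrong one can hurt so many backends at once.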

Given that, I wouldn't feel very comfortable with forced eviction
without knowing for sure high priority buffers were immune from that.
Your nailing idea is maybe the ideal solution.   Messing around with
the usage_count mechanic is tempting (like raising the cap and making
the sweeper more aggressive as it iterates), but probably really
difficult to get right, and, hopefully, ultimately moot.
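
For anyone who wants to play with the usage_count idea, here is a toy
simulation of a clock sweep.  It is only a sketch, not the bufmgr code:
the class and parameter names (ClockSweep, max_usage, escalate) are
invented for illustration, and "more aggressive as it iterates" is
interpreted here as decrementing by one more on each full pass, which is
just one possible reading.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    usage_count: int = 0
    pinned: bool = False


class ClockSweep:
    """Toy clock-sweep replacement; names and knobs are hypothetical."""

    def __init__(self, n_buffers, max_usage=5, escalate=False):
        self.buffers = [Buffer() for _ in range(n_buffers)]
        self.max_usage = max_usage    # cap on usage_count ("raising the cap")
        self.escalate = escalate      # assumption: bigger decrement each pass
        self.hand = 0                 # next buffer the sweeper will look at
        self.last_scan_length = 0     # buffers examined by the last sweep

    def touch(self, i):
        """Record an access: bump usage_count, saturating at the cap."""
        b = self.buffers[i]
        b.usage_count = min(b.usage_count + 1, self.max_usage)

    def find_victim(self):
        """Return the index of the first unpinned buffer with usage_count 0."""
        n = len(self.buffers)
        if all(b.pinned for b in self.buffers):
            raise RuntimeError("no unpinned buffers available")
        step = 1
        scanned = 0
        while True:
            i = self.hand
            b = self.buffers[i]
            self.hand = (self.hand + 1) % n
            if not b.pinned:          # pinned buffers are never evicted
                if b.usage_count == 0:
                    self.last_scan_length = scanned + 1
                    return i
                b.usage_count = max(b.usage_count - step, 0)
            scanned += 1
            if self.escalate and scanned % n == 0:
                step += 1             # decrement harder after each full pass


if __name__ == "__main__":
    for escalate in (False, True):
        cs = ClockSweep(2, max_usage=10, escalate=escalate)
        for _ in range(10):
            cs.touch(0)
            cs.touch(1)
        victim = cs.find_victim()
        print(escalate, victim, cs.last_scan_length)
```

With a higher cap the plain sweeper needs many passes before anything
reaches zero, while the escalating variant gets there sooner but forgets
the difference between hot and merely warm buffers faster, which is the
sort of trade-off that makes this hard to tune.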
merlin


