Re: Clock with Adaptive Replacement - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Clock with Adaptive Replacement
Msg-id CA+TgmobLDVTsCfGOOA8HN1s5LjQCGfOCLK7LQgB-vXfJmaQJZg@mail.gmail.com
In response to Clock with Adaptive Replacement  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-hackers
On Thu, Feb 11, 2016 at 4:02 PM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> What do you think about improving the clock-sweep cache replacement algorithm
> in PostgreSQL with the adaptive version proposed in this article:
>
>     http://www-cs.stanford.edu/~sbansal/pubs/fast04.pdf
>
> Are there some well-known drawbacks of this approach, or would it be
> interesting to adapt this algorithm to PostgreSQL and measure its impact on
> performance under different workloads?
> I found this ten-year-old thread:
>
> http://www.postgresql.org/message-id/flat/d2jkde$6bg$1@sea.gmane.org#d2jkde$6bg$1@sea.gmane.org
>
> but it mostly discusses possible patent issues with another algorithm, ARC
> (CAR is inspired by ARC, but it is a different algorithm).
> As far as I know there are several problems with the current clock-sweep
> algorithm in PostgreSQL, especially for very large caches.
> Maybe CAR can address some of them?

Maybe, but the proof of the pudding is in the eating.  Just because an
algorithm is smarter, newer, and better in general than our current
algorithm - and really, it wouldn't be hard - doesn't mean that it
will actually solve the problems we care about.  A few of my
EnterpriseDB colleagues spent a lot of time last year benchmarking
various tweaks to our current algorithm and were unable to construct a
test case where any of them sped anything up.  When they tried the same
tweaks against the 9.4 source base, they could get a speedup.  But 9.5 had
locking improvements around buffer eviction, and with those
improvements committed there was no longer any measurable benefit to
improving the quality of buffer eviction decisions.  That's a
surprising result, to me anyway, and somebody else might well find a
test case where a benefit can be shown - but our research was not
successful.

I think it's important to spend time and energy figuring out exactly
what the problems with our current algorithm are.  We know in general
terms that usage counts tend to converge to either 5 or 0 and that we
therefore sometimes evict buffers both at great cost and almost
randomly.  But what's a lot less clear is how much that actually hurts
us given that we are relying on the OS cache anyway.  It may be that
we need to fix some other things, whether before or after improving the
buffer eviction algorithm, before we actually see a performance benefit.  I
suspect, for example, that a lot of the problems with large
shared_buffers settings have to do with the bgwriter and checkpointer
behavior rather than with the buffer eviction algorithm; and that
others have to do with cache duplication between PostgreSQL and the
operating system.  So, I would suggest (although of course it's up to
you) that you might want to focus on experiments that will help you
understand where the problems are before you plunge into writing code
to fix them.
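
For anyone following along who wants something concrete to experiment
with, here is a toy sketch of a clock sweep with capped usage counts.
It is not PostgreSQL's code: the class and method names are made up for
illustration, starting a newly loaded page at a count of 1 is an
assumption of this sketch, and pins, dirty buffers, and locking are
ignored entirely.

```python
MAX_USAGE = 5  # cap on the per-buffer usage count discussed above


class ClockSweepCache:
    """Toy clock-sweep buffer pool (hypothetical; no pins, dirty pages, or locks)."""

    def __init__(self, nbuffers):
        self.pages = [None] * nbuffers  # page id held by each buffer slot
        self.usage = [0] * nbuffers     # usage count of each buffer slot
        self.where = {}                 # page id -> buffer slot index
        self.hand = 0                   # current clock hand position

    def access(self, page):
        """Touch a page; return True on a hit, False on a miss."""
        idx = self.where.get(page)
        if idx is not None:
            # Hit: bump the usage count, saturating at the cap.
            self.usage[idx] = min(self.usage[idx] + 1, MAX_USAGE)
            return True
        # Miss: pick a victim slot and load the page into it.
        idx = self._find_victim()
        old = self.pages[idx]
        if old is not None:
            del self.where[old]
        self.pages[idx] = page
        self.usage[idx] = 1  # assumption: new pages start with count 1
        self.where[page] = idx
        return False

    def _find_victim(self):
        """Advance the hand, decrementing counts, until a zero-count slot is found."""
        n = len(self.pages)
        while True:
            idx = self.hand
            self.hand = (self.hand + 1) % n
            if self.usage[idx] == 0:
                return idx
            self.usage[idx] -= 1
```

Feeding a recorded or synthetic page-access trace through access() and
then looking at the distribution of values in usage is one cheap way to
see how quickly the counts pile up at the two extremes for a given
workload, before investing in changes to the real buffer manager.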

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company