Re: our buffer replacement strategy is kind of lame - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: our buffer replacement strategy is kind of lame
Msg-id 4F197A48.20606@enterprisedb.com
In response to Re: our buffer replacement strategy is kind of lame  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: our buffer replacement strategy is kind of lame  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 03.01.2012 17:56, Simon Riggs wrote:
> On Tue, Jan 3, 2012 at 3:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>>> 2. When a backend can't find a free buffer, it spins for a long time
>>> while holding the lock. This makes the buffer strategy O(N) in its
>>> worst case, which slows everything down. Notably, while this is
>>> happening the bgwriter sits doing nothing at all, right at the moment
>>> when it is most needed. The Clock algorithm is an approximation of an
>>> LRU, so is already suboptimal as a "perfect cache". Tweaking to avoid
>>> worst case behaviour makes sense. How much to tweak? Well,...
>>
>> I generally agree with this analysis, but I don't think the proposed
>> patch is going to solve the problem.  It may have some merit as a way
>> of limiting the worst case behavior.  For example, if every shared
>> buffer has a reference count of 5, the first buffer allocation that
>> misses is going to have a lot of work to do before it can actually
>> come up with a victim.  But I don't think it's going to provide good
>> scaling in general.  Even if the background writer only spins through,
>> on average, ten or fifteen buffers before finding one to evict, that
>> still means we're acquiring ten or fifteen spinlocks while holding
>> BufFreelistLock. I don't currently have the measurements to prove
>> that's too expensive, but I bet it is.
>
> I think it's worth reducing the cost of scanning, but that has little
> to do with solving the O(N) problem. I think we need both.
>
> I've left the way open for you to redesign freelist management in many
> ways. Please take the opportunity and go for it, though we must
> realise that major overhauls require significantly more testing to
> prove their worth. Reducing spinlocking only sounds like a good way to
> proceed for this release.
>
> If you don't have time in 9.2, then these two small patches are worth
> having. The bgwriter locking patch needs less formal evidence to show
> its worth. We simply don't need to have a bgwriter that just sits
> waiting doing nothing.

I'd like to see some benchmarks that show a benefit from these patches
before committing something like this that complicates the code. The
patches are fairly small, but they add complexity nevertheless. Once we
have a test case, we can argue whether the benefit we're seeing is worth
the extra code, or if there's some better way to achieve it.
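
To make the worst case quoted above concrete (every shared buffer
carrying a usage count of 5), here is a toy model of a clock sweep. It is
only a sketch for reasoning about the numbers, not PostgreSQL's actual
code: the names Buffer, clock_sweep and MAX_USAGE_COUNT are made up for
this illustration, and locking is not modelled at all.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption for this sketch: usage counts are capped at 5, matching the
# "reference count of 5" example in the quoted text.
MAX_USAGE_COUNT = 5


@dataclass
class Buffer:
    usage_count: int = 0
    pinned: bool = False


def clock_sweep(buffers: List[Buffer], hand: int) -> Tuple[int, int, int]:
    """Pick a victim buffer with a simple clock sweep.

    Returns (victim_index, new_hand, buffers_examined). Each examined
    buffer that is unpinned and has a non-zero usage count gets its count
    decremented and is skipped; the first unpinned buffer found with a
    count of zero becomes the victim.
    """
    n = len(buffers)
    examined = 0
    # An unpinned buffer reaches zero after at most MAX_USAGE_COUNT visits,
    # so this many examinations without a victim means all are pinned.
    limit = n * (MAX_USAGE_COUNT + 1)
    while examined < limit:
        idx = hand
        buf = buffers[idx]
        hand = (hand + 1) % n
        examined += 1
        if buf.pinned:
            continue
        if buf.usage_count == 0:
            return idx, hand, examined
        buf.usage_count -= 1
    raise RuntimeError("no unpinned buffer available")


if __name__ == "__main__":
    n = 1000
    pool = [Buffer(usage_count=MAX_USAGE_COUNT) for _ in range(n)]
    victim, hand, examined = clock_sweep(pool, 0)
    # Five full passes to drain the counts, plus one more buffer.
    print(victim, examined)  # prints "0 5001"
```

In this model the first allocation that misses examines 5*N + 1 buffers
before it finds a victim. If each examination stands for one buffer
spinlock taken while BufFreelistLock is held, as described in the quoted
text, that count is the number a real benchmark would need to put a cost
on.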

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

