
From: Merlin Moncure
Subject: Re: StrategyGetBuffer optimization, take 2
Msg-id: CAHyXU0yeDSf5SMKKS_xET5a2+iTEiiryiGL501Jcs_3Lwim2GA@mail.gmail.com
In response to: Re: StrategyGetBuffer optimization, take 2 (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: StrategyGetBuffer optimization, take 2 (Amit Kapila <amit.kapila@huawei.com>)
List: pgsql-hackers
On Wed, Aug 7, 2013 at 12:07 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-08-07 09:40:24 -0500, Merlin Moncure wrote:
>> > I don't think the unlocked increment of nextVictimBuffer is a good idea
>> > though. nextVictimBuffer jumping over NBuffers under concurrency seems
>> > like a recipe for disaster to me. At the very, very least it will need a
>> > good wad of comments explaining what it means and how you're allowed to
>> > use it. The current way will lead to at least bgwriter accessing a
>> > nonexistent/out-of-bounds buffer via StrategySyncStart().
>> > Possibly it won't even save that much, it might just increase the
>> > contention on the buffer header spinlock's cacheline.
>>
>> I agree; at least then it's not unambiguously better. If you (in
>> effect) swap all contention on allocation from an lwlock to a
>> spinlock, it's not clear you're improving things; that would have to
>> be proven, and I'm trying to keep things simple.
>
> I think converting it to a spinlock actually is a good idea, you just
> need to expand the scope a bit.

All right: I'll work up another version doing a full spinlock and see
how things shake out in performance.
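
The rough idea is to keep the advance and the wraparound check inside
the same spinlock-protected critical section, so nextVictimBuffer can
never be observed outside [0, NBuffers) -- which also answers the
StrategySyncStart() concern above. A minimal sketch (field names follow
freelist.c; the spinlock itself, called clock_sweep_lock here, is a
hypothetical slock_t added to StrategyControl):

/*
 * Sketch only -- assumes the usual backend includes (storage/s_lock.h,
 * storage/buf_internals.h); clock_sweep_lock is hypothetical.
 */
static int
ClockSweepAdvance(void)
{
    int     victim;

    SpinLockAcquire(&StrategyControl->clock_sweep_lock);
    victim = StrategyControl->nextVictimBuffer;
    if (++StrategyControl->nextVictimBuffer >= NBuffers)
    {
        /* wrap inside the lock so no reader sees an out-of-range value */
        StrategyControl->nextVictimBuffer = 0;
        StrategyControl->completePasses++;
    }
    SpinLockRelease(&StrategyControl->clock_sweep_lock);

    return victim;
}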

> FWIW, I am not convinced this is the trigger for the problems you're
> seeing. It's a good idea nonetheless though.

I have some very strong evidence that the problem is coming out of the
buffer allocator.  Exhibit A is that Vlad's presentation of the problem
was on a read-only load (if not the allocator lock, then what?).
Exhibit B is that lowering shared_buffers to 2GB seems to have (so far,
five days in) fixed the issue.  This problem shows up on fast machines
with fast storage and lots of cores.  So what I think is happening is
that, with very large buffer settings, usage_count creeps up faster
than the sweep clears it, which in turn causes the 'problem' buffers to
be examined for eviction more often.  What is less clear is whether the
proposed optimizations will fix the problem -- I'd have to get approval
to test and confirm them in production, which seems unlikely at this
juncture; that's why I'm trying to keep the changes 'win-win', so they
don't have to be accepted on that basis.
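
To make the mechanism concrete: each complete sweep pass decrements an
unpinned buffer's usage_count by at most one, while every buffer hit
bumps it (capped at BM_MAX_USAGE_COUNT, i.e. 5).  If hits outrun sweep
passes -- easy with tens of gigabytes of shared buffers and lots of
cores -- most buffers sit at the cap, and each allocation has to grind
through a long run of buffer header spinlocks before it finds a victim.
Here's a simplified view of the sweep loop in StrategyGetBuffer()
(freelist.c), reusing the hypothetical ClockSweepAdvance() from above,
with the freelist path and error handling omitted:

    for (;;)
    {
        volatile BufferDesc *buf;

        buf = &BufferDescriptors[ClockSweepAdvance()];

        LockBufHdr(buf);            /* per-buffer header spinlock */
        if (buf->refcount == 0)
        {
            if (buf->usage_count > 0)
                buf->usage_count--; /* decay one notch; revisit next pass */
            else
                return buf;         /* victim; returned header-locked */
        }
        UnlockBufHdr(buf);
    }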

merlin


