Re: StrategyGetBuffer optimization, take 2 - Mailing list pgsql-hackers
| From | Amit Kapila |
|---|---|
| Subject | Re: StrategyGetBuffer optimization, take 2 |
| Msg-id | 006501ce93f3$216646d0$6432d470$@kapila@huawei.com |
| In response to | Re: StrategyGetBuffer optimization, take 2 (Merlin Moncure <mmoncure@gmail.com>) |
| List | pgsql-hackers |
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Merlin Moncure
> Sent: Thursday, August 08, 2013 12:09 AM
> To: Andres Freund
> Cc: PostgreSQL-development; Jeff Janes
> Subject: Re: [HACKERS] StrategyGetBuffer optimization, take 2
>
> On Wed, Aug 7, 2013 at 12:07 PM, Andres Freund <andres@2ndquadrant.com>
> wrote:
> > On 2013-08-07 09:40:24 -0500, Merlin Moncure wrote:
> >> > I don't think the unlocked increment of nextVictimBuffer is a good
> >> > idea though. nextVictimBuffer jumping over NBuffers under
> >> > concurrency seems like a recipe for disaster to me. At the very,
> >> > very least it will need a good wad of comments explaining what it
> >> > means and how you're allowed to use it. The current way will lead
> >> > to at least bgwriter accessing a nonexistent/out-of-bounds buffer
> >> > via StrategySyncStart().
> >> > Possibly it won't even save that much, it might just increase the
> >> > contention on the buffer header spinlock's cacheline.
> >>
> >> I agree; at least then it's not unambiguously better. If you (in
> >> effect) swap all contention on allocation from an lwlock to a
> >> spinlock, it's not clear whether you're improving things; it would
> >> have to be proven, and I'm trying to keep things simple.
> >
> > I think converting it to a spinlock actually is a good idea, you just
> > need to expand the scope a bit.
>
> All right: well, I'll work up another version doing a full spinlock and
> see how things shake out in performance.
>
> > FWIW, I am not convinced this is the trigger for the problems you're
> > seeing. It's a good idea nonetheless though.
>
> I have some very strong evidence that the problem is coming out of the
> buffer allocator. Exhibit A is that vlad's presentation of the problem
> was on a read-only load (if not the allocator lock, then what?).
> Exhibit B is that lowering shared_buffers to 2GB seems to have (so far,
> 5 days in) fixed the issue. This problem shows up on fast machines with
> fast storage and lots of cores. So what I think is happening is that
> usage_count starts creeping up faster than it gets cleared by the sweep
> with very large buffer settings, which in turn causes the 'problem'
> buffers to be analyzed for eviction more often.

Yes, one idea which was discussed previously is to not increase the usage
count every time a buffer is pinned. I am also working on some
optimizations in a similar area, which you can refer to here:
http://www.postgresql.org/message-id/006e01ce926c$c7768680$56639380$@kapila@huawei.com

> What is not as clear is whether the proposed optimizations will fix the
> problem -- I'd have to get approval to test and confirm them in
> production, which seems unlikely at this juncture; that's why I'm
> trying to keep things 'win-win' so that they don't have to be accepted
> on that basis.

With Regards,
Amit Kapila.
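For readers following the thread, below is a minimal sketch of the clock-sweep victim search being debated. It is not PostgreSQL's actual freelist.c: the names (next_victim ~ nextVictimBuffer, usage_count, NBUFFERS ~ NBuffers) mirror the real code, but the structures, constants, and atomic wrap-around handling are simplified assumptions made purely for illustration.

```c
/*
 * A minimal sketch, assuming simplified structures, of a clock-sweep
 * buffer search.  NOT PostgreSQL's freelist.c; illustrative only.
 */
#include <stdatomic.h>
#include <stdbool.h>

#define NBUFFERS        1024    /* power of two, so the modulo below stays
                                 * correct across unsigned wraparound */
#define MAX_USAGE_COUNT 5

typedef struct
{
    atomic_int  usage_count;    /* bumped on pin, decayed by the sweep */
    atomic_bool pinned;         /* true while some backend holds a pin */
} BufferDescSketch;

static BufferDescSketch buffers[NBUFFERS];

/*
 * Unlocked clock hand: concurrent fetch-and-adds push the raw counter
 * past NBUFFERS, so every reader must wrap it before indexing.  This is
 * the hazard raised above: a reader like StrategySyncStart() that used
 * the raw value would access a nonexistent buffer.
 */
static atomic_uint next_victim;

/* Called when a backend pins a buffer; saturates at MAX_USAGE_COUNT. */
static void
pin_buffer(int id)
{
    atomic_store(&buffers[id].pinned, true);
    if (atomic_load(&buffers[id].usage_count) < MAX_USAGE_COUNT)
        atomic_fetch_add(&buffers[id].usage_count, 1);
}

static int
clock_sweep_get_buffer(void)
{
    for (;;)
    {
        /* Advance the hand; wrap the raw counter before indexing. */
        unsigned int raw = atomic_fetch_add(&next_victim, 1);
        BufferDescSketch *buf = &buffers[raw % NBUFFERS];

        if (atomic_load(&buf->pinned))
            continue;           /* in use: skip it */

        /*
         * Decay the usage count.  If backends pin buffers faster than
         * the sweep decays them (the usage_count creep described above),
         * most iterations land here and the search keeps looping.
         */
        if (atomic_load(&buf->usage_count) > 0)
        {
            atomic_fetch_sub(&buf->usage_count, 1);
            continue;
        }

        /* usage_count reached zero: evict this buffer. */
        return (int) (raw % NBUFFERS);
    }
}
```

The alternative raised in the thread is to keep nextVictimBuffer strictly within [0, NBuffers) under a spinlock, so no reader ever sees an out-of-range value; the wrap-on-read approach sketched here instead relies on an invariant every caller must know about, which is exactly the trade-off being debated.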