Re: StrategyGetBuffer questions - Mailing list pgsql-hackers
From: Amit Kapila
Subject: Re: StrategyGetBuffer questions
Date:
Msg-id: 006101cdc863$0e030990$2a091cb0$@kapila@huawei.com
In response to: Re: StrategyGetBuffer questions (Merlin Moncure <mmoncure@gmail.com>)
List: pgsql-hackers
On Thursday, November 22, 2012 3:26 AM Merlin Moncure wrote:
> On Tue, Nov 20, 2012 at 4:50 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> > On Tue, Nov 20, 2012 at 1:26 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> >> In this sprawling thread on scaling issues [1], the topic meandered
> >> into StrategyGetBuffer() -- in particular the clock sweep loop. I'm
> >> wondering:
> >>
> >> *) If there shouldn't be a bound in terms of how many candidate
> >> buffers you're allowed to skip for having a non-zero usage count.
> >> Whenever an unpinned usage_count>0 buffer is found, trycounter is
> >> reset (!) so that the code operates from the point of view as if it
> >> had just entered the loop. There is an implicit assumption that this
> >> is rare, but how rare is it?
> >
> > How often would the trycounter hit zero if it were not reset? I've
> > instrumented something like that in the past, and could only get it
> > to fire under pathologically small shared_buffers and workloads that
> > caused most of them to be pinned simultaneously.
>
> well, it's basically impossible -- and that's what I find odd.
>
> >> *) Shouldn't StrategyGetBuffer() bias down usage_count if it finds
> >> itself examining too many unpinned buffers per sweep?
> >
> > It is a self-correcting problem. If it is examining a lot of unpinned
> > buffers, it is also decrementing a lot of them.
>
> sure. but it's entirely plausible that some backends are marking up
> usage_count rapidly and not allocating buffers while others are doing
> a lot of allocations. point being: all it takes is one backend to get
> scheduled out while holding the freelist lock to effectively freeze
> the database for many operations.

True, I have observed this as well. The case I tried was one where 50%
of the buffers are always used for some hot table (a table in high
demand) and the remaining 50% are used for normal operations. In such
cases, contention on BufFreelistLock can be observed.
The same can be seen in the report I attached in the mail below:
http://archives.postgresql.org/pgsql-hackers/2012-11/msg01147.php

> it's been documented [1] that particular buffers can become spinlock
> contention hot spots due to reference counting of the pins. if a lot
> of allocation is happening concurrently, it's only a matter of time
> before the clock sweep rolls around to one of them, hits the spinlock,
> and (in the worst case) schedules out. this could in turn shut down
> the clock sweep for some time while non-allocating backends beat on
> established buffers and pump up usage counts.
>
> The reference counting problem might be alleviated in some fashion,
> for example via Robert's idea to disable reference counting under
> contention [2]. Even if you do that, you're still in for a world of
> hurt if you get scheduled out of a buffer allocation. Your patch
> fixes that AFAICT. The buffer pin check is outside the wider lock,
> making my suggestion to be less rigorous about usage_count a lot less
> useful (but perhaps not completely useless!).
>
> Another innovation might be to implement a 'trylock' variant of
> LockBufHdr that does a TAS but doesn't spin -- if someone else has the
> header locked, why bother waiting for it? just skip to the next and
> move on.

I think this is a reasonable idea. How about having a Hot end and a
Cold end in the buffer list, so that if some buffers are heavily used,
other backends need not pay the penalty of traversing the hot buffers?

With Regards,
Amit Kapila.