Re: Page replacement algorithm in buffer cache - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Page replacement algorithm in buffer cache |
Date | |
Msg-id | 007001ce3274$05774220$1065c660$@kapila@huawei.com |
In response to | Re: Page replacement algorithm in buffer cache (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Saturday, April 06, 2013 12:38 AM Robert Haas wrote:
> On Fri, Apr 5, 2013 at 1:12 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
> > If we just put it to freelist, then next time if it get allocated directly
> > from bufhash table, then who will remove it from freelist
> > or do you think that, in BufferAlloc, if it gets from bufhash table, then it
> > should verify if it's in freelist, then remove from freelist.
>
> No, I don't think that's necessary. We already have the following
> guard in StrategyGetBuffer:
>
>     if (buf->refcount == 0 && buf->usage_count == 0)
>     {
>         if (strategy != NULL)
>             AddBufferToRing(strategy, buf);
>         return buf;
>     }
>
> If a buffer is allocated from the freelist and it turns out that it
> actually has a non-zero reference count or a non-zero pin count, we
> just discard it and pull the next buffer off the freelist instead.
> So, in the scenario you describe, the buffer gets reallocated (due to
> a non-NULL BufferAccessStrategy, presumably) and then somebody comes
> along and pulls it off the freelist. But, since the buffer has just
> been used by someone else, it'll most likely be pinned or have a
> non-zero usage count, so we'll just skip it and allocate some other
> buffer instead. No harm done.

Yes, you are right. I missed that part of the code while thinking about this
scenario, but I was talking about a NULL BufferAccessStrategy as well.

I still have one more doubt. Consider the scenario below, comparing the case
where we invalidate buffers while moving them to the freelist with the case
where we just move them to the freelist:

A backend gets a buffer from the freelist for a request for page-9 (the number
9 is arbitrary, just for explanation), and the buffer is still associated with
another page, page-10. The backend needs to add the buffer with the new tag
(new page association) to the bufhash table and remove the entry with the
oldTag (old page association).

The benefit of just moving the buffer to the freelist is that if we get a
request for the same page before somebody else has used the buffer for another
page, it saves a read I/O. On the other hand, in many cases the backend will
need an extra partition lock to remove the oldTag, which can lead to a
bottleneck. I think saving the read I/O is more beneficial, but I am not sure
it is the best choice, as such cases might be rare.

> Now, it is possible that the buffer could get added to the freelist,
> then allocated via a BufferAccessStrategy, and then the clock sweep
> could hit it and push the usage count back to 0. But that's no big
> deal either: if we go to put it on the freelist and see (via
> buf->freeNext) that it's already there, we can just leave it where it
> is (or maybe move it to the end). On a related note, we probably need
> a variant of StrategyFreeBuffer which pushes buffers onto the end of
> the freelist rather than the front. It makes sense to stick
> invalidated buffers on the front of the list (which is what
> StrategyFreeBuffer does), but non-invalidated buffers should be placed
> at the end to more closely approximate LRU.

Okay.

Last time, the following tests were executed to validate the results:

Test suite     - pgbench
DB Size        - 16 GB
RAM            - 24 GB
Shared Buffers - 2G, 5G, 7G, 10G
Concurrency    - 8, 16, 32, 64 clients
Pre-warm the buffers before the start of the test.

Shall we try any other scenarios, or are the above okay for an initial test of
the patch?

With Regards,
Amit Kapila.
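
For reference, the freelist behaviour discussed in this message (skip buffers that were reused while still on the list, leave a buffer alone if it is already listed, and push non-invalidated buffers to the tail) could be sketched roughly as below. This is only a simplified model under assumed names: SketchBuf, SketchFreeList and the sketch_* functions are hypothetical and are not PostgreSQL's BufferDesc or freelist.c code, and all locking is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; NOT PostgreSQL's real BufferDesc. */
#define NOT_IN_LIST (-2)   /* buffer is not on the freelist */
#define END_OF_LIST (-1)   /* buffer is the last freelist entry */

typedef struct SketchBuf {
    int refcount;     /* pins currently held */
    int usage_count;  /* clock-sweep usage counter */
    int free_next;    /* next freelist index, or one of the markers above */
} SketchBuf;

typedef struct SketchFreeList {
    SketchBuf *bufs;
    int head;
    int tail;
} SketchFreeList;

static void sketch_init(SketchFreeList *fl, SketchBuf *bufs, int n)
{
    assert(fl != NULL && bufs != NULL);
    fl->bufs = bufs;
    fl->head = END_OF_LIST;
    fl->tail = END_OF_LIST;
    for (int i = 0; i < n; i++) {
        bufs[i].refcount = 0;
        bufs[i].usage_count = 0;
        bufs[i].free_next = NOT_IN_LIST;
    }
}

/* Invalidated buffers go to the front so they are reused first. */
static void sketch_push_head(SketchFreeList *fl, int id)
{
    SketchBuf *b = &fl->bufs[id];
    if (b->free_next != NOT_IN_LIST)
        return;                       /* already listed: leave it in place */
    b->free_next = fl->head;
    if (fl->head == END_OF_LIST)
        fl->tail = id;
    fl->head = id;
}

/* Non-invalidated buffers go to the end, to approximate LRU. */
static void sketch_push_tail(SketchFreeList *fl, int id)
{
    SketchBuf *b = &fl->bufs[id];
    if (b->free_next != NOT_IN_LIST)
        return;                       /* already listed: leave it in place */
    b->free_next = END_OF_LIST;
    if (fl->tail == END_OF_LIST)
        fl->head = id;
    else
        fl->bufs[fl->tail].free_next = id;
    fl->tail = id;
}

/* Pop entries until an unused one is found; returns -1 if none is left. */
static int sketch_get_buffer(SketchFreeList *fl)
{
    while (fl->head != END_OF_LIST) {
        int id = fl->head;
        SketchBuf *b = &fl->bufs[id];

        fl->head = b->free_next;
        if (fl->head == END_OF_LIST)
            fl->tail = END_OF_LIST;
        b->free_next = NOT_IN_LIST;

        if (b->refcount == 0 && b->usage_count == 0)
            return id;
        /* Reused while it sat on the list: discard it and try the next. */
    }
    return -1;
}
```

In this model a buffer that was picked up again through the hash table simply drops off the list the next time it reaches the head, so nobody has to remove it from the freelist explicitly.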