On Sun, 2005-02-06 at 19:30 -0500, Tom Lane wrote:
> This would be pretty good from a locking point of view, except that
> "update the LRU state" seems to require taking an exclusive lock on a
> global data structure, which puts us about back where we were.
We're only concerned with a buffer's access recency when it is on the
free list, right? (That is certainly true with naive LRU; I confess I
haven't had a chance to grok the 2Q paper yet). If so:
- when ReadBuffer() is called on an in-cache but previously-free buffer,
increment the buffer's ref count; this logically removes it from the
LRU, but does not actually do the physical list removal.
- when we need to fault in a page from disk, acquire the LRU lock and
select a buffer for replacement. At this point we can do some physical
cleanup of the free list, by removing buffers with a non-zero refcount
from the free list. We can do this cleanup relatively cheaply now,
because we had to acquire the LRU lock anyway, and this is the slow path
of ReadBuffer(). This work might also be done by the bgwriter (removing
clean but pinned buffers from close to the LRU of the free list).
- we need only update a pinned buffer's LRU position when its shared
refcount drops to zero, not on every ReleaseBuffer()
This means we need to acquire the LRU lock once for each acquire/release
pair on a previously-unpinned buffer (rather than twice, if we had to
acquire the LRU lock on acquire as well). Not sure if that's enough to
remove the contention problem; on the plus side, it doesn't change the
actual behavior of the replacement policy.
> Perhaps someone knows of a neat data structure that can maintain an LRU
> list with only local updates? I don't though.
What operations does 2Q require on the shared lists? (Assuming that's
the replacement policy we're going with.) Depending on how complex the
list modifications are, non-blocking algorithms might be worth
considering. For example, to remove a node from the middle of a linked
list can be done via atomic CAS; you need to redo the CAS in the
presence of a concurrent modification of the particular node you're
trying to modify, but that means we are contending over individual list
nodes, not the list as a whole.
-Neil