Re: Thinking about breaking up the BufMgrLock - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Thinking about breaking up the BufMgrLock
Date
Msg-id 18180.1107741078@sss.pgh.pa.us
Whole thread Raw
In response to Re: Thinking about breaking up the BufMgrLock  (Neil Conway <neilc@samurai.com>)
List pgsql-hackers
Neil Conway <neilc@samurai.com> writes:
> We're only concerned with a buffer's access recency when it is on the
> free list, right?

Right; pinned buffers are not logically part of the freelist.  (Whether
they are so physically is open to choice.)

Another free variable, AFAICS, is whether to do the move-to-front during
ReadBuffer or during ReleaseBuffer.  You could possibly argue that the
latter would be slightly more accurate, but it probably doesn't matter
much.

> This means we need to acquire the LRU lock once for each acquire/release
> pair on a previously-unpinned buffer (rather than twice, if we had to
> acquire the LRU lock on acquire as well).

I think this is true of any scheme that separates the lookup table from
LRU management: you hit the LRU overhead during acquire, or during
release, but not both.  I suspect that we need more than a 2X reduction
in contention to fix the problem, though.

> What operations does 2Q require on the shared lists?

Remove from list, insert at head of list (often but not always paired as
a move-to-front operation).  I think we also have an insert-at-tail case
for VACUUM.  One difficulty is that we need to maintain a list length
counter, which is a central point of contention except for the
move-to-front-of-same-list case.

> For example, to remove a node from the middle of a linked
> list can be done via atomic CAS; you need to redo the CAS in the
> presence of a concurrent modification of the particular node you're
> trying to modify, but that means we are contending over individual list
> nodes, not the list as a whole.

Unfortunately, most of the time you need to touch the list head, so I'm
unclear that this buys us much, at least if we try to preserve true
"move to front" behavior.  It might be interesting to consider "move
ahead one place" instead of "move to front" --- that would avoid the
pressure on the list head.  Again you have to worry whether you're
blowing off too much performance in the cache management algorithm...
        regards, tom lane


pgsql-hackers by date:

Previous
From: Neil Conway
Date:
Subject: Re: Thinking about breaking up the BufMgrLock
Next
From: Greg Stark
Date:
Subject: Re: Thinking about breaking up the BufMgrLock