On Mon, Jan 17, 2022 at 11:23 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> IIRC one issue with this has been performance. When an SLRU is working
> well, a cache hit in the SLRU is very cheap. Linearly scanning the SLRU
> array is cheap, compared to computing the hash and looking up a buffer
> in the buffer cache. Would be good to do some benchmarking of that.
One trick I want to experiment with is trying to avoid the mapping
table lookup by using ReadRecentBuffer(). In the prototype patch I do
that for one buffer per SLRU cache (so that'll often point to the head
CLOG page), but it could be extended to a small array of recently
accessed buffers to scan linearly, much like the current SLRU happy
case except that it's not shared and doesn't need a lock so it's even
happier. I'm half joking here, but that would let us keep calling
this subsystem SLRU :-)
> I wanted to do this with the CSN (Commit Sequence Number) work a long
> time ago. That would benefit from something like an SLRU with very fast
> access, to look up CSNs. Even without CSNs, if the CLOG was very fast to
> access we might not need hint bits anymore. I guess trying to make SLRUs
> faster than they already are is moving the goalposts, but at a minimum
> let's make sure they don't get any slower.
One idea I've toyed with is putting a bitmap into shared memory where
each bit corresponds to a range of (say) 256 xids, accessed purely
with atomics. If a bit is set we know they're all committed, and
otherwise you have to do more pushups. I've attached a quick and
dirty experimental patch along those lines, but I don't think it's
quite right, especially with people waving 64 bit xid patches around.
Perhaps it'd need to be a smaller sliding window, somehow, and perhaps
people will balk at the wild and unsubstantiated assumption that
transactions generally commit.