Re: [HACKERS] Clock with Adaptive Replacement - Mailing list pgsql-hackers
From | Peter Geoghegan
---|---
Subject | Re: [HACKERS] Clock with Adaptive Replacement
Date |
Msg-id | CAH2-Wz=NvPOCL3_OcFR_6k6KKcK7GqhR-dqi4QfB28MD5NiycA@mail.gmail.com
In response to | Re: [HACKERS] Clock with Adaptive Replacement (Robert Haas <robertmhaas@gmail.com>)
Responses | Re: [HACKERS] Clock with Adaptive Replacement
| Re: [HACKERS] Clock with Adaptive Replacement
List | pgsql-hackers
On Tue, May 1, 2018 at 11:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> That's not to say that the total of all usage counts must remain equal
> to a constant or something dumb like that -- there's decisions to be
> made in terms of how you implement things. For example, you can
> imagine a system where every time we would bump the usage count of a
> buffer that already has a usage count of 5, we would instead advance
> the clock hand by 1 buffer. In such a system the total of the usage
> counts can fluctuate, but you can't peg everything at 5 because
> there's back-pressure built into the algorithm.

This seems to be an old idea. See "Data Cache Management Using
Frequency-Based Replacement", a 1990 paper that is referenced by the
ARC paper. It says:

"When a block is first brought into the cache, its reference count is
initialized to one. Previous algorithms using reference counts have
incremented the count for a given block on every reference to that
block. This results in the following kind of problem: certain blocks
are relatively infrequently referenced overall, and yet when they are
referenced, due to locality there are short intervals of repeated
re-references, thus building up high reference counts. After such an
interval is over, the high reference count is misleading: it is due to
locality, and cannot be used to estimate the probability that such a
block will be re-referenced following the end of this interval (this
description is intuitive, since in practice such intervals are not as
well-defined as the preceding discussion may suggest)."

It goes on to propose a solution that I gather isn't so far from your
thought experiment design for fixing the usage_count saturation
problem:

"More precisely, we dynamically maintain the sum of all reference
counts. Whenever the average reference count exceeds a predetermined
maximum value, A-max (another parameter of the algorithm), every
reference count C is reduced to C/2. Thus, in the steady state the sum
of all reference counts stays between N x A-max/2 and N x A-max (where
N is the number of cache blocks), and the reference count of a block
is approximately linearly related to the frequency of references to
the block while it is in the middle and old sections. Note that in
this reduction of reference counts, a count of one remains at one, a
count of two is reduced to one, a count of three is reduced to two,
etc."

This 1990 paper only tries to solve the correlated references problem
that we see when using CLOCK to approximate an LRU. That is not to be
confused with actually weighing frequency in a clever way, which
AFAICT is a more general idea that didn't come along until the LRU-K
paper a couple of years later. In other words, the 1990 paper only
cares about making sure that CLOCK actually manages to approximate an
LRU when reference counts are messed up by correlated references.
That's all. This really is a distinct problem from the one that ARC
and CAR aim to solve (2Q and LRU-K must be in the same category, since
IIRC they have ghost lists). CART tries to solve both problems at the
same time, which suggests that they're independent problems.

In summary: I see two clearly distinct problems here, though they are
superficially similar. We have both problems.
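To make that concrete, here's a minimal sketch of the paper's
reduction step. The array, the names, and the A_MAX value are all mine,
invented for illustration; only the trigger condition and the
ceiling-style halving come from the paper:

    #define N_BLOCKS    1024    /* N in the paper: number of cache blocks */
    #define A_MAX       4       /* the paper's A-max parameter (value arbitrary) */

    static int  refcount[N_BLOCKS];
    static long refcount_sum;   /* maintained incrementally on each reference */

    static void
    maybe_reduce_refcounts(void)
    {
        /* Has the average reference count exceeded A-max? */
        if (refcount_sum > (long) N_BLOCKS * A_MAX)
        {
            refcount_sum = 0;
            for (int i = 0; i < N_BLOCKS; i++)
            {
                /* C -> ceil(C / 2): 1 stays 1, 2 -> 1, 3 -> 2, ... */
                refcount[i] = (refcount[i] + 1) / 2;
                refcount_sum += refcount[i];
            }
        }
    }

Note how this gets the same kind of back-pressure as your clock hand
variant: the hotter the cache is overall, the more often every count
gets cut down, so nothing stays pegged at the maximum unless it keeps
earning it.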
> That's a really appealing property. If there's a "hot" set of
> buffers that accounts for nearly all accesses, the algorithm will be
> willing to regard all buffers in that set as frequently-used
> regardless of whether it is a large or a small percentage of
> shared_buffers, and they will never get evicted. But the buffers
> must really be hot enough to justify the space they're taking up.
> When there are only a few hot buffers, it's sufficient for them to
> be a little hotter than the typical buffer. But when there are a lot
> of hot buffers, they have to be *really* hot. Intuitively, that
> makes sense, because making the set of "hot" -- and thus unevictable
> -- buffers large means that other buffers are going to get evicted
> very quickly. That's only justifiable if the hot buffers are very
> significantly more useful than the non-hot buffers.

Right. How hot "really hot" actually is needs to be based on the
behavior of the system as a whole. If the behavior of the system as a
whole cannot be characterized any better than it is today, then no
improvement is possible; shared_buffers services the system as a whole.

I've always had a problem with the conventional 8GB/16GB upper limit
on the size of shared_buffers. Not because it's wrong, but because it
has never really been explained. I strongly suspect that it has
something to do with usage_count saturation, since it isn't
reproducible with any synthetic workload that I'm aware of -- quite
possibly because there are few bursty benchmarks.

>> All of that having been said, maybe we have an independent low-level
>> problem: we increase usage_count when we pin a buffer, even if we
>> last pinned the buffer 5ms ago. IIRC a change to this went in when
>> ARC went in (and came out with ARC, too). This is more of a problem
>> with miscounting the number of hits based on accidents around how
>> things are structured in the executor -- we need to be more careful
>> about recognizing whether or not block hits are logically distinct.
>> Particularly with stuff like nested loop joins.
>
> I agree that double-counting correlated accesses is a problem, and I
> agree that we probably want to do something about it. I am skeptical
> that using wall-clock time is the right way to solve that problem
> because it leads to edge effects.

I agree that wall-clock time is a bad approach, actually. If we were
to use wall-clock time, it would only be as an approximate, indirect
proxy for the thing we actually care about. If there is a more direct
way of identifying correlated accesses, which I gather that there is,
then we should probably use it.
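For what it's worth, a more direct scheme might look something like
the sketch below. Everything here is hypothetical -- the epoch
counter, the struct, and the function don't exist anywhere;
BM_MAX_USAGE_COUNT (5) is the real constant, but the rest is just the
shape of the idea:

    #include <stdint.h>

    /*
     * Hypothetical: tag each buffer with the "access epoch" that last
     * touched it, where an epoch boundary is some higher-level event
     * that separates logically distinct hits (per executor node
     * rescan, per clock sweep, etc. -- not wall-clock time).
     */
    typedef struct BufferSketch
    {
        uint32_t    usage_count;
        uint64_t    last_epoch;
    } BufferSketch;

    static uint64_t current_epoch;  /* advanced at each epoch boundary */

    static void
    record_buffer_hit(BufferSketch *buf)
    {
        /*
         * Repeated pins within one epoch -- say, the inner side of a
         * nested loop hitting the same page over and over -- count as
         * a single logical hit.
         */
        if (buf->last_epoch != current_epoch)
        {
            if (buf->usage_count < 5)   /* i.e. BM_MAX_USAGE_COUNT */
                buf->usage_count++;
            buf->last_epoch = current_epoch;
        }
    }

The hard part, obviously, is choosing what advances the epoch; a
per-query or per-rescan counter would catch the nested loop case
without the edge effects that a wall-clock threshold has.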
--
Peter Geoghegan