Thread: Re: Re: RC2 and open issues
Tom Lane <tgl@sss.pgh.pa.us> wrote on 21.12.2004, 05:05:36:
> Bruce Momjian writes:
> > I am confused. If we change the percentage to be X% of the entire
> > buffer cache, and we set it to 1%, and we exit when either the dirty
> > pages or % are reached, don't we end up just scanning the first 1% of
> > the cache over and over again?
>
> Exactly. But 1% would be uselessly small with this definition. Offhand
> I'd think something like 50% might be a starting point; maybe even more.
> What that says is that a page isn't a candidate to be written out by the
> bgwriter until it's fallen halfway down the LRU list.

I see the buffer list as a conveyor belt that carries unneeded blocks
away from the MRU. Cleaning near the LRU (I agree: how near?) should be
sufficient to keep the list clean.

Cleaning the first 1% "over and over again" makes it sound like it is
the same list of blocks that are being cleaned. It may be the same
linked list data structure, but it is dynamically changing to contain
completely different blocks from the last time you looked.

Best Regards, Simon Riggs
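To make the scan under discussion concrete, here is a minimal sketch of a
cleaning pass that starts at the LRU end and stops when either a dirty-page
budget or a percentage of the whole buffer list has been examined. It is a
toy model only: ToyBuffer, ToyBufferList, toy_bgwriter_round and both
parameters are invented names, not the actual ARC or bgwriter structures.

    #include <stdbool.h>
    #include <stddef.h>

    typedef struct ToyBuffer
    {
        struct ToyBuffer *next_toward_mru;   /* neighbour nearer the MRU end */
        bool              dirty;
        /* page contents, buffer tag, pin counts, etc. omitted */
    } ToyBuffer;

    typedef struct
    {
        ToyBuffer *lru;        /* least recently used buffer */
        int        nbuffers;   /* total buffers in the list */
    } ToyBufferList;

    static void
    toy_flush_buffer(ToyBuffer *buf)
    {
        buf->dirty = false;    /* stand-in for an actual write to disk */
    }

    /*
     * One cleaning round: walk from the LRU end toward the MRU and stop as
     * soon as either limit is reached.  With scan_percent = 100 this means
     * "consider the whole list"; with a small value it only cleans near
     * the LRU end.
     */
    void
    toy_bgwriter_round(ToyBufferList *list, int scan_percent, int max_dirty_writes)
    {
        int        max_scan = (list->nbuffers * scan_percent) / 100;
        int        scanned = 0;
        int        written = 0;
        ToyBuffer *buf = list->lru;

        while (buf != NULL && scanned < max_scan && written < max_dirty_writes)
        {
            if (buf->dirty)
            {
                toy_flush_buffer(buf);
                written++;
            }
            scanned++;
            buf = buf->next_toward_mru;
        }
    }

The two limits capture the trade-off in the exchange above: a small
scan_percent examines only the blocks currently nearest the LRU, and whether
that is enough depends on how quickly the list carries new blocks toward
that end.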
simon@2ndquadrant.com wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote on 21.12.2004, 05:05:36:
>
>> Bruce Momjian writes:
>>
>>> I am confused. If we change the percentage to be X% of the entire
>>> buffer cache, and we set it to 1%, and we exit when either the dirty
>>> pages or % are reached, don't we end up just scanning the first 1% of
>>> the cache over and over again?
>>
>> Exactly. But 1% would be uselessly small with this definition. Offhand
>> I'd think something like 50% might be a starting point; maybe even more.
>> What that says is that a page isn't a candidate to be written out by the
>> bgwriter until it's fallen halfway down the LRU list.
>
> I see the buffer list as a conveyor belt that carries unneeded blocks
> away from the MRU. Cleaning near the LRU (I agree: how near?) should be
> sufficient to keep the list clean.
>
> Cleaning the first 1% "over and over again" makes it sound like it is
> the same list of blocks that are being cleaned. It may be the same
> linked list data structure, but it is dynamically changing to contain
> completely different blocks from the last time you looked.

However, one thing you can say is that if block B hasn't been written to
since you last checked, then any blocks older than that haven't been
written to either. Of course, the problem is in finding block B again
without re-scanning from the LRU end.

Is there any non-intrusive way we could add a "bookmark" into the
conveyor belt? (mixing my metaphors again :-) Any blocks written to
would move up the cache, effectively moving the bookmark lower. Enough
activity would cause the bookmark to drop off the end. If it hasn't
dropped off, though, we know we can safely skip any blocks older than
the bookmark.

--
  Richard Huxton
  Archonet Ltd
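Richard's bookmark could look roughly like the following toy sketch. The
structure and function names are invented, and the hard part he is asking
about -- keeping the bookmark valid non-intrusively as buffers move or get
recycled -- is simply assumed to be handled elsewhere via an on_list flag.

    #include <stdbool.h>
    #include <stddef.h>

    typedef struct ToyBuffer
    {
        struct ToyBuffer *next_toward_mru;
        bool              dirty;
        bool              on_list;   /* assumed to be cleared elsewhere when
                                      * the buffer is recycled or promoted */
    } ToyBuffer;

    typedef struct
    {
        ToyBuffer *lru;
    } ToyBufferList;

    static ToyBuffer *toy_bookmark = NULL;   /* where the last pass stopped */

    static void
    toy_flush_buffer(ToyBuffer *buf)
    {
        buf->dirty = false;                  /* stand-in for a real write */
    }

    /*
     * Clean up to max_writes dirty buffers.  If the bookmark is still on the
     * list, everything older than it was clean last time and is skipped;
     * otherwise (it "dropped off the end") start over from the LRU.
     */
    void
    toy_bgwriter_round_with_bookmark(ToyBufferList *list, int max_writes)
    {
        ToyBuffer *buf;
        ToyBuffer *start;
        int        written = 0;

        if (toy_bookmark != NULL && toy_bookmark->on_list)
            start = toy_bookmark;
        else
            start = list->lru;

        for (buf = start; buf != NULL && written < max_writes;
             buf = buf->next_toward_mru)
        {
            if (buf->dirty)
            {
                toy_flush_buffer(buf);
                written++;
            }
            toy_bookmark = buf;              /* remember how far we got */
        }
    }

If the bookmark has dropped off the end, the pass simply falls back to
starting at the LRU, so a stale bookmark costs at most one full scan.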
Richard Huxton <dev@archonet.com> writes:
> However, one thing you can say is that if block B hasn't been written to
> since you last checked, then any blocks older than that haven't been
> written to either.

[ itch... ] Can you? I don't recall exactly when a block gets pushed
up the ARC list during a ReadBuffer/WriteBuffer cycle, but at the very
least I'd have to say that this assumption is vulnerable to race
conditions.

Also, the cntxDirty mechanism allows a block to be dirtied without
changing the ARC state at all. I am not very clear on whether Vadim
added that mechanism just for performance or because there were
fundamental deadlock issues without it; but in either case we'd have
to think long and hard about taking it out for the bgwriter's benefit.

			regards, tom lane
On Tue, Dec 21, 2004 at 10:26:48AM -0500, Tom Lane wrote:
> Richard Huxton <dev@archonet.com> writes:
> > However, one thing you can say is that if block B hasn't been written to
> > since you last checked, then any blocks older than that haven't been
> > written to either.
>
> [ itch... ] Can you? I don't recall exactly when a block gets pushed
> up the ARC list during a ReadBuffer/WriteBuffer cycle, but at the very
> least I'd have to say that this assumption is vulnerable to race
> conditions.
>
> Also, the cntxDirty mechanism allows a block to be dirtied without
> changing the ARC state at all. I am not very clear on whether Vadim
> added that mechanism just for performance or because there were
> fundamental deadlock issues without it; but in either case we'd have
> to think long and hard about taking it out for the bgwriter's benefit.

OTOH, ISTM that it's OK if the bgwriter occasionally misses blocks.
Those blocks would either have to be written out by a backend or by the
checkpointer (not so great), or the bgwriter could occasionally ignore
its bookmark and restart its scan from the LRU end. Of course I'm
assuming that any race conditions could be made to affect only the
bgwriter and nothing else, which may be a bad assumption.

--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
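Jim's "occasionally ignore the bookmark and rescan" fallback would be a small
addition to the toy bookmark sketch above; the names and the rescan interval
here are again invented purely for illustration.

    /* Continues the toy bookmark sketch above; struct ToyBuffer is only
     * needed as an incomplete type here. */
    #include <stddef.h>

    struct ToyBuffer;

    #define TOY_FULL_RESCAN_EVERY 16   /* arbitrary, invented interval */

    static int toy_rounds_since_full_rescan = 0;

    /*
     * Called once per cleaning round: every TOY_FULL_RESCAN_EVERY rounds,
     * forget the bookmark so the next pass starts at the LRU end again.
     * A block missed because of a race with the bookmark assumption then
     * still gets picked up within a bounded number of rounds (or, failing
     * that, by a backend or the checkpointer).
     */
    void
    toy_maybe_drop_bookmark(struct ToyBuffer **bookmark)
    {
        if (++toy_rounds_since_full_rescan >= TOY_FULL_RESCAN_EVERY)
        {
            *bookmark = NULL;
            toy_rounds_since_full_rescan = 0;
        }
    }

This keeps the cost of a stale or race-affected bookmark bounded without
having to make the bookmark itself race-free.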
On Tue, 2004-12-21 at 15:26, Tom Lane wrote:
> Richard Huxton <dev@archonet.com> writes:
> > However, one thing you can say is that if block B hasn't been written to
> > since you last checked, then any blocks older than that haven't been
> > written to either.
>
> [ itch... ] Can you? I don't recall exactly when a block gets pushed
> up the ARC list during a ReadBuffer/WriteBuffer cycle, but at the very
> least I'd have to say that this assumption is vulnerable to race
> conditions.

An intriguing idea, but after some thought it would only hold if all
block accesses were writes. A block can be re-read (but not written),
which moves it to the MRU of T2 and thus ahead of other dirty buffers.

Forgive me: the conveyor belt analogy only applies to blocks that
haven't been touched *at all* since they went onto the list, i.e. blocks
hit only once (on T1) or twice (T2) just move down towards the LRU and
roll off when they get there.

--
Best Regards, Simon Riggs