Thread: Re: Re: RC2 and open issues
Tom Lane <tgl@sss.pgh.pa.us> wrote on 21.12.2004, 05:05:36:
> Bruce Momjian writes:
> > I am confused. If we change the percentage to be X% of the entire
> > buffer cache, and we set it to 1%, and we exit when either the dirty
> > pages or % are reached, don't we end up just scanning the first 1% of
> > the cache over and over again?
>
> Exactly. But 1% would be uselessly small with this definition. Offhand
> I'd think something like 50% might be a starting point; maybe even more.
> What that says is that a page isn't a candidate to be written out by the
> bgwriter until it's fallen halfway down the LRU list.

I see the buffer list as a conveyor belt that carries unneeded blocks
away from the MRU. Cleaning near the LRU (I agree: how near?) should be
sufficient to keep the list clean.

Cleaning the first 1% "over and over again" makes it sound like it is
the same list of blocks that are being cleaned. It may be the same
linked list data structure, but it is dynamically changing to contain
completely different blocks from the last time you looked.

Best Regards, Simon Riggs
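To make the scan under discussion concrete, here is a minimal sketch of a
cleaning pass that starts at the LRU end and stops when either a dirty-page
budget or a percentage of the whole buffer list has been examined. It is a
toy model only: ToyBuffer, ToyBufferList, toy_bgwriter_round and both
parameters are invented names, not the actual ARC or bgwriter structures.

    #include <stdbool.h>
    #include <stddef.h>

    typedef struct ToyBuffer
    {
        struct ToyBuffer *next_toward_mru;   /* neighbour nearer the MRU end */
        bool              dirty;
        /* page contents, buffer tag, pin counts, etc. omitted */
    } ToyBuffer;

    typedef struct
    {
        ToyBuffer *lru;        /* least recently used buffer */
        int        nbuffers;   /* total buffers in the list */
    } ToyBufferList;

    static void
    toy_flush_buffer(ToyBuffer *buf)
    {
        buf->dirty = false;    /* stand-in for an actual write to disk */
    }

    /*
     * One cleaning round: walk from the LRU end toward the MRU and stop as
     * soon as either limit is reached.  With scan_percent = 100 this means
     * "consider the whole list"; with a small value it only cleans near
     * the LRU end.
     */
    void
    toy_bgwriter_round(ToyBufferList *list, int scan_percent, int max_dirty_writes)
    {
        int        max_scan = (list->nbuffers * scan_percent) / 100;
        int        scanned = 0;
        int        written = 0;
        ToyBuffer *buf = list->lru;

        while (buf != NULL && scanned < max_scan && written < max_dirty_writes)
        {
            if (buf->dirty)
            {
                toy_flush_buffer(buf);
                written++;
            }
            scanned++;
            buf = buf->next_toward_mru;
        }
    }

The two limits capture the trade-off in the exchange above: a small
scan_percent examines only the blocks currently nearest the LRU, and whether
that is enough depends on how quickly the list carries new blocks toward
that end.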
simon@2ndquadrant.com wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote on 21.12.2004, 05:05:36:
>
>> Bruce Momjian writes:
>>
>>> I am confused. If we change the percentage to be X% of the entire
>>> buffer cache, and we set it to 1%, and we exit when either the dirty
>>> pages or % are reached, don't we end up just scanning the first 1% of
>>> the cache over and over again?
>>
>> Exactly. But 1% would be uselessly small with this definition. Offhand
>> I'd think something like 50% might be a starting point; maybe even more.
>> What that says is that a page isn't a candidate to be written out by the
>> bgwriter until it's fallen halfway down the LRU list.
>
> I see the buffer list as a conveyor belt that carries unneeded blocks
> away from the MRU. Cleaning near the LRU (I agree: how near?) should be
> sufficient to keep the list clean.
>
> Cleaning the first 1% "over and over again" makes it sound like it is
> the same list of blocks that are being cleaned. It may be the same
> linked list data structure, but it is dynamically changing to contain
> completely different blocks from the last time you looked.

However, one thing you can say is that if block B hasn't been written to
since you last checked, then any blocks older than that haven't been
written to either. Of course, the problem is in finding block B again
without re-scanning from the LRU end.

Is there any non-intrusive way we could add a "bookmark" into the
conveyor belt? (mixing my metaphors again :-) Any blocks written to
would move up the cache, effectively moving the bookmark lower. Enough
activity would cause the bookmark to drop off the end. If it hasn't
dropped off, though, we know we can safely skip any blocks older than
the bookmark.

--
  Richard Huxton
  Archonet Ltd
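Richard's bookmark could look roughly like the following toy sketch. The
structure and function names are invented, and the hard part he is asking
about -- keeping the bookmark valid non-intrusively as buffers move or get
recycled -- is simply assumed to be handled elsewhere via an on_list flag.

    #include <stdbool.h>
    #include <stddef.h>

    typedef struct ToyBuffer
    {
        struct ToyBuffer *next_toward_mru;
        bool              dirty;
        bool              on_list;   /* assumed to be cleared elsewhere when
                                      * the buffer is recycled or promoted */
    } ToyBuffer;

    typedef struct
    {
        ToyBuffer *lru;
    } ToyBufferList;

    static ToyBuffer *toy_bookmark = NULL;   /* where the last pass stopped */

    static void
    toy_flush_buffer(ToyBuffer *buf)
    {
        buf->dirty = false;                  /* stand-in for a real write */
    }

    /*
     * Clean up to max_writes dirty buffers.  If the bookmark is still on the
     * list, everything older than it was clean last time and is skipped;
     * otherwise (it "dropped off the end") start over from the LRU.
     */
    void
    toy_bgwriter_round_with_bookmark(ToyBufferList *list, int max_writes)
    {
        ToyBuffer *buf;
        ToyBuffer *start;
        int        written = 0;

        if (toy_bookmark != NULL && toy_bookmark->on_list)
            start = toy_bookmark;
        else
            start = list->lru;

        for (buf = start; buf != NULL && written < max_writes;
             buf = buf->next_toward_mru)
        {
            if (buf->dirty)
            {
                toy_flush_buffer(buf);
                written++;
            }
            toy_bookmark = buf;              /* remember how far we got */
        }
    }

If the bookmark has dropped off the end, the pass simply falls back to
starting at the LRU, so a stale bookmark costs at most one full scan.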
Richard Huxton <dev@archonet.com> writes:
> However, one thing you can say is that if block B hasn't been written to
> since you last checked, then any blocks older than that haven't been
> written to either.

[ itch... ] Can you? I don't recall exactly when a block gets pushed
up the ARC list during a ReadBuffer/WriteBuffer cycle, but at the very
least I'd have to say that this assumption is vulnerable to race
conditions.

Also, the cntxDirty mechanism allows a block to be dirtied without
changing the ARC state at all. I am not very clear on whether Vadim
added that mechanism just for performance or because there were
fundamental deadlock issues without it; but in either case we'd have
to think long and hard about taking it out for the bgwriter's benefit.

			regards, tom lane
On Tue, Dec 21, 2004 at 10:26:48AM -0500, Tom Lane wrote:
> Richard Huxton <dev@archonet.com> writes:
> > However, one thing you can say is that if block B hasn't been written to
> > since you last checked, then any blocks older than that haven't been
> > written to either.
>
> [ itch... ] Can you? I don't recall exactly when a block gets pushed
> up the ARC list during a ReadBuffer/WriteBuffer cycle, but at the very
> least I'd have to say that this assumption is vulnerable to race
> conditions.
>
> Also, the cntxDirty mechanism allows a block to be dirtied without
> changing the ARC state at all. I am not very clear on whether Vadim
> added that mechanism just for performance or because there were
> fundamental deadlock issues without it; but in either case we'd have
> to think long and hard about taking it out for the bgwriter's benefit.

OTOH, ISTM that it's OK if the bgwriter occasionally misses blocks.
Those blocks would either have to be written out by a backend or by the
checkpointer (not so great), or the bgwriter could occasionally ignore
its bookmark and restart its scan from the LRU end. Of course I'm
assuming that any race conditions could be made to affect only the
bgwriter and nothing else, which may be a bad assumption.

--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
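Jim's "occasionally ignore the bookmark and rescan" fallback would be a small
addition to the toy bookmark sketch above; the names and the rescan interval
here are again invented purely for illustration.

    /* Continues the toy bookmark sketch above; struct ToyBuffer is only
     * needed as an incomplete type here. */
    #include <stddef.h>

    struct ToyBuffer;

    #define TOY_FULL_RESCAN_EVERY 16   /* arbitrary, invented interval */

    static int toy_rounds_since_full_rescan = 0;

    /*
     * Called once per cleaning round: every TOY_FULL_RESCAN_EVERY rounds,
     * forget the bookmark so the next pass starts at the LRU end again.
     * A block missed because of a race with the bookmark assumption then
     * still gets picked up within a bounded number of rounds (or, failing
     * that, by a backend or the checkpointer).
     */
    void
    toy_maybe_drop_bookmark(struct ToyBuffer **bookmark)
    {
        if (++toy_rounds_since_full_rescan >= TOY_FULL_RESCAN_EVERY)
        {
            *bookmark = NULL;
            toy_rounds_since_full_rescan = 0;
        }
    }

This keeps the cost of a stale or race-affected bookmark bounded without
having to make the bookmark itself race-free.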
On Tue, 2004-12-21 at 15:26, Tom Lane wrote:
> Richard Huxton <dev@archonet.com> writes:
> > However, one thing you can say is that if block B hasn't been written to
> > since you last checked, then any blocks older than that haven't been
> > written to either.
>
> [ itch... ] Can you? I don't recall exactly when a block gets pushed
> up the ARC list during a ReadBuffer/WriteBuffer cycle, but at the very
> least I'd have to say that this assumption is vulnerable to race
> conditions.

An intriguing idea, but after some thought it would only hold if all
block accesses were writes. A block can be re-read (but not written),
which moves it to the MRU of T2 and thus ahead of other dirty buffers.

Forgive me: the conveyor belt analogy only applies to blocks that
haven't been touched *at all* since they went onto the list, i.e. blocks
hit only once (on T1) or twice (T2) just move down towards the LRU and
roll off when they get there.

--
Best Regards, Simon Riggs