Re: Experimental patch for inter-page delay in VACUUM - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: Experimental patch for inter-page delay in VACUUM
Date
Msg-id 3FAFA920.1050207@Yahoo.com
Whole thread Raw
In response to Re: Experimental patch for inter-page delay in VACUUM  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Experimental patch for inter-page delay in VACUUM  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: Experimental patch for inter-page delay in VACUUM  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
Bruce Momjian wrote:

> Jan Wieck wrote:
>> Bruce Momjian wrote:
>> 
>> > Now, O_SYNC is going to force every write to the disk.  If we have a
>> > transaction that has to write lots of buffers (has to write them to
>> > reuse the shared buffer)
>> 
>> So make the background writer/checkpointer keeping the LRU head clean. I 
>> explained that 3 times now.
> 
> If the background cleaner has to not just write() but write/fsync or
> write/O_SYNC, it isn't going to be able to clean them fast enough.  It
> creates a bottleneck where we didn't have one before.
> 
> We are trying to eliminate an I/O storm during checkpoint, but the
> solutions seem to be making the non-checkpoint times slower.
> 

It looks as if you're assuming that I am making the backends unable to 
write on their own, so that they have to wait on the checkpointer. I 
never said that.

If the checkpointer keeps the LRU heads clean, that lifts off write load 
from the backends. Sure, they will be able to dirty pages faster. 
Theoretically, because in practice if you have a reasonably good cache 
hitrate, they will just find already dirty buffers where they just add 
some more dust.

If after all the checkpointer (doing write()+whateversync) is not able 
to keep up with the speed of buffers getting dirtied, the backends will 
have to do some write()'s again, because they will eat up the clean 
buffers at the LRU head and pass the checkpointer.

Also please notice another little change in behaviour. The old code just 
went through the buffer cache sequentially, possibly flushing buffers 
that got dirtied after the checkpoint started, which is way ahead of 
time (they need to be flushed for the next checkpoint, not now). That 
means, that if the same buffer gets dirtied again after that, we wasted 
a full disk write on it. My new code creates a list of dirty blocks at 
the beginning of the checkpoint, and flushes only those that are still 
dirty at the time it gets to them.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



pgsql-hackers by date:

Previous
From: "Marc G. Fournier"
Date:
Subject: RC2 tag'd and bundled ...
Next
From: Jan Wieck
Date:
Subject: Re: Experimental patch for inter-page delay in VACUUM