Re: Experimental ARC implementation - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: Experimental ARC implementation
Msg-id 3FA956A1.3030308@Yahoo.com
In response to Re: Experimental ARC implementation  ("Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at>)
Responses Re: Experimental ARC implementation
List pgsql-hackers
Zeugswetter Andreas SB SD wrote:
>> > Why not use the checkpointer itself in between checkpoints?
>> > Use a min and a max dirty setting like Informix. Start writing
>> > when more than max are dirty, stop when at min. This avoids writing
>> > single pages (which is slow, since they cannot be grouped together
>> > by the OS).
>> 
>> Current approach is similar ... if I stretch the IO and syncing over the
>> entire 150-300 second checkpoint interval, grouping in 50 pages then
>> sync()+nap, the system purrs along nicely and without any peaks.
> 
> But how do you handle a write IO bound system then? My thought was to
> let the checkpointer write dirty pages in between checkpoints with a min
> and max, but still try to do the checkpoint as fast as possible. I don't
> think stretching the checkpoint is good, since it needs to write hot pages,
> which the in-between IO should avoid doing. The checkpointer would have
> two tasks that it handles alternately: checkpoint, or flush LRU from max
> to min.
> 
> Andreas

By actually moving a lot of the IO work into the checkpointer. It asks
the buffer strategy for the order in which dirty blocks would currently
get evicted from the cache, and now flushes them in that order. Your
"hot pages" will be found at the end of that list and are thus flushed
last in the checkpoint, which is good because it keeps them dirty
longer.
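
A minimal sketch of that ordering in Python, to illustrate the idea and
not the actual patch. The Buffer class, the function name and the sample
list are all hypothetical; the only thing taken from the description
above is that the list runs from "evicted next" to "evicted last" and
that only dirty blocks get flushed.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    """Hypothetical stand-in for a buffer cache entry."""
    block_id: int
    dirty: bool


def checkpoint_flush_order(eviction_order):
    """Keep only dirty buffers, preserving the strategy's eviction order.

    eviction_order is assumed to run from coldest (evicted next) to
    hottest (evicted last), so hot dirty pages end up last in the result.
    """
    return [buf for buf in eviction_order if buf.dirty]


# Assumed sample data: block 1 is the hottest page, block 7 the coldest.
eviction_order = [Buffer(7, True), Buffer(3, False), Buffer(9, True), Buffer(1, True)]
flush_list = checkpoint_flush_order(eviction_order)
```

With this sample list, the clean block 3 is skipped and the hot block 1
is written last.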

The problem with the checkpointer flushing as fast as possible is that
the entire system literally freezes. In my tests I use something that
resembles the transaction profile of TPC-C, including the thinking and
keying times. Those are important because they make the load realistic.
A stock 7.4.RC1 handles a properly scaled DB with new_order response
times of 0.2 to 1.5 seconds, but when the checkpoint occurs, it can't
keep up and the response times go up to anywhere between 20 and 60
seconds. What makes the situation worse is that in the meantime, all
simulated terminals hit the "send" button again, which leads to a
transaction pileup right during the checkpoint. It takes a while until
the system recovers from that.

If the system is write-bound, the checkpointer will find so many dirty
blocks that it has no time to nap and will burst them out as fast as
possible anyway. Well, at least that's the theory.
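
The pacing could look roughly like the following Python sketch. It is an
illustration under assumptions, not the real code: paced_checkpoint,
write_block, sync and the injectable sleep/clock parameters are made-up
names, and the schedule (finish each fraction of the blocks by the same
fraction of the interval) is just one simple way to spread the work.
Only the batch size of 50 and the write, sync, nap cycle come from the
thread.

```python
import time


def paced_checkpoint(dirty_blocks, interval_s, write_block, sync,
                     batch_size=50, sleep=time.sleep, clock=time.monotonic):
    """Write dirty_blocks in batches, spreading the work over interval_s.

    After each batch the function syncs, then naps only if it is ahead of
    a linear schedule. If writing is slower than the schedule (a
    write-bound system), the nap time is never positive and the batches
    go out back to back.
    """
    start = clock()
    total = len(dirty_blocks)
    written = 0
    while written < total:
        batch = dirty_blocks[written:written + batch_size]
        for block in batch:
            write_block(block)
        sync()
        written += len(batch)
        # Point in time by which this share of the work should be done.
        due = start + interval_s * written / total
        nap = due - clock()
        if nap > 0:
            sleep(nap)  # ahead of schedule: take a nap
    return written
```

With 120 dirty blocks and a 30 second interval, this writes batches of
50, 50 and 20 blocks and naps after each one. If every write takes long
enough that the schedule is already missed, it never sleeps at all.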

PostgreSQL with the non-overwriting storage concept can never have 
hot-written pages for a long time anyway, can it? They fill up and cool 
down until vacuum.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


