Re: Experimental ARC implementation - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: Experimental ARC implementation
Msg-id 3FABBE2E.5060902@Yahoo.com
In response to Re: Experimental ARC implementation  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Experimental ARC implementation  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: Experimental ARC implementation  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Bruce Momjian wrote:

> Jan Wieck wrote:
>> If the system is write-bound, the checkpointer will find that many dirty 
>> blocks that he has no time to nap and will burst them out as fast as 
>> possible anyway. Well, at least that's the theory.
>> 
>> PostgreSQL with the non-overwriting storage concept can never have 
>> hot-written pages for a long time anyway, can it? They fill up and cool 
>> down until vacuum.
> 
> Another idea on removing sync() --- if we are going to use fsync() on
> each file during checkpoint (open, fsync, close), seems we could keep a
> hash of written block dbid/relfilenode pairs and cycle through that on
> checkpoint.  We could keep the hash in shared memory, and dump it to a
> backing store when it gets full, or just have it exist in buffer writer
> process memory (so it can grow) and have backends that do their own
> buffer writes all open a single file in append mode and write the pairs
> to the file, or something like that, and the checkpoint process can read
> from there.
> 

I am not really aiming at removing sync() altogether. We already know 
that open,fsync,close does not guarantee that you flush dirty OS buffers 
for which another process has so far only done open,write. And you 
really don't want to remove all the vfd logic, or to fsync on every 
write done by a backend.
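
For illustration only, the quoted proposal (remember which dbid/relfilenode pairs were written, then open, fsync, close each file at checkpoint) could be sketched roughly as below. This is not PostgreSQL code: `PendingSyncSet`, `note_write`, `checkpoint` and the `path_for` callback are hypothetical names invented for the sketch, and it keeps the set in one process's memory. As noted above, fsync on a freshly opened descriptor is not guaranteed to flush blocks that another process wrote through its own descriptor, so this shows the bookkeeping, not a durability guarantee.

```python
import os


class PendingSyncSet:
    """Hypothetical record of (dbid, relfilenode) pairs written since the
    last checkpoint. Not a PostgreSQL structure; names are assumptions."""

    def __init__(self, path_for):
        # path_for(dbid, relfilenode) -> file path; a hypothetical callback.
        self._path_for = path_for
        self._pairs = set()

    def note_write(self, dbid, relfilenode):
        # A set collapses repeated writes to the same file into one entry.
        self._pairs.add((dbid, relfilenode))

    def checkpoint(self):
        """open, fsync, close every recorded file; return the pairs synced."""
        synced = []
        for pair in sorted(self._pairs):
            fd = os.open(self._path_for(*pair), os.O_RDWR)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
            synced.append(pair)
        self._pairs.clear()
        return synced
```

Each file is synced once per checkpoint no matter how many blocks were written to it, which is what makes a hash of pairs cheaper than tracking individual blocks.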

What frequent fdatasync/fsync calls during a constantly ongoing 
checkpoint will do is significantly reduce the physical write storm that 
happens at the sync(), which is causing huge problems right now.
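
A minimal sketch of that idea, under stated assumptions: blocks are written to one file and the file is synced after every few blocks, so the kernel never accumulates a large backlog of dirty pages for a final sync() to push out. The function name `write_with_periodic_sync` and the `sync_every` parameter are made up for this example, and `os.fsync` stands in for fdatasync purely for portability. It does not model the checkpointer's napping or multiple files.

```python
import os

BLOCK_SIZE = 8192  # PostgreSQL's default block size, used here for illustration


def write_with_periodic_sync(fd, blocks, sync_every=16):
    """Write blocks to fd, calling fsync after every `sync_every` blocks.

    Returns the number of fsync calls made. Assumes fd refers to a regular
    file, where os.write normally writes the whole buffer in one call.
    """
    syncs = 0
    unsynced = 0
    for block in blocks:
        os.write(fd, block)
        unsynced += 1
        if unsynced >= sync_every:
            os.fsync(fd)  # flush a small batch now instead of a storm later
            syncs += 1
            unsynced = 0
    if unsynced:
        os.fsync(fd)  # flush whatever is left at the end
        syncs += 1
    return syncs
```

The trade-off is more sync calls overall in exchange for smaller, more evenly spread bursts of physical I/O.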

The reason people blame vacuum so much is that it not only replaces the 
buffer cache with useless garbage, it also leaves that garbage to be 
flushed by other backends or the checkpointer, and it rapidly fills WAL, 
causing exactly the checkpoint we don't have the IO bandwidth for right 
now! They only see that vacuum is running, and that if they kill it the 
system returns to a healthy state after a while ... easy enough, but 
only half the story.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


