Re: Experimental ARC implementation - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Experimental ARC implementation
Date
Msg-id 200311071559.hA7FxQJ07696@candle.pha.pa.us
In response to Re: Experimental ARC implementation  (Jan Wieck <JanWieck@Yahoo.com>)
Responses Re: Experimental ARC implementation  (Jan Wieck <JanWieck@Yahoo.com>)
Re: Experimental ARC implementation  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
Jan Wieck wrote:
> Bruce Momjian wrote:
> 
> > Jan Wieck wrote:
> >> If the system is write-bound, the checkpointer will find so many dirty 
> >> blocks that it has no time to nap and will burst them out as fast as 
> >> possible anyway. Well, at least that's the theory.
> >> 
> >> PostgreSQL with the non-overwriting storage concept can never have 
> >> hot-written pages for a long time anyway, can it? They fill up and cool 
> >> down until vacuum.
> > 
> > Another idea on removing sync() --- if we are going to use fsync() on
> > each file during checkpoint (open, fsync, close), it seems we could keep
> > a hash of written-block dbid/relfilenode pairs and cycle through it at
> > checkpoint.  We could keep the hash in shared memory and dump it to a
> > backing store when it gets full, or have it exist only in buffer-writer
> > process memory (so it can grow) and have backends that do their own
> > buffer writes all open a single file in append mode and write the pairs
> > to it, or something like that, and the checkpoint process can read
> > from there.
> > 
> 
> I am not really aiming at removing sync() altogether. We know already 
> that open, fsync, close does not guarantee you flush dirty OS buffers for 
> which another process has so far only done open, write. And you 

We do know this?  How?  I thought someone cited the standard saying it
should work.  I need this for Win32 anyway.

> really don't want to remove all the vfd logic or fsync on every write 
> done by a backend.

No, certainly not --- that is a big loser.

> Doing frequent fdatasync/fsync calls during a constant, ongoing 
> checkpoint will significantly reduce the physical write storm that 
> happens at sync() time, which is causing huge problems right now.

Frankly, I don't see that, because sync() syncs everything on that
machine, including other file systems.  Reducing our own load via fsync
will not help with other applications writing to those drives.

> The reason people blame vacuum so much is that not only does it 
> replace the buffer cache with useless garbage, it also leaves that 
> garbage to be flushed by other backends or the checkpointer, and it 
> rapidly fills WAL, causing exactly the checkpoint we don't have the I/O 
> bandwidth for right now! They only see that vacuum is running, and if 
> they kill it the system returns to a healthy state after a while ... easy 
> enough, but only half the story.

Right, vacuum triggers WAL/checkpoint.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

