Re: Experimental ARC implementation - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: Experimental ARC implementation |
Date | |
Msg-id | 200311071559.hA7FxQJ07696@candle.pha.pa.us |
In response to | Re: Experimental ARC implementation (Jan Wieck <JanWieck@Yahoo.com>) |
Responses |
Re: Experimental ARC implementation
Re: Experimental ARC implementation |
List | pgsql-hackers |
Jan Wieck wrote:
> Bruce Momjian wrote:
>
> > Jan Wieck wrote:
> >> If the system is write-bound, the checkpointer will find that many dirty
> >> blocks that he has no time to nap and will burst them out as fast as
> >> possible anyway. Well, at least that's the theory.
> >>
> >> PostgreSQL with the non-overwriting storage concept can never have
> >> hot-written pages for a long time anyway, can it? They fill up and cool
> >> down until vacuum.
> >
> > Another idea on removing sync() --- if we are going to use fsync() on
> > each file during checkpoint (open, fsync, close), seems we could keep a
> > hash of written block dbid/relfilenode pairs and cycle through that on
> > checkpoint. We could keep the hash in shared memory, and dump it to a
> > backing store when it gets full, or just have it exist in buffer writer
> > process memory (so it can grow) and have backends that do their own
> > buffer writes all open a single file in append mode and write the pairs
> > to the file, or something like that, and the checkpoint process can read
> > from there.
> >
>
> I am not really aiming at removing sync() altogether. We know already
> that open,fsync,close does not guarantee you flush dirty OS-buffers for
> which another process might so far only have done open,write. And you

We do know this? How? I thought someone listed the standard saying it
should work. I need this for Win32 anyway.

> really don't want to remove all the vfd logic or fsync on every write
> done by a backend.

No, certainly not --- that is a big loser.

> What doing frequent fdatasync/fsync during a constant ongoing checkpoint
> will cause is to significantly lower the physical write storm happening
> at the sync(), which is causing huge problems right now.

I don't see that, frankly, because sync() is syncing everything on that
machine, including other file systems. Reducing our own load from sync
will not help with other applications writing to drives.

> The reason why people blame vacuum that much is that not only does it
> replace the buffer cache with useless garbage, it also leaves that
> garbage to be flushed by other backends or the checkpointer and it
> rapidly fills WAL, causing exactly that checkpoint we don't have the IO
> bandwidth for right now! They only see that vacuum is running, and if
> they kill it the system returns to a healthy state after a while ... easy
> enough but only half the story.

Right, vacuum triggers WAL/checkpoint.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
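
To make the "hash of written dbid/relfilenode pairs" idea from the message concrete, here is a minimal Python sketch. It is not PostgreSQL code: the class name `PendingSyncs`, the `<base_dir>/<dbid>/<relfilenode>` file layout, and the 8192-byte block size in the demo are assumptions made only for illustration. It covers the in-process variant (the set lives in the memory of a single writer) and leaves out the shared-memory and append-file variants. It also does not settle the question raised in the thread of whether open/fsync/close flushes data that another process has written.

```python
import os
import tempfile


class PendingSyncs:
    """Hypothetical tracker of (dbid, relfilenode) pairs written since the
    last checkpoint, so the checkpoint can fsync() each file once."""

    def __init__(self, base_dir):
        self.base_dir = base_dir
        self.pending = set()  # the "hash" of written pairs

    def path_for(self, dbid, relfilenode):
        # Assumed layout: <base_dir>/<dbid>/<relfilenode>
        return os.path.join(self.base_dir, str(dbid), str(relfilenode))

    def write_block(self, dbid, relfilenode, offset, data):
        """Write a block without syncing, and remember which file was touched."""
        path = self.path_for(dbid, relfilenode)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.lseek(fd, offset, os.SEEK_SET)
            os.write(fd, data)
        finally:
            os.close(fd)
        self.pending.add((dbid, relfilenode))

    def checkpoint(self):
        """open, fsync, close each file written since the last checkpoint."""
        synced = []
        for dbid, relfilenode in sorted(self.pending):
            fd = os.open(self.path_for(dbid, relfilenode), os.O_RDWR)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
            synced.append((dbid, relfilenode))
        self.pending.clear()
        return synced


if __name__ == "__main__":
    tracker = PendingSyncs(tempfile.mkdtemp())
    tracker.write_block(1, 16384, 0, b"x" * 8192)
    tracker.write_block(1, 16384, 8192, b"y" * 8192)  # same file, second block
    tracker.write_block(2, 16385, 0, b"z" * 8192)
    print(tracker.checkpoint())  # each touched file appears once
```

Because the set deduplicates, a file written many times between checkpoints is fsync()ed only once, and the cost at checkpoint is proportional to the number of distinct files touched rather than to everything dirty on the machine, which is the objection to sync() made in the reply.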