Re: Improvement of checkpoint IO scheduler for stable transaction responses - Mailing list pgsql-hackers

From didier
Subject Re: Improvement of checkpoint IO scheduler for stable transaction responses
Date
Msg-id CAJRYxuJkYme5xKXW5M280yWQijOAM1uv0UDkxVryP-PUDn_0Og@mail.gmail.com
In response to Re: Improvement of checkpoint IO scheduler for stable transaction responses  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers
Hi,

On Sat, Jul 20, 2013 at 6:28 PM, Greg Smith <greg@2ndquadrant.com> wrote:
On 7/20/13 4:48 AM, didier wrote:

That is the theory.  In practice write caches are so large now, there is almost no pressure forcing writes to happen until the fsync calls show up.  It's easily possible to enter the checkpoint fsync phase only to discover there are 4GB of dirty writes ahead of you, ones that have nothing to do with the checkpoint's I/O.

Isn't adding another layer of cache the usual answer?

The best place for it would be the OS: a filesystem with a big journal, able to write a lot of blocks sequentially.

If not, and if you can spare at worst 2 bits of memory per data block, don't mind preallocated data files (assuming the metadata is then stable), and have working mmap(MAP_NONBLOCK) and mincore() syscalls, you could have a checkpoint in bounded time: in the worst case you sequentially write the whole server RAM to a separate disk at every checkpoint.
Not sure I would trust such a beast with my data though :)
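
To make the mincore() part concrete, here is a minimal sketch of asking the kernel which pages of a data file are currently in the page cache. The helper name count_resident_pages is made up for illustration and is not PostgreSQL code. It assumes Linux (where the mincore() vector is unsigned char), it leaves out MAP_NONBLOCK and the per-block bitmap, and recent Linux kernels may only report real residency for files the caller owns or can write.

```c
#define _DEFAULT_SOURCE          /* exposes mincore() in glibc headers */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: count how many pages of the first `len` bytes of
 * the file behind `fd` are resident in the OS page cache.
 * Returns the number of resident pages, or -1 on error.
 */
long count_resident_pages(int fd, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && len > 0);

    /* mincore() fills one byte per page, rounding the length up. */
    size_t npages = (len + (size_t) page - 1) / (size_t) page;

    /* Map read-only; mapping by itself does not fault the pages in. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    unsigned char *vec = malloc(npages);
    if (vec == NULL) {
        munmap(map, len);
        return -1;
    }

    long resident = -1;
    if (mincore(map, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;   /* low bit set => page is in core */
    }

    free(vec);
    munmap(map, len);
    return resident;
}
```

A caller would open a data file, pass its descriptor and size, and compare the result with the file's page count to estimate how much of it the OS is still caching. The answer is only a snapshot: pages can be evicted or faulted in right after the call returns.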

 
Didier
