
Re: Improvement of checkpoint IO scheduler for stable transaction responses - Mailing list pgsql-hackers

From	didier
Subject	Re: Improvement of checkpoint IO scheduler for stable transaction responses
Date	July 22, 2013 07:21:36
Msg-id	CAJRYxuJkYme5xKXW5M280yWQijOAM1uv0UDkxVryP-PUDn_0Og@mail.gmail.com
In response to	Re: Improvement of checkpoint IO scheduler for stable transaction responses (Greg Smith <greg@2ndQuadrant.com>)
List	pgsql-hackers


Hi,

On Sat, Jul 20, 2013 at 6:28 PM, Greg Smith <greg@2ndquadrant.com> wrote:

On 7/20/13 4:48 AM, didier wrote:

That is the theory. In practice write caches are so large now, there is almost no pressure forcing writes to happen until the fsync calls show up. It's easily possible to enter the checkpoint fsync phase only to discover there are 4GB of dirty writes ahead of you, ones that have nothing to do with the checkpoint's I/O.

Isn't adding another layer of cache the usual answer?

The best place for it would be in the OS: a filesystem with a big journal, able to write a lot of blocks sequentially.

If not, and if you can spare at worst 2 bits of memory per data block, don't mind preallocated data files (assuming the metadata is stable then), and have working mmap(MAP_NONBLOCK) and mincore() syscalls, you could have a checkpoint in bounded time; in the worst case you sequentially write the whole server RAM to a separate disk at every checkpoint.
Not sure I would trust such a beast with my data though :)
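
To put a number on the "2 bits per data block" cost, here is a rough Python sketch of such a per-block map. It is only an illustration: the four state names are hypothetical (nothing above says what the two bits would encode), the 8 kB block size is PostgreSQL's default, and the mmap()/mincore() part is not modelled at all.

```python
BLOCK_SIZE = 8192  # PostgreSQL's default block size, in bytes

# Hypothetical 2-bit states; the message does not define what the bits mean.
CLEAN, DIRTY, IN_FLIGHT, SPOOLED = 0, 1, 2, 3


class BlockStateMap:
    """Packs one 2-bit state per data block, four blocks per byte."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray((n_blocks + 3) // 4)  # round up to whole bytes

    def get(self, block):
        shift = (block % 4) * 2
        return (self.bits[block // 4] >> shift) & 0b11

    def set(self, block, state):
        shift = (block % 4) * 2
        # Clear the two bits for this block, then store the new state.
        cleared = self.bits[block // 4] & ~(0b11 << shift) & 0xFF
        self.bits[block // 4] = cleared | (state << shift)


def map_bytes(data_bytes, block_size=BLOCK_SIZE):
    """Memory needed to track data_bytes of data files at 2 bits per block."""
    n_blocks = -(-data_bytes // block_size)  # ceiling division
    return (n_blocks + 3) // 4
```

With 8 kB blocks, map_bytes(2**40) gives 33554432, so tracking 1 TiB of data files would take 32 MiB of memory.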

Didier
