Re: checkpoint writeback via sync_file_range - Mailing list pgsql-hackers

From Greg Smith
Subject Re: checkpoint writeback via sync_file_range
Msg-id 4F0D98CE.4000607@2ndQuadrant.com
In response to Re: checkpoint writeback via sync_file_range  (Florian Weimer <fweimer@bfk.de>)
List pgsql-hackers
On 1/11/12 4:33 AM, Florian Weimer wrote:
> Isn't this pretty much like tuning vm.dirty_bytes?  We generally set it
> to pretty low values, and seems to help to smoothen the checkpoints.

When I experimented with dropping the actual size of the cache, 
checkpoint spikes improved, but things like VACUUM ran terribly slowly. 
On a typical medium to large server nowadays (let's say 16GB+), 
PostgreSQL needs to have gigabytes of write cache for good performance.

What we're aiming for here is to keep the benefits of having that much 
write cache, while allowing checkpoint-related work to send increasingly 
strong suggestions about the order of what it needs written soon.  There 
are basically three primary states on Linux to be concerned about here:

Dirty:  in the cache via standard write
|
v  pdflush does writeback at 5 or 10% dirty || sync_file_range push
|
Writeback
|
v  write happens in the background || fsync call
|
Stored on disk
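
One way to see how much data sits in the first two states is to read the 
Dirty and Writeback lines from /proc/meminfo.  Here's a minimal sketch of 
that, assuming a Linux host; the helper names and the sample numbers are 
made up for illustration:

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their numeric values.

    Most fields are sizes in kB; a few (such as HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_and_writeback_kb(path="/proc/meminfo"):
    """Read the current Dirty and Writeback totals (kB) on a Linux host."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return info.get("Dirty"), info.get("Writeback")


# Sample text in the /proc/meminfo layout; the numbers are invented.
sample = (
    "MemTotal:       16384000 kB\n"
    "Dirty:           2097152 kB\n"
    "Writeback:          4096 kB\n"
)
stats = parse_meminfo(sample)
print(stats["Dirty"], stats["Writeback"])
```

Sampling dirty_and_writeback_kb() every second or so during a checkpoint 
should show whether data is moving from Dirty to Writeback steadily or 
only at the final fsync.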

The systems with bad checkpoint problems will typically have gigabytes 
"Dirty", which is necessary for good performance.  The kernel is very lazy 
about pushing things toward "Writeback" though.  Getting the oldest portions 
of the outstanding writes into the Writeback queue more aggressively 
should make the eventual fsync less likely to block.
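
To make the idea concrete, here's a rough sketch of "push early, fsync 
later" for a single file.  This is not the patch under discussion.  It's 
Python calling the Linux sync_file_range() through ctypes, and the helper 
names, the payload size, and the choice to fall back quietly when the call 
is unavailable are all my assumptions for the example:

```python
import ctypes
import errno
import os
import tempfile

# Flag value from <linux/fs.h>: start writeback of dirty pages in the range
# without waiting for that writeback to complete.
SYNC_FILE_RANGE_WRITE = 2

_libc = ctypes.CDLL(None, use_errno=True)


def start_writeback(fd, offset=0, nbytes=0):
    """Hint that dirty pages of fd should move into Writeback (Linux only).

    nbytes=0 means "through the end of the file".  Returns False when
    sync_file_range is missing or unimplemented, so the caller can rely
    on fsync alone.
    """
    func = getattr(_libc, "sync_file_range", None)
    if func is None:
        return False
    func.argtypes = [ctypes.c_int, ctypes.c_int64, ctypes.c_int64, ctypes.c_uint]
    func.restype = ctypes.c_int
    if func(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE) != 0:
        err = ctypes.get_errno()
        if err == errno.ENOSYS:
            return False
        raise OSError(err, os.strerror(err))
    return True


def write_then_checkpoint(path, payload):
    """Write payload, hint writeback, then fsync; returns whether the hint ran."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)         # pages are now Dirty (small write assumed)
        hinted = start_writeback(fd)  # ask for Dirty -> Writeback
        # ... other checkpoint work would happen here ...
        os.fsync(fd)                  # Writeback -> stored on disk
    finally:
        os.close(fd)
    return hinted


with tempfile.TemporaryDirectory() as tmpdir:
    hinted = write_then_checkpoint(os.path.join(tmpdir, "segment"), b"x" * 8192)
    print("writeback hint issued:", hinted)
```

The fsync is still what provides durability; the earlier call only 
changes how much work is left for it to wait on.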


-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

