Re: checkpoint writeback via sync_file_range - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: checkpoint writeback via sync_file_range
Date	January 11, 2012 08:51:55
Msg-id	201201111351.38738.andres@anarazel.de
In response to	Re: checkpoint writeback via sync_file_range (Florian Weimer <fweimer@bfk.de>)
List	pgsql-hackers

On Wednesday, January 11, 2012 10:33:47 AM Florian Weimer wrote:
> * Greg Smith:
> > One idea I was thinking about here was building a little hash table
> > inside of the fsync absorb code, tracking how many absorb operations
> > have happened for whatever the most popular relation files are.  The
> > idea is that we might say "use sync_file_range every time <N> calls
> > for a relation have come in", just to keep from ever accumulating too
> > many writes to any one file before trying to nudge some of it out of
> > there. The bat that keeps hitting me in the head here is that right
> > now, a single fsync might have a full 1GB of writes to flush out,
> > perhaps because it extended a table and then wrote more than that to
> > it.  And in everything but a SSD or giant SAN cache situation, 1GB of
> > I/O is just too much to fsync at a time without the OS choking a
> > little on it.
> 
> Isn't this pretty much like tuning vm.dirty_bytes?  We generally set it
> to pretty low values, and it seems to help smooth out the checkpoints.
If done correctly (which would be much more invasive), you could issue
sync_file_range() calls only for the areas of the file that the checkpoint
actually needs to write out, and you could leave out, e.g., hint-bit-only
changes. That could help to reduce the cost of checkpoints.
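
A minimal sketch of that idea, assuming the checkpointer already has a list of
byte ranges per file and knows which ones carry only hint-bit changes. The
DirtyRange struct, its hint_bits_only flag, and writeback_checkpoint_ranges()
are hypothetical names for illustration, not existing PostgreSQL code; only
sync_file_range() and SYNC_FILE_RANGE_WRITE are the real Linux interface.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical description of one region of a relation file. */
typedef struct DirtyRange
{
    off_t offset;           /* start of the region in the file */
    off_t nbytes;           /* length of the region */
    int   hint_bits_only;   /* assumed flag: 1 if only hint bits changed */
} DirtyRange;

/*
 * Ask the kernel to start writeback for the ranges the checkpoint needs,
 * skipping ranges that carry only hint-bit changes.
 * Returns the number of ranges submitted, or -1 on the first error.
 */
static int
writeback_checkpoint_ranges(int fd, const DirtyRange *ranges, int nranges)
{
    int submitted = 0;

    assert(nranges >= 0);

    for (int i = 0; i < nranges; i++)
    {
        if (ranges[i].hint_bits_only)
            continue;       /* not required by the checkpoint */

        /* SYNC_FILE_RANGE_WRITE only initiates writeback; it does not wait. */
        if (sync_file_range(fd, ranges[i].offset, ranges[i].nbytes,
                            SYNC_FILE_RANGE_WRITE) != 0)
            return -1;
        submitted++;
    }
    return submitted;
}
```

Note that sync_file_range() does not write out file metadata and gives no
durability guarantee by itself, so a sketch like this would still need an
fsync() at the end of the checkpoint.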

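For comparison, the per-file counting that Greg describes in the quoted text
could look roughly like the following. This is only a sketch under stated
assumptions: the table size, the uint32_t file identifier, and the names
AbsorbTable, absorb_table_init() and absorb_note() are all made up here, and
the real fsync absorb code is not structured this way.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ABSORB_SLOTS 64     /* hypothetical fixed table size */

typedef struct AbsorbSlot
{
    uint32_t file_id;       /* hypothetical relation file identifier */
    uint32_t count;         /* absorbs seen since the last nudge */
    int      used;
} AbsorbSlot;

typedef struct AbsorbTable
{
    AbsorbSlot slots[ABSORB_SLOTS];
    uint32_t   threshold;   /* the <N> from the quoted proposal */
} AbsorbTable;

static void
absorb_table_init(AbsorbTable *t, uint32_t threshold)
{
    assert(threshold > 0);
    memset(t, 0, sizeof(*t));
    t->threshold = threshold;
}

/*
 * Record one absorbed fsync request for file_id. Returns 1 when the
 * caller should nudge writeback for that file (for example with
 * sync_file_range), 0 otherwise. Uses open addressing with linear probing.
 */
static int
absorb_note(AbsorbTable *t, uint32_t file_id)
{
    uint32_t start = file_id % ABSORB_SLOTS;

    for (uint32_t probe = 0; probe < ABSORB_SLOTS; probe++)
    {
        AbsorbSlot *s = &t->slots[(start + probe) % ABSORB_SLOTS];

        if (!s->used)
        {
            /* First time we see this file: claim the empty slot. */
            s->used = 1;
            s->file_id = file_id;
            s->count = 0;
        }
        if (s->file_id == file_id)
        {
            if (++s->count >= t->threshold)
            {
                s->count = 0;   /* start counting again after the nudge */
                return 1;
            }
            return 0;
        }
    }
    return 0;   /* table full: this file is simply not tracked */
}
```

With a threshold of 3, every third absorb for the same file returns 1, which
is the point where the caller would issue the sync_file_range call.
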
Andres