Re: checkpoint writeback via sync_file_range - Mailing list pgsql-hackers

From Florian Weimer
Subject Re: checkpoint writeback via sync_file_range
Msg-id 82mx9u4m84.fsf@mid.bfk.de
In response to Re: checkpoint writeback via sync_file_range  (Greg Smith <greg@2ndQuadrant.com>)
Responses Re: checkpoint writeback via sync_file_range  (Andres Freund <andres@anarazel.de>)
Re: checkpoint writeback via sync_file_range  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers
* Greg Smith:

> One idea I was thinking about here was building a little hash table
> inside of the fsync absorb code, tracking how many absorb operations
> have happened for whatever the most popular relation files are.  The
> idea is that we might say "use sync_file_range every time <N> calls
> for a relation have come in", just to keep from ever accumulating too
> many writes to any one file before trying to nudge some of it out of
> there. The bat that keeps hitting me in the head here is that right
> now, a single fsync might have a full 1GB of writes to flush out,
> perhaps because it extended a table and then wrote more than that to
> it.  And in everything but an SSD or giant SAN cache situation, 1GB of
> I/O is just too much to fsync at a time without the OS choking a
> little on it.

Isn't this pretty much like tuning vm.dirty_bytes?  We generally set it
to pretty low values, and that seems to help smooth out the checkpoints.
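
For comparison, the per-file counting idea quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not PostgreSQL code: the class name `AbsorbTracker`, the `threshold` parameter (the "<N>" in the quote), and the `nudge` callback are all hypothetical, and the callback merely stands in for whatever would issue sync_file_range on Linux. Unlike vm.dirty_bytes, which is a system-wide kernel setting, the counter here is kept per relation file.

```python
from collections import Counter
from typing import Callable, Hashable


class AbsorbTracker:
    """Count absorbed fsync requests per file and nudge writeback every N calls.

    All names are hypothetical; this only illustrates the counting scheme.
    """

    def __init__(self, threshold: int, nudge: Callable[[Hashable], None]) -> None:
        if threshold < 1:
            raise ValueError("threshold must be >= 1")
        self.threshold = threshold
        # Assumption: in a real system this callback would start writeback,
        # for example through sync_file_range(2) on Linux.
        self.nudge = nudge
        # file key -> number of absorbs seen since the last nudge
        self.pending = Counter()

    def absorb(self, file_key: Hashable) -> bool:
        """Record one absorbed request; return True if a nudge was issued."""
        self.pending[file_key] += 1
        if self.pending[file_key] >= self.threshold:
            self.nudge(file_key)
            del self.pending[file_key]  # start counting from zero again
            return True
        return False


# Usage: record which files were nudged instead of touching the disk.
nudged = []
tracker = AbsorbTracker(threshold=3, nudge=nudged.append)
for key in ["rel_a", "rel_b", "rel_a", "rel_a", "rel_a"]:
    tracker.absorb(key)
```

Passing the flush action in as a callback keeps the sketch portable and testable; the counting logic does not depend on how the writeback is actually triggered.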

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99

