Re: Load distributed checkpoint - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Load distributed checkpoint
Date
Msg-id 1167260097.3633.60.camel@silverbirch.site
In response to Re: Load distributed checkpoint  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-hackers
On Wed, 2006-12-27 at 23:26 +0100, Martijn van Oosterhout wrote:
> On Wed, Dec 27, 2006 at 09:24:06PM +0000, Simon Riggs wrote:
> > On Fri, 2006-12-22 at 13:53 -0500, Bruce Momjian wrote:
> > 
> > > I assume other kernels have similar I/O smoothing, so that data sent to
> > > the kernel via write() gets to disk within 30 seconds.  
> > > 
> > > I assume write() is not our checkpoint performance problem, but the
> > > transfer to disk via fsync().  
> > 
> > Well, its correct to say that the transfer to disk is the source of the
> > problem, but that doesn't only occur when we fsync(). There are actually
> > two disk storms that occur, because of the way the fs cache works. [Ron
> > referred to this effect uplist]
> 
> As someone looking from the outside:
> 
> fsync only works on one file, so presumably the checkpoint process is
> opening each file one by one and fsyncing them. 

Yes

> Does that make any
> difference here? Could you adjust the timing here?

That's the hard bit with I/O storm 2. When you fsync a file you don't
actually know how many blocks you're writing, plus there's no way to
slow down those writes by putting delays between them (although it's
possible your controller might know how to do this, I've never heard of
one that does).
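
As a rough sketch of the per-file loop being discussed (Python for brevity; the helper name fsync_each_file and its delay_s parameter are hypothetical and this is not PostgreSQL's checkpoint code), note that the only thing the caller can observe is how long each fsync took, and the only place a pause can go is between files:

```python
import os
import time


def fsync_each_file(paths, delay_s=0.0):
    """fsync every file in turn, optionally sleeping after each call.

    Hypothetical helper for illustration; not PostgreSQL's checkpoint code.
    Returns a list of (path, seconds spent inside fsync).
    """
    timings = []
    for path in paths:
        fd = os.open(path, os.O_RDWR)
        try:
            start = time.monotonic()
            # fsync() blocks until the kernel has flushed this file's dirty
            # blocks. The caller cannot see how many blocks that is, and
            # cannot pause partway through the flush.
            os.fsync(fd)
            timings.append((path, time.monotonic() - start))
        finally:
            os.close(fd)
        # The only place a delay can go is between files.
        if delay_s > 0:
            time.sleep(delay_s)
    return timings
```

The elapsed time is only known after the call returns, so it cannot be used to throttle the writes that caused it.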

If we put a delay after each fsync, that will space out all the ones
that don't need spacing out and do nothing to the ones that most need
it. Unfortunately.
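
A toy model makes the point concrete. It assumes, purely for illustration, that each fsync flushes all of a file's dirty blocks as one burst; the function name and the workload numbers are made up:

```python
def simulate_checkpoint(dirty_blocks_per_file, delay_s):
    """Toy model: one fsync per file, each flushing all of that file's
    dirty blocks as a single burst, followed by a fixed sleep.

    Returns (largest_burst_in_blocks, total_sleep_seconds).
    """
    largest_burst = 0
    total_sleep = 0.0
    for dirty in dirty_blocks_per_file:
        # The burst size is set by the file's dirty blocks, not by the delay.
        largest_burst = max(largest_burst, dirty)
        total_sleep += delay_s
    return largest_burst, total_sleep


# Hypothetical workload: several nearly clean files and one hot table.
workload = [2, 0, 1, 50000, 3]
no_delay = simulate_checkpoint(workload, delay_s=0.0)
with_delay = simulate_checkpoint(workload, delay_s=0.5)
```

In this model the delay lengthens the checkpoint by the same amount for every file, including the ones with almost nothing to write, while the largest burst stays exactly as large as before.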

IMHO there isn't any simple scheme that works all the time, for all OS
settings, default configurations and mechanisms.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com



