Re: Controlling Load Distributed Checkpoints - Mailing list pgsql-hackers

From Greg Smith
Subject Re: Controlling Load Distributed Checkpoints
Msg-id Pine.GSO.4.64.0706061328450.27416@westnet.com
In response to Re: Controlling Load Distributed Checkpoints  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, 6 Jun 2007, Tom Lane wrote:

> If we don't know how to tune them, how will the users know?

I can tell you a good starting set for them to use on a Linux system, but you 
first have to let me know how much memory is in the OS buffer cache, the 
typical I/O rate the disks can support, how many buffers are expected to 
be written out by BGW/other backends at heaviest load, and the current 
setting for /proc/sys/vm/dirty_background_ratio.  It's not a coincidence 
that there are patches applied to 8.3 or in the queue to measure all of 
the Postgres internals involved in that computation; I've been picking 
away at the edges of this problem.
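
To make that list of inputs concrete, here is a rough Python sketch of how the four quantities might be combined. It is an illustration only, not the computation described in this message: the function names are hypothetical, the 8192-byte block size is the PostgreSQL default, and applying dirty_background_ratio to the buffer cache size is a simplification of what the kernel actually does.

```python
from pathlib import Path

BLOCK_SIZE = 8192  # default PostgreSQL block size, in bytes


def read_dirty_background_ratio(path="/proc/sys/vm/dirty_background_ratio"):
    """Return the Linux setting as an integer percentage, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def checkpoint_estimate(cache_bytes, io_bytes_per_sec, buffers_written, ratio_pct):
    """Rough estimate only (hypothetical helper, simplified model).

    cache_bytes      -- memory in the OS buffer cache
    io_bytes_per_sec -- typical write rate the disks can sustain
    buffers_written  -- buffers expected to be written at heaviest load
    ratio_pct        -- value of dirty_background_ratio
    """
    checkpoint_bytes = buffers_written * BLOCK_SIZE
    # Assumption: background writeback starts once dirty data passes
    # this share of the cache.
    threshold_bytes = cache_bytes * ratio_pct / 100
    return {
        "checkpoint_bytes": checkpoint_bytes,
        "background_threshold_bytes": threshold_bytes,
        "exceeds_threshold": checkpoint_bytes > threshold_bytes,
        "drain_seconds": checkpoint_bytes / io_bytes_per_sec,
    }
```

On a Linux box, `read_dirty_background_ratio()` supplies the last argument; the other three have to be measured or estimated for the system in question.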

Getting this sort of tuning right takes that level of information about 
the underlying system.  If there's a way to internally auto-tune the 
values this patch operates on (which I haven't found despite months of 
trying), it would be in the form of some sort of measurement/feedback loop 
based on how fast data is being written out.  There really are way too 
many things involved to try and tune it based on anything else; the 
underlying OS/hardware mechanisms that determine how this will go are 
complicated enough that it might as well be a black box for most people.
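
As a sketch of what a measurement/feedback loop of that kind could look like, the Python function below adjusts the pause between writes according to how the measured write rate compares with a target. Everything here is hypothetical (the function name, the proportional gain, the clamping limits), and it is not part of the LDC patch; it only illustrates the shape of such a loop.

```python
def next_delay(current_delay, measured_rate, target_rate,
               gain=0.5, min_delay=0.001, max_delay=1.0):
    """Return the next pause between writes, in seconds.

    measured_rate and target_rate are in bytes per second. If writes are
    going out faster than the target, the pause grows; if they are going
    out slower, it shrinks. The result is clamped to [min_delay, max_delay].
    """
    if target_rate <= 0:
        raise ValueError("target_rate must be positive")
    # Relative error: positive when writing faster than the target.
    error = (measured_rate - target_rate) / target_rate
    proposed = current_delay * (1.0 + gain * error)
    return max(min_delay, min(max_delay, proposed))
```

A caller would measure the write rate over each interval, feed it in, and sleep for the returned delay before the next batch. Picking the target rate is still the hard part, which is the point made above.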

One of the things I've been fiddling with is the design of a testing 
program that simulates database activity at checkpoint time under load. 
I think running some tests like that is the most straightforward way to 
generate useful values for these tunables; it's much harder to try and 
determine them from within the backends because there's so much going on 
to keep track of.
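
A very small version of such a test might look like the Python sketch below. It is not the testing program mentioned above, and the names are hypothetical: it writes a burst of database-sized blocks to a temporary file, optionally pausing between writes, then times the fsync. It does not generate the competing load a real test would need.

```python
import os
import tempfile
import time


def simulate_checkpoint(num_blocks, block_size=8192, pause=0.0, directory=None):
    """Write num_blocks blocks to a temp file, fsync, and report timings.

    pause is an optional sleep (seconds) between writes, standing in for
    spreading the checkpoint out over time.
    """
    block = os.urandom(block_size)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(num_blocks):
            os.write(fd, block)
            if pause:
                time.sleep(pause)
        write_done = time.perf_counter()
        os.fsync(fd)  # force the dirty pages out to disk
        sync_done = time.perf_counter()
    finally:
        os.close(fd)
        os.unlink(path)
    return {
        "bytes": num_blocks * block_size,
        "write_seconds": write_done - start,
        "fsync_seconds": sync_done - write_done,
    }
```

Running it with different block counts and pauses, and pointing `directory` at the disk under test, gives a rough picture of how long the final sync takes as the write burst is spread out.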

I view the LDC mechanism as being in the same state right now as the 
background writer:  there are a lot of complicated knobs to tweak, they 
all do *something* useful for someone, and eliminating them will require a 
data-collection process across a much wider sample of data than can be 
collected quickly.  If I had to make a guess how this will end up, I'd 
expect there to be more knobs in LDC than everyone would like for the 8.3 
release, along with fairly verbose logging of what is happening at 
checkpoint time (that's why I've been nudging development in that area, 
along with making logs easier to aggregate).  Collect up enough of that 
information, then you're in a position to talk about useful automatic 
tuning--right around the 8.4 timeframe I suspect.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

