Re: checkpoint patches - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: checkpoint patches
Date
Msg-id 4F6F800D.8000808@nasby.net
In response to Re: checkpoint patches  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: checkpoint patches  (Robert Haas <robertmhaas@gmail.com>)
Re: checkpoint patches  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers
On 3/23/12 7:38 AM, Robert Haas wrote:
> And here are the latency results for 95th-100th percentile with
> checkpoint_timeout=16min.
>
> ckpt.master.13: 1703, 1830, 2166, 17953, 192434, 43946669
> ckpt.master.14: 1728, 1858, 2169, 15596, 187943, 9619191
> ckpt.master.15: 1700, 1835, 2189, 22181, 206445, 8212125
>
> The picture looks similar here.  Increasing checkpoint_timeout isn't
> *quite*  as good as spreading out the fsyncs, but it's pretty darn
> close.  For example, looking at the median of the three 98th
> percentile numbers for each configuration, the patch bought us a 28%
> improvement in 98th percentile latency.  But increasing
> checkpoint_timeout by a minute bought us a 15% improvement in 98th
> percentile latency.  So it's still not clear to me that the patch is
> doing anything on this test that you couldn't get just by increasing
> checkpoint_timeout by a few more minutes.  Granted, it lets you keep
> your inter-checkpoint interval slightly smaller, but that's not that
> exciting.  That having been said, I don't have a whole lot of trouble
> believing that there are other cases where this is more worthwhile.

I wouldn't be too quick to dismiss increasing checkpoint frequency (i.e., decreasing checkpoint_timeout).

On a high-value production system you're going to care quite a bit about recovery time. I certainly wouldn't want to
run our systems with checkpoint_timeout='15 min' if I could avoid it.
 

Another $0.02: I don't recall the community using pgbench much at all to measure latency... I believe it's something
fairly new. I point this out because I believe there are differences in analysis that you need to do for TPS vs latency.
I think Robert's graphs support my argument; the numeric X-percentile data might not look terribly good, but reducing
peak latency from 100ms to 60ms could be a really big deal on a lot of systems. My intuition is that one or both of
these patches actually would be valuable in the real world; it would be a shame to throw them out because we're not sure
how to performance test them...
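
To make the TPS-vs-latency point a bit more concrete, here is a rough Python sketch of the kind of per-percentile summary Robert describes (the median across the three runs). The numbers are copied from the quoted results above; the function name and the dictionary layout are just illustrative assumptions, and this is not how the original figures were produced.

```python
from statistics import mean, median

# 95th..100th percentile latencies per run, copied from the quoted results.
# Units are whatever the quoted report used; they are not restated here.
PERCENTILES = [95, 96, 97, 98, 99, 100]
RUNS = {
    "ckpt.master.13": [1703, 1830, 2166, 17953, 192434, 43946669],
    "ckpt.master.14": [1728, 1858, 2169, 15596, 187943, 9619191],
    "ckpt.master.15": [1700, 1835, 2189, 22181, 206445, 8212125],
}


def summarize(runs, percentile):
    """Summarize one percentile column across all runs (hypothetical helper)."""
    idx = PERCENTILES.index(percentile)
    values = [row[idx] for row in runs.values()]
    return {
        "median": median(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


for p in PERCENTILES:
    s = summarize(RUNS, p)
    # The spread between min and max shows how noisy each percentile is.
    print(f"{p}th: median={s['median']} min={s['min']} max={s['max']}")
```

The 95th-97th percentile columns barely move between runs, while the 100th percentile varies by more than 5x. A single averaged figure, which is how TPS is usually reported, would hide that difference.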
 
-- 
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net

