Re: Improvement of checkpoint IO scheduler for stable transaction responses - Mailing list pgsql-hackers

From	Greg Smith
Subject	Re: Improvement of checkpoint IO scheduler for stable transaction responses
Date	July 14, 2013 19:13:57
Msg-id	51E2F865.8030406@2ndQuadrant.com
In response to	Re: Improvement of checkpoint IO scheduler for stable transaction responses (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Improvement of checkpoint IO scheduler for stable transaction responses Re: Improvement of checkpoint IO scheduler for stable transaction responses Re: Improvement of checkpoint IO scheduler for stable transaction responses
List	pgsql-hackers

On 6/27/13 11:08 AM, Robert Haas wrote:
> I'm pretty sure Greg Smith tried it the fixed-sleep thing before and
> it didn't work that well.

That's correct; I spent about a year whipping that particular horse and 
submitted improvements on it to the community. 
http://www.postgresql.org/message-id/4D4F9A3D.5070700@2ndquadrant.com 
and its updates downthread are good ones to compare this current work 
against.

The important thing to realize about just delaying fsync calls is that 
it *cannot* increase TPS throughput.  It's not possible in theory, and it 
doesn't happen in practice.  The most efficient way to write things out 
is to delay those writes as long as possible.  The longer you postpone a 
write, the more elevator sorting and write combining you get out of the 
OS.  This is why operating systems like Linux come tuned for such 
delayed writes in the first place.  Throughput and latency are linked; 
any patch that aims to decrease latency will probably slow throughput.
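
As a rough illustration of the write-combining half of that argument, here 
is a toy Python model.  It is not how any OS cache actually works: the 
function name, the flush-window model, and the workload are all made up 
for the example.  It only shows that repeated writes to the same page 
collapse into fewer physical writes when flushing is postponed.

```python
def physical_writes(page_writes, flush_every):
    """Count writes that reach disk when dirty pages are flushed once
    every `flush_every` logical writes (a hypothetical model).

    Repeated writes to the same page inside one flush window are
    combined into a single physical write.
    """
    total = 0
    for start in range(0, len(page_writes), flush_every):
        window = page_writes[start:start + flush_every]
        total += len(set(window))  # one physical write per distinct page
    return total


# Made-up workload: 8 logical writes that keep revisiting pages 1-3.
workload = [1, 2, 1, 3, 2, 1, 3, 2]

for flush_every in (1, 4, 8):
    print(flush_every, physical_writes(workload, flush_every))
```

With this workload, flushing after every write costs 8 physical writes, 
while waiting for all 8 logical writes costs only 3.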

Accordingly, the current behavior--no delay--is already the best 
possible throughput.  If you apply a write timing change and it seems to 
increase TPS, that's almost certainly because it executed fewer 
checkpoint writes.  It's not a fair comparison.  You have to adjust any 
delaying to still hit the same end point on the checkpoint schedule. 
That's what my later submissions did, and under that sort of controlled 
condition most of the improvements went away.
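
To make the "same end point" requirement concrete, here is a minimal 
Python sketch.  It is not code from any patch in this thread; the function 
names, the simulated clock, and the assumption that the cost of each fsync 
is known in advance are all hypothetical.  It compares a fixed sleep 
before every fsync with a pause sized from the time left before the sync 
phase has to finish.

```python
def pause_before_fsync(now, deadline, files_left, est_fsync_cost):
    """Pause that spreads the remaining fsyncs over the time left.

    `est_fsync_cost` is an assumed per-file cost; a real system would
    have to estimate it rather than know it.
    """
    slack = deadline - now - est_fsync_cost * files_left
    return max(0.0, slack / files_left)


def finish_time(fsync_costs, deadline=None, fixed_pause=0.0):
    """Simulated time at which the last fsync completes."""
    now = 0.0
    for i, cost in enumerate(fsync_costs):
        files_left = len(fsync_costs) - i
        if deadline is None:
            now += fixed_pause  # fixed sleep, blind to the schedule
        else:
            now += pause_before_fsync(now, deadline, files_left, cost)
        now += cost  # time spent inside the fsync itself
    return now


costs = [1.0, 1.0, 1.0, 1.0]  # made-up fsync durations, in seconds

print(finish_time(costs))                   # no delay
print(finish_time(costs, fixed_pause=3.0))  # fixed sleep per file
print(finish_time(costs, deadline=8.0))     # deadline-aware spacing
```

In this toy run the fixed sleep pushes the end of the sync phase from 4 
seconds out to 16, so the checkpoint does less work per unit of time.  The 
deadline-aware version still spaces the calls but finishes at the 8 second 
target, which is the condition a fair comparison has to hold constant.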

Now, I still do really believe that better spacing of fsync calls helps 
latency in the real world.  As far as I know, the server that I developed 
that patch for originally in 2010 is still running with that change. 
The result is not a throughput change though; there is a throughput drop 
with a latency improvement.  That is the unbreakable trade-off in this 
area if all you touch is scheduling.

The reason I was ignoring this discussion and working on pgbench 
throttling until now is that you need to measure latency at a constant 
throughput to make progress on this topic, and that's exactly what the 
new pgbench feature enables.  If we can take the current checkpoint 
scheduler and an altered one, run both at exactly the same rate, and one 
gives lower latency, now we're onto something.  It's possible to do that 
with DBT-2 as well, but I wanted something really simple that people 
could replicate results with in pgbench.
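
The measurement idea can be sketched in a few lines of Python.  This is 
only a simulation under stated assumptions, not pgbench's implementation: 
one connection, evenly spaced scheduled start times, and two made-up 
lists of per-transaction service times standing in for the current and 
altered schedulers.  Latency is counted from the scheduled start, so time 
spent queued behind a stalled transaction shows up in the result.

```python
def latencies_at_fixed_rate(service_times, rate):
    """Per-transaction latency when starts are scheduled at `rate` TPS.

    Latency runs from the scheduled start to completion, so it includes
    any lag caused by an earlier transaction still being in progress.
    """
    interval = 1.0 / rate
    free_at = 0.0  # simulated time at which the connection is idle again
    result = []
    for i, service in enumerate(service_times):
        scheduled = i * interval
        begin = max(scheduled, free_at)
        free_at = begin + service
        result.append(free_at - scheduled)
    return result


# Hypothetical service times in seconds; one has a checkpoint-style stall.
current = [0.1, 0.1, 0.9, 0.1, 0.1, 0.1]
altered = [0.2, 0.2, 0.3, 0.2, 0.2, 0.2]

rate = 2.0  # both runs are driven at the same 2 TPS
print(max(latencies_at_fixed_rate(current, rate)))
print(max(latencies_at_fixed_rate(altered, rate)))
```

Both runs execute the same number of transactions at the same rate, so 
any difference in worst-case latency comes from the service-time profile 
rather than from one run quietly doing less work.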

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
