Re: checkpoint patches - Mailing list pgsql-hackers

From Greg Smith
Subject Re: checkpoint patches
Date
Msg-id 4F7BCE72.5020801@2ndQuadrant.com
In response to Re: checkpoint patches  (Jim Nasby <jim@nasby.net>)
Responses Re: checkpoint patches  (Jim Nasby <jim@nasby.net>)
List pgsql-hackers
On 03/25/2012 04:29 PM, Jim Nasby wrote:
> Another $0.02: I don't recall the community using pg_bench much at all
> to measure latency... I believe it's something fairly new. I point
> this out because I believe there are differences in analysis that you
> need to do for TPS vs latency. I think Robert's graphs support my
> argument; the numeric X-percentile data might not look terribly good,
> but reducing peak latency from 100ms to 60ms could be a really big
> deal on a lot of systems. My intuition is that one or both of these
> patches actually would be valuable in the real world; it would be a
> shame to throw them out because we're not sure how to performance test
> them...

One of these patches is already valuable in the real world.  There it
will stay, while we continue mining it for nuggets of deeper insight
into the problem that can lead into a better test case.

Staring at pgbench latency worked out fairly well for some things.
Last year around this time I published some results I summarized at
http://blog.2ndquadrant.com/en/gregs-planetpostgresql/2011/02/ , which
included things like worst-case latency going from <=34 seconds on ext3
to <=5 seconds on xfs.
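
For anyone who wants to pull worst-case and percentile latency figures like these out of a run, here is a minimal sketch. It assumes the per-transaction log written by pgbench -l, where the third whitespace-separated field is the elapsed transaction time in microseconds. The function name latency_summary and the nearest-rank percentile method are illustrative choices, not how the results above were produced.

```python
def latency_summary(log_lines, percentiles=(90, 99)):
    """Summarize per-transaction latency in milliseconds.

    Assumes pgbench -l style lines, where the third field is the
    elapsed transaction time in microseconds. Percentiles must be
    integers; they are computed with the nearest-rank method.
    """
    latencies_ms = []
    for line in log_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        latencies_ms.append(int(fields[2]) / 1000.0)

    if not latencies_ms:
        return {}

    latencies_ms.sort()
    n = len(latencies_ms)
    summary = {"count": n, "max_ms": latencies_ms[-1]}
    for p in percentiles:
        # Nearest rank: ceil(p * n / 100), done in integer arithmetic.
        rank = max(1, (p * n + 99) // 100)
        summary["p%d_ms" % p] = latencies_ms[rank - 1]
    return summary


if __name__ == "__main__":
    # Hypothetical sample: ten transactions taking 1 ms through 10 ms.
    sample = ["0 %d %d 0 1333000000 0" % (i, i * 1000) for i in range(1, 11)]
    print(latency_summary(sample))
```

The maximum is reported separately from the percentiles because a single multi-second stall can vanish from a 90th or 99th percentile figure while still being the number that matters for worst-case behavior.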

The problem I keep hitting now is that 2 to 5 second latencies on Linux
are extremely hard to get rid of if you overwhelm storage--any storage.
That's where the wall is, where if you try to drive them lower than that
you pay some hard trade-off penalties, if it works at all.

Take a look at the graph I've attached.  That's a slow drive not able to
keep up with lots of random writes stalling, right?  No.  It's a
Fusion-io card that will do 600MB/s of random I/O.  But clog it up with
an endless stream of pgbench writes, never with any pause to catch up,
and I can get Linux to clog it for many seconds whenever I set it loose.

This test workload is so not representative of the real world that I
don't think we should be committing things justified by it, unless they
are uncontested wins.  And those aren't so easy to find on the write
side of things.

Thanks to Robert for shaking my poorly submitted patch and seeing what
happened.  I threw mine out in hopes that some larger checkpoint patch
shoot-out might find it useful.  Didn't happen, sorry I didn't get to
looking more at the other horses.  I do have some more neat benchmarks
to share though.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


Attachment
