Re: Group commit, revised - Mailing list pgsql-hackers

From Greg Smith
Subject Re: Group commit, revised
Msg-id 4F25B807.5020103@2ndQuadrant.com
In response to Re: Group commit, revised  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Group commit, revised  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On 01/28/2012 07:48 PM, Jeff Janes wrote:
> Others are going to test this out on high-end systems. I wanted to
> try it out on the other end of the scale.  I've used a Pentium 4,
> 3.2GHz, with 2GB of RAM and with a single IDE drive running ext4.
> ext4 is amazingly bad on IDE, giving about 25 fsync's per second (and
> it lies about fdatasync, but apparently not about fsync)

Fantastic.  I had to stop for a second to check the date on your message 
and make sure it hadn't come from some mail server that's been backed up 
on delivery for the last five years.  I'm cleaning house toward testing 
this out here, and I was going to test on the same system using both 
fast and horribly slow drives.  Both ends of the scale are important, 
and they benefit in very different ways from these changes.

> I haven't inspected that deep fall off at 30 clients for the patch.
> By way of reference, if I turn off synchronous commit, I get
> tps=1245.8 which is 100% CPU limited.  This sets a theoretical upper
> bound on what could be achieved by the best possible group committing
> method.

This sort of thing is why I suspect that to completely isolate some 
results, we're going to need a moderately high end server--with lots of 
cores--combined with an intentionally mismatched slow drive.  It's too 
easy to get pgbench and/or PostgreSQL to choke on something other than 
I/O when using smaller core counts.  I don't think I have anything where 
the floor is 24 TPS per client though.  Hmmm...I think I can connect an 
IDE drive to my MythTV box and format it with ext4.  Thanks for the test 
idea.
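
To put rough numbers on the two limits quoted above, here is a 
back-of-the-envelope sketch.  It assumes a deliberately simplified model 
that is not from the patch: every fsync commits one transaction per 
waiting client, and nothing but the fsync rate and the CPU ceiling limits 
throughput.  The function names are made up for illustration; the 25 
fsyncs/second and 1245.8 TPS figures are the ones from the quoted test.

```python
import math

# Figures taken from the quoted test results.
FSYNCS_PER_SEC = 25.0    # what the IDE drive on ext4 managed
CPU_LIMIT_TPS = 1245.8   # tps with synchronous commit turned off


def ideal_group_commit_tps(clients,
                           fsyncs_per_sec=FSYNCS_PER_SEC,
                           cpu_limit_tps=CPU_LIMIT_TPS):
    """Best-case TPS if every fsync commits one transaction per client.

    Simplifying assumption: clients never miss a flush and there is no
    lock contention, so throughput is capped only by the CPU ceiling.
    """
    return min(clients * fsyncs_per_sec, cpu_limit_tps)


def clients_to_saturate(fsyncs_per_sec=FSYNCS_PER_SEC,
                        cpu_limit_tps=CPU_LIMIT_TPS):
    """Smallest client count at which the CPU ceiling becomes the limit."""
    return math.ceil(cpu_limit_tps / fsyncs_per_sec)


if __name__ == "__main__":
    for clients in (1, 10, 30, 100):
        print(clients, ideal_group_commit_tps(clients))
    print("clients needed to reach the CPU limit:", clients_to_saturate())
```

Under that model the drive stops being the bottleneck somewhere around 
50 clients, which is one reason a small machine can end up choking on 
something other than I/O.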

One thing you could try on this system is using the -N option ("Do not 
update pgbench_tellers and pgbench_branches").  That eliminates a lot of 
the contention that might be pulling down your higher core count tests, 
while still giving a completely valid test of whether the group commit 
mechanism works.  I'm not sure whether that will push up the top end 
usefully for you, but it's worth a try if you have time to test again.
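
A minimal sketch of what such a sweep could look like, written as a small 
Python helper that only builds the pgbench command lines.  The client 
counts, the 60 second run time, and the database name "pgbench" are 
placeholders chosen for illustration, not the settings used in the quoted 
test; -c, -T and -N are standard pgbench options.

```python
def pgbench_command(clients, seconds=60, skip_some_updates=True,
                    dbname="pgbench"):
    """Build the argv list for one pgbench run at a given client count."""
    cmd = ["pgbench", "-c", str(clients), "-T", str(seconds)]
    if skip_some_updates:
        # -N: do not update pgbench_tellers and pgbench_branches
        cmd.append("-N")
    cmd.append(dbname)
    return cmd


if __name__ == "__main__":
    # Hypothetical client counts; print the commands rather than run them.
    for clients in (1, 2, 4, 8, 16, 30):
        print(" ".join(pgbench_command(clients)))
```

Each list can be handed to subprocess.run() unchanged once a pgbench 
database has been initialized.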

> If the group_commit patch goes in, would we then rip out commit_delay
> and commit_siblings?

The main reason those are still hanging around at all is to allow 
pushing on the latency vs. throughput trade-off on really busy systems.  
The use case is that you expect, say, 10 clients to constantly be 
committing at a high rate.  So if there's just one committing so far, 
assume it's the leading edge of a wave and pause a bit for the rest to 
come in.  I don't think the cases where this is useful behavior--where 
people both want it and the current mechanism provides it--are very 
common in the real world.  It can be useful for throughput-oriented 
benchmarks though, which is why I'd say it hasn't been killed off yet.

We'll have to see whether the final form of this patch will usefully 
replace that sort of thing.  I'd certainly be in favor of nuking 
commit_delay and commit_siblings in exchange for a better option; it 
would be nice if we didn't eliminate this tuning option in the process 
though.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


