Re: Group commit, revised - Mailing list pgsql-hackers
From | Jeff Janes |
---|---|
Subject | Re: Group commit, revised |
Date | |
Msg-id | CAMkU=1whmdC+o7RP-Lc3Gd_7OS2LM4S6Ltk5jc2ZrsNjvL4gvg@mail.gmail.com |
In response to | Re: Group commit, revised (Greg Smith <greg@2ndQuadrant.com>) |
List | pgsql-hackers |
On Sun, Jan 29, 2012 at 1:20 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> On 01/28/2012 07:48 PM, Jeff Janes wrote:
>>
>> I haven't inspected that deep fall off at 30 clients for the patch.
>> By way of reference, if I turn off synchronous commit, I get
>> tps=1245.8 which is 100% CPU limited.  This sets a theoretical upper
>> bound on what could be achieved by the best possible group committing
>> method.
>
>
> This sort of thing is why I suspect that to completely isolate some results,
> we're going to need a moderately high end server--with lots of
> cores--combined with an intentionally mismatched slow drive.  It's too easy
> to get pgbench and/or PostgreSQL to choke on something other than I/O when
> using smaller core counts.  I don't think I have anything where the floor is
> 24 TPS per client though.  Hmmm...I think I can connect an IDE drive to my
> MythTV box and format it with ext4.  Thanks for the test idea.
>
> One thing you could try on this system is using the -N "Do not update
> pgbench_tellers and pgbench_branches".  That eliminates a lot of the
> contention that might be pulling down your higher core count tests, while
> still giving a completely valid test of whether the group commit mechanism
> works.  Not sure whether that will push up the top-end usefully for you,
> worth a try if you have time to test again.

Adding the -N did eliminate the fall-off at 30 clients for the
group_commit patch.  But I still want to explore why the fall-off
occurs when I get a chance.  I know why the curve would stop going up
without using -N (with -s of 40 and -c of 30, many connections will be
waiting on row locks for updates to pgbench_branches) but that should
cause a leveling off, not a collapse.

Other than the lack of drop-off at 30 clients, -N didn't meaningfully
change anything.  Everyone got slightly faster except at -c1.

>> If the group_commit patch goes in, would we then rip out commit_delay
>> and commit_siblings?
>
>
> The main reason those are still hanging around at all is to allow pushing
> on the latency vs. throughput trade-off on really busy systems.  The use
> case is that you expect, say, 10 clients to constantly be committing at a
> high rate.  So if there's just one committing so far, assume it's the
> leading edge of a wave and pause a bit for the rest to come in.  I don't
> think the cases where this is useful behavior--people both want it and the
> current mechanism provides it--are very common in the real world.

The tests I did are exactly that environment where commit_delay might
be expected to help.  And it did help, but just not all that much.
One of the problems is that, while it does wait for the others to come
in and then flushes them in one fsync, often the others never get
woken up successfully to realize that they have already been flushed.
They continue to block.

The group_commit patch, on the other hand, accomplishes exactly what
commit_delay was intended to accomplish but doesn't do a very good job
of.

With the -N option, I also used commit_delay on top of group_commit,
and the difference between the two looked like it was within the
margin of error.  So commit_delay did not obviously cause further
improvement.

> It can be
> useful for throughput oriented benchmarks though, which is why I'd say it
> hasn't been killed off yet.
>
> We'll have to see whether the final form this makes sense in will usefully
> replace that sort of thing.  I'd certainly be in favor of nuking
> commit_delay and commit_siblings with a better option; it would be nice if
> we don't eliminate this tuning option in the process though.

But I'm pretty sure that group_commit has stolen that thunder.
Obviously a few benchmarks on one system aren't enough to prove that,
though.

The only use case I see left for commit_delay is where it is set on a
per-connection basis rather than system-wide.  Once you start a fsync,
everyone who missed the bus is locked out until the next one.
So low-priority connections can set commit_delay so as not to trigger
the bus to leave before the high-priority process gets on.  But that
seems like a pretty tenuous use case with better ways to do it.

Cheers,

Jeff
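
To put a rough number on the row-lock remark earlier in this message (-s of 40 and -c of 30), here is a back-of-the-envelope sketch in Python. It assumes the standard pgbench TPC-B-like script, where a scale factor of 40 gives 40 rows in pgbench_branches and each transaction updates one of them chosen uniformly at random. It further assumes that all 30 clients are mid-transaction at the same instant. The function names are made up for illustration; this is a simplified model, not a measurement.

```python
from math import prod


def p_all_distinct(clients: int, branches: int) -> float:
    """Probability that every client picks a different branch row.

    Assumes each client independently picks one of `branches` rows
    uniformly at random (a simplification of the pgbench workload).
    """
    if clients > branches:
        return 0.0  # pigeonhole: at least two clients must share a row
    return prod((branches - i) / branches for i in range(clients))


def p_client_shares_row(clients: int, branches: int) -> float:
    """Probability that a given client's row is also picked by another client."""
    # Each of the other (clients - 1) clients misses our row with
    # probability (branches - 1) / branches.
    return 1.0 - ((branches - 1) / branches) ** (clients - 1)


if __name__ == "__main__":
    clients, branches = 30, 40  # -c 30, and -s 40 => 40 branch rows (assumed)
    print(f"P(no two clients share a branch row) = {p_all_distinct(clients, branches):.2e}")
    print(f"P(a given client shares its row)     = {p_client_shares_row(clients, branches):.3f}")
```

Under those assumptions, roughly half of the clients at any moment are aiming at a branch row that some other client also wants. That fits a throughput curve that flattens out, but does not by itself explain a collapse.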