Thread: 8.3 synchronous_commit
I might completely misunderstand this feature. Shouldn't "synchronous_commit = off" improve performance?

Whatever I do, I find "synchronous_commit = off" degrades performance. It especially dislikes the CFQ I/O scheduler; it's not as bad with deadline. Synthetic load like

pgbench -i -s 10 -U pgsql -d bench && pgbench -t 1000 -c 100 -U pgsql -d bench

or the same with scale 100.

Maybe it's just my test box: single SATA-II drive, XFS on top of LVM. I'll retry without LVM once I have another drive; I've seen LVM interfere with other things in the past.

--
Best regards,
Hannes Dorbath
Hello,

synchronous_commit = off works well for a specific kind of load. Try pgbench with only one connection; that is analogous to a database import or some administrative work.

Regards,
Pavel Stehule

On 21/01/2008, Hannes Dorbath <light@theendofthetunnel.de> wrote:
> I might completely misunderstand this feature. Shouldn't
> "synchronous_commit = off" improve performance?
>
> Whatever I do, I find "synchronous_commit = off" to degrade performance.
>
> Especially it doesn't like the CFQ I/O scheduler, it's not so bad with
> deadline. Synthetic load like
>
> pgbench -i -s 10 -U pgsql -d bench && pgbench -t 1000 -c 100 -U pgsql -d
> bench
>
> or the same with scale 100.
>
> Maybe it's just my test box.. single SATA-II drive, XFS on top of LVM.
>
> I'll retry without LVM once I have another drive.. I've seen LVM mess
> with other things in the past.
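To illustrate Pavel's point: asynchronous commit is aimed at workloads like bulk imports, where losing the last few transactions on a crash is acceptable in exchange for not waiting on WAL flushes. A minimal sketch of the relevant postgresql.conf settings (the wal_writer_delay value shown is the 8.3 default, not a tuning recommendation):

```ini
# postgresql.conf -- trade durability of the most recent transactions
# for lower commit latency. Data consistency is never at risk; only
# transactions committed in the last wal_writer_delay window or so can
# be lost on a crash.
synchronous_commit = off
wal_writer_delay = 200ms    # how often the WAL writer flushes (default)
```

It can also be enabled per session (SET synchronous_commit = off;), which fits the import/admin-work case better than changing it server-wide.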
On Mon, 2008-01-21 at 22:55 +0100, Hannes Dorbath wrote:
> I might completely misunderstand this feature. Shouldn't
> "synchronous_commit = off" improve performance?
>
> Whatever I do, I find "synchronous_commit = off" to degrade performance.
>
> Especially it doesn't like the CFQ I/O scheduler, it's not so bad with
> deadline. Synthetic load like

The CFQ scheduler is bad for performance in the tests that I have run. When
I have a chance I'll put together some tests to try to demonstrate that.

The reason it may be bad in your case: if you have many backends committing
many transactions asynchronously, and the WAL writer then tries to make
those transactions durable, CFQ might decide that the WAL writer is
"unfairly" using a lot of I/O. This is just speculation, though.

Regards,
	Jeff Davis
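For anyone reproducing this: on Linux the I/O scheduler is set per block device through sysfs. A small sketch for checking (and, as root, switching) it; the device name sda is an assumption:

```shell
# Show the active I/O scheduler for a block device. The active one is
# shown in brackets, e.g. "noop [cfq] deadline".
dev=sda    # assumption: adjust to your device
sched_file="/sys/block/$dev/queue/scheduler"
if [ -r "$sched_file" ]; then
    current=$(cat "$sched_file")
else
    current="(no $sched_file on this system)"
fi
echo "scheduler for $dev: $current"
# To switch (as root): echo deadline > /sys/block/sda/queue/scheduler
```

The change takes effect immediately and is not persistent across reboots unless set via the elevator= boot parameter.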
On Mon, 21 Jan 2008, Hannes Dorbath wrote:

> pgbench -i -s 10 -U pgsql -d bench && pgbench -t 1000 -c 100 -U pgsql -d
> bench

pgbench doesn't handle 100 clients at once very well on the same box as the
server, unless you have a pretty serious system. The pgbench program itself
has a single-process model that doesn't handle the CFQ round-robin very
well at all. On top of that, the database scale should be bigger than the
number of clients, or everybody just fights over the branches table. You
said you tried a larger scale as well, but that also vastly increases the
size of the database, which shifts the test to a completely different set
of bottlenecks. See
http://www.westnet.com/~gsmith/content/postgresql/pgbench-scaling.htm for
more on this.

Try something more in the range of 4 clients per CPU, and set the scale to
roughly twice that (so with a dual-core system you might use 8 clients and
a scale of 16). If you really want to simulate a large number of clients,
do that on another system and connect to the server remotely.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
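Greg's rule of thumb (roughly 4 clients per CPU, scale about twice the client count; both multipliers are his suggestion, not a pgbench requirement) can be sketched as:

```shell
# Derive pgbench parameters from the rule of thumb above.
cpus=$(getconf _NPROCESSORS_ONLN)
clients=$(( cpus * 4 ))     # ~4 clients per CPU
scale=$(( clients * 2 ))    # scale ~2x clients so the branches rows
                            # aren't all fought over
echo "pgbench -i -s $scale -U pgsql -d bench"
echo "pgbench -t 1000 -c $clients -U pgsql -d bench"
```

On a dual-core box this prints an 8-client, scale-16 run, matching the example in the message above.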
On Mon, 2008-01-21 at 18:31 -0500, Greg Smith wrote:
> pgbench doesn't handle 100 clients at once very well on the same box as
> the server, unless you have a pretty serious system. The pgbench program
> itself has a single process model that doesn't handle the CFQ round-robin
> very well at all. On top of that, the database scale should be bigger

He was referring to the CFQ I/O scheduler. I don't think that will affect
pgbench itself, because it doesn't read/write to disk, right?

Regards,
	Jeff Davis
On Mon, 21 Jan 2008, Jeff Davis wrote:

> He was referring to the CFQ I/O scheduler. I don't think that will
> affect pgbench itself, because it doesn't read/write to disk, right?

It does if you are writing latency log files, but it shouldn't in the cases
given. Still, there's something weird about the interplay between the
server's disk I/O under CFQ and the time slices the pgbench client gets
when they're all on the same system. pgbench is a badly scaling
single-process utility (in regard to how it simulates many clients), and
anything that occasionally starves its execution is deadly. I never dug
deep into the exact scheduler issues, but regardless, in this case it's
unrealistic for Hannes to expect to simulate 100 clients on a small system
and still get a true view of how the server on that system performs.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Mon, 21 Jan 2008 15:45:54 -0800, Jeff Davis <pgsql@j-davis.com> wrote:

> On Mon, 2008-01-21 at 18:31 -0500, Greg Smith wrote:
> > pgbench doesn't handle 100 clients at once very well on the same
> > box as the server, unless you have a pretty serious system. The
> > pgbench program itself has a single process model that doesn't
> > handle the CFQ round-robin very well at all. On top of that, the
> > database scale should be bigger
>
> He was referring to the CFQ I/O scheduler. I don't think that will
> affect pgbench itself, because it doesn't read/write to disk, right?

No, but with pgbench running locally it still competes with PostgreSQL for
resources. I ran some tests and can confirm that disabling
*synchronous_commit* degrades performance with both CFQ and deadline.

Kind Regards,
--
Fernando Ike
http://www.midstorm.org/~fike/weblog
* Hannes Dorbath:

> I might completely misunderstand this feature. Shouldn't
> "synchronous_commit = off" improve performance?

Indeed. We've seen something similar in one test, but couldn't reproduce it
in a clean environment so far.

> Maybe it's just my test box.. single SATA-II drive, XFS on top of LVM.

Ours was ext3, no LVM or RAID.

--
Florian Weimer <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99
On Jan 22, 2008 9:32 AM, Florian Weimer <fweimer@bfk.de> wrote:
> > Maybe it's just my test box.. single SATA-II drive, XFS on top of LVM.
>
> Ours was ext3, no LVM or RAID.

Also with SATA? If your SATA disk is lying about effectively syncing the
data, I'm not that surprised you don't see any improvement. Being slower is
a bit surprising, though.

--
Guillaume
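A rough way to spot a lying write cache is to time synchronous writes: a 7200 rpm drive can complete at most roughly 120 genuine synced writes per second to the same area, so rates in the thousands mean a volatile cache is absorbing the flushes. A sketch using GNU dd's oflag=dsync (the filename and block count are arbitrary; run it on the filesystem backed by the disk in question):

```shell
# Count O_DSYNC writes per second; requires GNU dd for oflag=dsync.
count=200
start=$(date +%s)
dd if=/dev/zero of=sync_test.bin bs=8k count=$count oflag=dsync 2>/dev/null
end=$(date +%s)
elapsed=$(( end - start ))
[ "$elapsed" -lt 1 ] && elapsed=1    # avoid divide-by-zero on fast runs
rate=$(( count / elapsed ))
echo "$rate synced writes/sec"
rm -f sync_test.bin
```

This is only a smoke test, not a substitute for a proper fsync benchmark, but it quickly separates "cache is flushed" from "cache is lying".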
* Guillaume Smet:

> On Jan 22, 2008 9:32 AM, Florian Weimer <fweimer@bfk.de> wrote:
>> > Maybe it's just my test box.. single SATA-II drive, XFS on top of LVM.
>>
>> Ours was ext3, no LVM or RAID.
>
> Also with SATA?

Yes, desktop-class SATA.

> If your SATA disk is lying about effectively SYNCing the data, I'm
> not that surprised you don't see any improvement.

It should still be faster, since the commit doesn't even have to leave the
kernel write cache. For us, this is the common case, because most
non-desktop systems have a battery-backed write cache anyway.

> Being slower is a bit surprising though.

Exactly.

--
Florian Weimer <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99
Guillaume Smet wrote:
> Also with SATA? If your SATA disk is lying about effectively SYNCing
> the data, I'm not that surprised you don't see any improvement. Being
> slower is a bit surprising though.

The disc is not lying, but LVM does not support write barriers, so the
result is the same: nothing flushes the disc's write cache on fsync. I
could, however, disable the disc's write cache entirely. One reason why I'm
a ZFS fanboy lately: it just gets all of this right by *default*.

Anyway, with further testing my benchmark results vary so much that further
discussion seems pointless. I'll repost when I have reproducible values and
have followed Greg Smith's advice.

--
Best regards,
Hannes Dorbath
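Disabling the drive's volatile write cache restores fsync durability when barriers don't work, at a large cost in commit rate. A hedged sketch using hdparm's -W flag (the device name is an assumption, and changing the setting requires root):

```shell
# Disable the on-drive write cache for an ATA/SATA disk; -W 0 turns it
# off, -W alone reports the current setting. Guarded so it is a no-op
# where hdparm or the device is absent.
dev=/dev/sda    # assumption: adjust to the drive holding pg_xlog
if command -v hdparm >/dev/null 2>&1 && [ -b "$dev" ]; then
    msg=$(hdparm -W 0 "$dev" 2>&1 || echo "hdparm failed (needs root?)")
else
    msg="hdparm unavailable or $dev not present; nothing changed"
fi
echo "$msg"
```

The setting does not survive a power cycle on many drives, so it is usually reapplied from a boot script.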
Greg Smith wrote:
> Try something more in the range of 4 clients/CPU and set the scale to
> closer to twice that (so with a dual-core system you might do 8 clients
> and a scale of 16). If you really want to simulate a large number of
> clients, do that on another system and connect to the server remotely.

With 4 clients and scale 10 I get 246 TPS with synchronous_commit disabled
and 634 TPS with it enabled, so the effect just got even stronger. That was
with CFQ. With deadline the two are now pretty close, but
synchronous_commit disabled is still slower: 690 to 727.

Values are averages of 3 runs each, with DROP/CREATE DATABASE and
CHECKPOINT; before each run.

--
Best regards,
Hannes Dorbath