Latch for the WAL writer - further reducing idle wake-ups. - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Latch for the WAL writer - further reducing idle wake-ups. |
Date | |
Msg-id | CAEYLb_U7S+Z8JOnEOY31w9Hcz-SavxrK4TTMRn4d1M+NxYEy+Q@mail.gmail.com |
Responses |
Re: Latch for the WAL writer - further reducing idle wake-ups.
|
List | pgsql-hackers |
Attached patch latches up the WAL Writer, reducing wake-ups and thus saving electricity in a way that is more-or-less analogous to my work on the BGWriter:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=6d90eaaa89a007e0d365f49d6436f35d2392cfeb

I am hoping this gets into 9.2. I am conscious of the fact that this is quite late, but the patch addresses an open item, the concluding part of a much wider feature. In any case, it is a useful patch that ought to be committed at some point. I should point out:

1. This functionality was covered by the group commit patch that I worked on back in January, which was submitted in advance of the commitfest deadline. However, an alternative implementation was ultimately committed that did not consider WAL Writer wake-ups.

2. The WAL Writer is the most important auxiliary process to latch up. Though it is tied with the BGWriter at 5 wake-ups per second by default, I consider the WAL Writer to be more important than the BGWriter because I find it much more plausible that the WAL Writer really won't need to be around for much of the time, as with a read-mostly workload. "Cloud" type deployments often have read-mostly workloads, so we can still save some power even if the DB is actually servicing lots of read queries. That being the case, it would be a shame if we didn't get this last one in, as it adds a lot more value than any of the other patches.

3. This is a fairly simple patch; as I've said, it works in a way that is quite analogous to the BGWriter patch, applying lessons learned there.

With this patch, my instrumentation shows that wake-ups when Postgres reaches a fully idle state are just 2.7 per second for the entire postgres process group, quite an improvement on the 7.6 per second in HEAD. This is exactly what you'd expect from a reduction of 5 wake-ups per second to 0.1 per second on average for the WAL Writer (7.6 - 5 + 0.1 = 2.7). I have determined this with PowerTOP 1.13 on my Fedora 16 laptop.
Here is an example session, begun after the cluster reached a fully idle state, with this patch applied (if, alternatively, I want to see things at per-process granularity, I can get that from PowerTOP 1.98 beta 1, which is available from my system's package manager):

[peter@peterlaptop powertop-1.13]$ sudo ./powertop -d --time=300
[sudo] password for peter:
PowerTOP 1.13   (C) 2007 - 2010 Intel Corporation

Collecting data for 300 seconds

Cn                Avg residency
C0 (cpu running)        ( 2.8%)
polling           0.0ms ( 0.0%)
C1 mwait          0.5ms ( 1.0%)
C2 mwait          0.9ms ( 0.6%)
C3 mwait          1.4ms ( 0.1%)
C4 mwait          6.7ms (95.4%)
P-states (frequencies)
  2.61 Ghz     5.7%
  1.80 Ghz     0.1%
  1200 Mhz     0.1%
  1000 Mhz     0.2%
   800 Mhz    93.5%
Wakeups-from-idle per second : 171.3    interval: 300.0s
no ACPI power usage estimate available
Top causes for wakeups:
  23.0% (134.5)   chrome
***SNIP***
   0.5% (  2.7)   postgres
***SNIP***

This is a rather low number, one that will make us really competitive with other RDBMSs in this area. Recall that we started from 11.5 wake-ups for an idle Postgres cluster with a default configuration.

To put the 2.7 number in context, I measured MySQL's wake-ups at 2.2 last year (mysql-server version 5.1.56, Fedora 14), though I subsequently saw much higher numbers (over 20 per second) for version 5.5.19 on Fedora 16, so you should probably take that with a grain of salt - I don't know anything about MySQL, and so cannot really be sure that I'm making an objective comparison in comparing that number with the number of wake-ups Postgres has with a stock postgresql.conf.

I've employed the same trick used when a buffer is dirtied for the BGWriter - most of the time, the SetLatch() calls will check a single flag, and find it already set. We are careful to only "arm" the latch with a call to ResetLatch() when it is really needed. Rather than waiting for the clocksweep to be lapped, we wait for a set number of iterations of consistent inactivity.
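As a rough illustration of the "arm only when needed" trick described above, here is a minimal single-threaded simulation in C. It is a sketch only, not code from the patch: SimLatch, WalWriterSim, the sim_* functions and the threshold of 50 idle iterations are all hypothetical names and values chosen for the example, and the two delays are simply assumed from the 5 and 0.1 wake-ups per second figures quoted earlier. It does not use PostgreSQL's real latch API and ignores concurrency entirely.

```c
#include <stdbool.h>

/* All names and constants below are hypothetical, for illustration only. */
#define LOOPS_UNTIL_HIBERNATE 50     /* assumed: idle iterations before arming */
#define NORMAL_DELAY_MS       200    /* assumed: 5 wake-ups per second */
#define HIBERNATE_DELAY_MS    10000  /* assumed: 0.1 wake-ups per second */

typedef struct SimLatch
{
    bool is_set;        /* the single flag that the hot path checks */
    int  wakeups_sent;  /* counts the expensive "really wake the sleeper" case */
} SimLatch;

typedef struct WalWriterSim
{
    SimLatch latch;
    int      idle_loops;    /* consecutive iterations that found no work */
    int      pending_work;  /* stand-in for unflushed WAL */
} WalWriterSim;

void sim_init(WalWriterSim *w)
{
    w->latch.is_set = true;  /* an active writer leaves its latch set */
    w->latch.wakeups_sent = 0;
    w->idle_loops = 0;
    w->pending_work = 0;
}

/* Cheap when the flag is already set, which is the common case. */
void sim_set_latch(SimLatch *l)
{
    if (l->is_set)
        return;
    l->is_set = true;
    l->wakeups_sent++;
}

void sim_reset_latch(SimLatch *l)
{
    l->is_set = false;
}

/* Hot path: a backend has inserted WAL and pokes the writer. */
void sim_backend_insert(WalWriterSim *w)
{
    w->pending_work++;
    sim_set_latch(&w->latch);
}

/* One writer loop iteration; returns how long the writer would sleep next. */
int sim_walwriter_iteration(WalWriterSim *w)
{
    if (w->pending_work > 0)
    {
        w->pending_work = 0;  /* pretend to flush */
        w->idle_loops = 0;
    }
    else if (w->idle_loops < LOOPS_UNTIL_HIBERNATE)
        w->idle_loops++;

    if (w->idle_loops < LOOPS_UNTIL_HIBERNATE)
        return NORMAL_DELAY_MS;  /* latch stays set, so SetLatch stays cheap */

    /* Consistently idle: arm the latch so the next insert wakes us. */
    sim_reset_latch(&w->latch);
    return HIBERNATE_DELAY_MS;
}
```

In this sketch the writer only pays for a real wake-up after it has gone into hibernation; while it is active, every sim_set_latch() call returns after testing one flag. A real implementation would also have to recheck for work after arming the latch, to close the race with a concurrent inserter, which a single-threaded simulation cannot show.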
I've made the WAL Writer use its process latch, rather than the latch that was previously within XLogCtl. This seems much more idiomatic, as in doing so we reserve the right to register generic signal handlers. With a non-process latch, we'd have to worry about signal invalidation issues on an ongoing basis, since the handler wouldn't be calling SetLatch() against the latch we waited on. I have also added a comment in latch.h generally advising against ad-hoc shared latches where .

I took initial steps to quantify the performance hit from this patch. A simple "insert.sql" pgbench-tools benchmark on my laptop, with a generic configuration, showed no problems, though I do not assume that this conclusively proves the case. Results:

http://walwriterlatch.staticloud.com/

My choice of XLogInsert() as an additional site at which to call SetLatch() was not one I made lightly, and frankly I'm not entirely confident that I couldn't have been just as effective while placing the SetLatch() call in a less hot, perhaps higher-level codepath. That said, MarkBufferDirty() is also a very hot code path, and it's where one of the SetLatch() calls goes in the earlier BGWriter patch. Besides, I haven't been able to quantify any performance hit as yet.

Thoughts?

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services