Re: what to revert - Mailing list pgsql-hackers
From | Kevin Grittner
---|---
Subject | Re: what to revert
Date |
Msg-id | CACjxUsPgmm+LLG1+3d56EhCD8yEKP_b14zHGFOUpJp0Qx-J2pw@mail.gmail.com
In response to | Re: what to revert (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses | Re: what to revert, Re: what to revert
List | pgsql-hackers
<div dir="ltr">On Mon, May 9, 2016 at 9:01 PM, Tomas Vondra <<a href="mailto:tomas.vondra@2ndquadrant.com">tomas.vondra@2ndquadrant.com</a>>wrote:<br /><br />> Over the past few daysI've been running benchmarks on a fairly<br />> large NUMA box (4 sockets, 32 cores / 64 with HR, 256GB of RAM)<br/>> to see the impact of the 'snapshot too old' - both when disabled<br />> and enabled with various valuesin the old_snapshot_threshold<br />> GUC.<br /><br />Thanks!<br /><br />> The benchmark is a simple read-onlypgbench with prepared<br />> statements, i.e. doing something like this:<br />><br />> pgbench -S -Mprepared -j N -c N<br /><br />Do you have any plans to benchmark cases where the patch can have a<br />benefit? (Clearly,nobody would be interested in using the feature<br />with a read-only load; so while that makes a good "worst case"<br/>scenario and is very valuable for testing the "off" versus<br />"reverted" comparison, it's not an intended useor one that's<br />likely to happen in production.)<br /><br />> master-10-new - 91fd1df4 + old_snapshot_threshold=10<br/>> master-10-new-2 - 91fd1df4 + old_snapshot_threshold=10 (rerun)<br /><br />So, these runswere with identical software on the same data? Any<br />differences are just noise?<br /><br />> * The results area bit noisy, but I think in general this shows<br />> that for certain cases there's a clearly measurable difference<br/>> (up to 5%) between the "disabled" and "reverted" cases. This is<br />> particularly visible on thesmallest data set.<br /><br />In some cases, the differences are in favor of disabled over<br />reverted.<br /><br />>* What's fairly strange is that on the largest dataset (scale<br />> 10000), the "disabled" case is actually consistentlyfaster than<br />> "reverted" - that seems a bit suspicious, I think. It's possible<br />> that I did therevert wrong, though - the revert.patch is<br />> included in the tgz. This is why I also tested 689f9a05, but<br />>that's also slower than "disabled".<br /><br />Since there is not a consistent win of disabled or reverted over<br/>the other, and what difference there is is often far less than the<br />difference between the two runs with identicalsoftware, is there<br />any reasonable interpretation of this except that the difference is<br />"in the noise"?<br/><br />> * The performance impact with the feature enabled seems rather<br />> significant, especially onceyou exceed the number of physical<br />> cores (32 in this case). Then the drop is pretty clear - often<br />>~50% or more.<br />><br />> * 7e3da1c4 claims to bring the performance within 5% of the<br />> disabled case,but that seems not to be the case.<br /><br />The commit comment says "At least in the tested case this brings<br />performancewithin 5% of when the feature is off, compared to<br />several times slower without this patch." The testedcase was a<br />read-write load, so your read-only tests do nothing to determine<br />whether this was the case ingeneral for this type of load.<br />Partly, the patch decreases chasing through HOT chains and<br />increases the numberof HOT updates, so there are compensating<br />benefits of performing early vacuum in a read-write load.<br /><br />>What it however does is bringing the 'non-immediate' cases close<br />> to the immediate ones (before the performancedrop came much<br />> sooner in these cases - at 16 clients).<br /><br />Right. 
> master-10-new   - 91fd1df4 + old_snapshot_threshold=10
> master-10-new-2 - 91fd1df4 + old_snapshot_threshold=10 (rerun)

So, these runs were with identical software on the same data? Any
differences are just noise?

> * The results are a bit noisy, but I think in general this shows
> that for certain cases there's a clearly measurable difference
> (up to 5%) between the "disabled" and "reverted" cases. This is
> particularly visible on the smallest data set.

In some cases, the differences are in favor of disabled over
reverted.

> * What's fairly strange is that on the largest dataset (scale
> 10000), the "disabled" case is actually consistently faster than
> "reverted" - that seems a bit suspicious, I think. It's possible
> that I did the revert wrong, though - the revert.patch is
> included in the tgz. This is why I also tested 689f9a05, but
> that's also slower than "disabled".

Since there is not a consistent win of disabled or reverted over
the other, and what difference there is is often far less than the
difference between the two runs with identical software, is there
any reasonable interpretation of this except that the difference is
"in the noise"?

> * The performance impact with the feature enabled seems rather
> significant, especially once you exceed the number of physical
> cores (32 in this case). Then the drop is pretty clear - often
> ~50% or more.
>
> * 7e3da1c4 claims to bring the performance within 5% of the
> disabled case, but that seems not to be the case.

The commit message says "At least in the tested case this brings
performance within 5% of when the feature is off, compared to
several times slower without this patch." The tested case was a
read-write load, so your read-only tests do nothing to determine
whether that holds in general for that type of load. In part, the
patch decreases chasing through HOT chains and increases the number
of HOT updates, so performing early vacuum has compensating
benefits in a read-write load.

> What it does, however, is bring the 'non-immediate' cases close
> to the immediate ones (before, the performance drop came much
> sooner in these cases - at 16 clients).

Right. This is, of course, just the first optimization that we
were able to get in "under the wire" before beta, but the other
optimizations under consideration would only tend to bring the
"enabled" cases closer together in performance, not make an enabled
case perform the same as when the feature was off -- especially for
a read-only workload.

> * It also seems to me the feature greatly amplifies the
> variability of the results, somehow. It's not uncommon to see
> results like this:
>
> master-10-new-2   235516  331976  133316  155563  133396
>
> where after the first runs (already fairly variable) the
> performance tanks to ~50%. This happens particularly with higher
> client counts; otherwise the max-min is within ~5% of the max.
> There are a few cases where this happens without the feature
> (i.e. old master, reverted or disabled), but it's usually much
> smaller than with it enabled (immediate, 10 or 60). See the
> 'summary' sheet in the ODS spreadsheet.
>
> I don't know what's the problem here - at first I thought that
> maybe something else was running on the machine, or that
> anti-wraparound autovacuum kicked in, but that seems not to be
> the case. There's nothing like that in the postgres log (also
> included in the .tgz).

I'm inclined to suspect NUMA effects. It would be interesting to
try with the NUMA patch and cpuset I submitted a while back, or
with fixes in place for the Linux scheduler bugs which were
reported last month. Which kernel version was this?
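As a quick check, something like this would show the kernel version
and NUMA topology, and pinning the server to a single socket would
show whether the cross-node variability disappears (an untested
sketch; node 0 is just an example, and binding to one node of
course also reduces the available cores):

    # kernel version and NUMA layout
    uname -r
    numactl --hardware

    # run the server with CPUs and memory confined to one node
    numactl --cpunodebind=0 --membind=0 pg_ctl start -D "$PGDATA"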
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company