Hello Tomas,
Thanks again for these interesting benchmarks.
> Overall, this means ~300M transactions in total for the un-throttled case, so
> sample with ~15M transactions available when computing the following charts.
Still a very sizable run!
> There results (including scripts for generating the charts) are here:
>
> https://github.com/tvondra/flushing-benchmark-2
This repository seems empty.
> 1) regular-latency.png
I'm wondering whether it would be clearer if the percentiles were
relative to the largest sample rather than to each run's own sample, so
that the figures for the largest one would still be between 0 and 1, but
the other (unpatched) one would go between 0 and 0.85, that is, would be
cut short proportionally to the actual performance.
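
To make the suggestion concrete, here is a minimal sketch of the
normalization I have in mind. It assumes both runs lasted the same time,
so that the transaction count reflects throughput. The function name and
the latency values are made up for illustration; they are not taken from
the benchmark data.

```python
from bisect import bisect_right


def scaled_cdf(latencies, reference_count):
    """Return f(x) = (number of latencies <= x) / reference_count.

    Dividing by the size of the largest sample instead of the sample's
    own size makes the curve of the slower run top out below 1.
    """
    ordered = sorted(latencies)

    def cdf(x):
        return bisect_right(ordered, x) / reference_count

    return cdf


# Hypothetical latencies in ms, collected over the same wall-clock time.
patched = [1.2, 1.5, 2.0, 2.4, 3.0, 3.5, 4.0, 5.0, 6.0, 8.0]  # 10 transactions
unpatched = [1.0, 1.3, 1.8, 2.2, 3.2, 4.5, 7.0, 12.0]         # 8 transactions

# Both curves share the same denominator: the size of the largest sample.
reference = max(len(patched), len(unpatched))
cdf_patched = scaled_cdf(patched, reference)
cdf_unpatched = scaled_cdf(unpatched, reference)

# Plateau of each curve: the largest run reaches 1, the other stops short.
top_patched = cdf_patched(float("inf"))
top_unpatched = cdf_unpatched(float("inf"))
print(top_patched, top_unpatched)
```

With this scaling, the height of each curve at a given latency is
proportional to the number of transactions completed within that
latency, so the two curves can be compared directly.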
> The two curves intersect at ~4ms, where both CDF reach ~85%. For the
> shorter transactions, the old code is slightly faster (i.e. apparently
> there's some per-transaction overhead).
I'm not sure how meaningful the crossing is, because the two curves do
not reflect the same performance. I think that they may not cross at all
if both are normalized against the same reference, i.e. the better run.
> 2) throttled-latency.png
>
> In the throttled case (i.e. when the system is not 100% utilized, so it's
> more representative of actual production use), the difference is quite
> clearly in favor of the new code.
Indeed, it is a no-brainer.
> 3) throttled-schedule-lag.png
>
> Mostly just an alternative view on the previous chart, showing how much later
> the transactions were scheduled. Again, the new code is winning.
No-brainer again. I infer from this figure that with the initial version
60% of transactions have trouble being processed on time, while this is
maybe about 35% with the new version.
--
Fabien.