On Tue, Aug 4, 2020 at 6:02 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> ... speedup of around 6% ...
I did some better testing. OS: Linux, storage: consumer SSD. I
repeatedly ran crash recovery on 3.3GB worth of WAL generated with 8M
pgbench transactions. I tested 3 different builds 7 times each and
used "ministat" to compare the recovery times. It told me that:
* Master is around 11% faster than it was last week, before commit
c5315f4f "Cache smgrnblocks() results in recovery."
* This patch gives a similar speedup on top of that, bringing the total
to around 25% faster than last week (the recovery time is ~20% less,
i.e. the WAL processing speed is ~1.25x).
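To make the "25% faster" versus "20% less time" distinction concrete, here is a small sketch that redoes the arithmetic from the mean recovery times in the ministat output in this mail. The function and variable names are made up for illustration; only the three averages come from the measurements.

```python
# Mean recovery times in seconds, copied from the ministat output in this mail.
avg_seconds = {
    "patched": 39.134857,
    "master": 43.815286,
    "lastweek": 48.904286,
}


def speedup(baseline: float, candidate: float) -> float:
    """WAL processing speed ratio: how many times faster the candidate is."""
    return baseline / candidate


def time_saved(baseline: float, candidate: float) -> float:
    """Fraction of the baseline run time that the candidate no longer needs."""
    return 1.0 - candidate / baseline


# Compare both newer builds against last week's build.
for name in ("master", "patched"):
    s = speedup(avg_seconds["lastweek"], avg_seconds[name])
    t = time_saved(avg_seconds["lastweek"], avg_seconds[name])
    print(f"{name} vs lastweek: {s:.3f}x speed, {t:.1%} less time")
```

The two figures describe the same change from different directions: a speed ratio of r corresponds to a time reduction of 1 - 1/r, so 1.25x is 20% less time.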
My test fit in RAM and was all cached. With the patch, the recovery
process used 100% of a single core the whole time and stayed on that
core, and the variance was low. In the other builds, CPU usage hovered
around 90%, the process hopped between cores as it kept getting
rescheduled, and the variance was higher.
Of course, SLRU fsyncs aren't the only I/O stalls in a real system;
among others, there are also reads from faulting in referenced pages
that don't have full page images in the WAL. I'm working on that
separately, but that's a tad more complicated than this stuff.
Added to commit fest.
=== ministat output showing recovery times in seconds ===
x patched.dat
+ master.dat
* lastweek.dat
[ASCII distribution plot: the patched (x) runs cluster tightly at the
low end, master (+) in the middle, lastweek (*) at the high end;
A = average, M = median]
    N           Min           Max        Median           Avg        Stddev
x   7        38.655        39.406        39.218     39.134857    0.25188849
+   7        42.128        45.068        43.958     43.815286    0.91387758
Difference at 95.0% confidence
        4.68043 +/- 0.780722
        11.9597% +/- 1.99495%
        (Student's t, pooled s = 0.670306)
*   7        47.187        49.404        49.203     48.904286    0.76793483
Difference at 95.0% confidence
        9.76943 +/- 0.665613
        24.9635% +/- 1.70082%
        (Student's t, pooled s = 0.571477)
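For anyone who wants to check the "Difference at 95.0% confidence" lines by hand, the following is a rough sketch of the usual two-sample Student's t calculation with a pooled standard deviation, fed with the N/Avg/Stddev values above. It is not ministat's source code; the function name is invented here, and the critical value 2.179 (two-sided 95%, 12 degrees of freedom) is hard-coded as an assumption rather than computed.

```python
import math

# Two-sided 95% Student's t critical value for 7 + 7 - 2 = 12 degrees of
# freedom. Hard-coded for this sketch; a statistics library could compute it.
T_CRIT_DF12 = 2.179


def pooled_difference(n1, avg1, sd1, n2, avg2, sd2, t_crit):
    """Return (difference, half-width of the confidence interval, pooled s)."""
    pooled_s = math.sqrt(
        ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    )
    diff = avg2 - avg1
    half_width = t_crit * pooled_s * math.sqrt(1 / n1 + 1 / n2)
    return diff, half_width, pooled_s


patched = (7, 39.134857, 0.25188849)
others = {
    "master": (7, 43.815286, 0.91387758),
    "lastweek": (7, 48.904286, 0.76793483),
}

# Each build is compared against the patched build, as in the output above.
for name, (n, avg, sd) in others.items():
    diff, half, s = pooled_difference(*patched, n, avg, sd, T_CRIT_DF12)
    pct = 100 * diff / patched[1]
    print(f"{name}: {diff:.5f} +/- {half:.6f} ({pct:.4f}%), pooled s = {s:.6f}")
```

The percentages are relative to the patched average, which is why the lastweek row reads as about 25% more time than patched rather than 20%.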