On Mon, Sep 14, 2020 at 11:35 PM David Rowley <dgrowleyml@gmail.com> wrote:
> I just did some benchmarking with this patch using the same recovery
> benchmark that I used in [1] and also the two patches that I posted in
> [2]. Additionally, I added a PANIC at the end of recovery so that I
> could repeat the recovery over and over again with the same WAL.
>
>          N           Min           Max        Median           Avg        Stddev
> x       10         62.15         67.06         64.86        64.132     1.6188528
> +       10          59.6         63.81         63.13        62.233     1.4983031
> Difference at 95.0% confidence
>         -1.899 +/- 1.46553
>         -2.96108% +/- 2.28517%
>         (Student's t, pooled s = 1.55974)
Thanks! Hmm, that's small but apparently significant, and in line with
Jakub's report. I suppose the effect will be proportionally greater
with the other nearby recovery performance patches applied, since
those halve the times. Annoyingly, I can't reproduce this speedup on
my local i9-9900; maybe it requires a different CPU...
> I looked over the patch and the only thing I saw was that we might
> also want to remove the following line:
>
> #define DEF_FFACTOR 1 /* default fill factor */
Right, thanks. Fixed in the attached.
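
For anyone following along, the division in question is in
dynahash.c's insert-time growth check. A rough sketch of the change
(field spellings approximate; some of the surrounding conditions
omitted):

    /*
     * Before: a run-time fill factor means deciding whether to grow
     * the table costs an integer division on every HASH_ENTER.
     */
    if (hctl->nentries / (long) (hctl->max_bucket + 1) >= hctl->ffactor)
        (void) expand_table(hashp);

    /*
     * After: with the fill factor hard-wired to 1, the same decision
     * is a plain comparison, and ffactor/DEF_FFACTOR can be removed
     * entirely.
     */
    if (hctl->nentries > (long) hctl->max_bucket)
        (void) expand_table(hashp);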
> The 2nd most costly call to hash_search_with_hash_value() came in via
> hash_search() via smgropen(). That does use HASH_ENTER, which could
> have triggered the divide code. The main caller of smgropen() was
> XLogReadBufferExtended().
>
> So, it looks unlikely that any gains we are seeing are from improved
> buffer lookups. It's more likely they're coming from a more optimal
> XLogReadBufferExtended().
I think we call smgropen() twice for every buffer referenced in the
WAL: once in XLogReadBufferExtended() and again in
ReadBufferWithoutRelcache(). We could reduce that to once with some
refactoring, but I am looking into whether I can reduce it to zero as
a side-effect of another change; more soon...
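
To make that concrete, the call pattern is roughly this (bodies
elided; a sketch based on the current sources, not the exact code):

    Buffer
    XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
                           BlockNumber blkno, ReadBufferMode mode)
    {
        SMgrRelation smgr = smgropen(rnode, InvalidBackendId); /* lookup #1 */

        /* ... smgrcreate()/smgrnblocks() handling ... */

        return ReadBufferWithoutRelcache(rnode, forknum, blkno, mode, NULL);
    }

    Buffer
    ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
                              BlockNumber blockNum, ReadBufferMode mode,
                              BufferAccessStrategy strategy)
    {
        SMgrRelation smgr = smgropen(rnode, InvalidBackendId); /* lookup #2 */

        /* ... hands smgr to ReadBuffer_common() ... */
    }

The obvious refactoring would be for the first function to pass its
SMgrRelation down, rather than having the second re-derive it from the
RelFileNode with another hash probe.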