Re: Division in dynahash.c due to HASH_FFACTOR - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Division in dynahash.c due to HASH_FFACTOR
Date
Msg-id CA+hUKG+67j4fUiuSP2d8U5r0gROBpx=iPgUQd-9DT=y-zv-pCA@mail.gmail.com
In response to Re: Division in dynahash.c due to HASH_FFACTOR  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Division in dynahash.c due to HASH_FFACTOR  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Mon, Sep 14, 2020 at 11:35 PM David Rowley <dgrowleyml@gmail.com> wrote:
> I just did some benchmarking with this patch using the same recovery
> benchmark that I used in [1] and also the two patches that I posted in
> [2]. Additionally, I added a PANIC at the end of recovery so that I
> could repeat the recovery over and over again with the same WAL.
>
> [data]

    N           Min           Max        Median           Avg        Stddev
x  10         62.15         67.06         64.86        64.132     1.6188528
+  10          59.6         63.81         63.13        62.233     1.4983031
Difference at 95.0% confidence
    -1.899 +/- 1.46553
    -2.96108% +/- 2.28517%
    (Student's t, pooled s = 1.55974)

Thanks!  Hmm, small but apparently significant, and in line with
Jakub's report.  I suppose the relative effect will be greater once
the other nearby recovery performance patches that halve the times
are applied.  Annoyingly, I can't reproduce this speedup on my local
i9-9900; maybe it requires a different CPU...

> I looked over the patch and the only thing I saw was that we might
> also want to remove the following line:
>
> #define DEF_FFACTOR    1 /* default fill factor */

Right, thanks.  Fixed in the attached.
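
To illustrate why a fill factor constant stops being needed once the
fill factor is fixed at 1, here is a rough sketch.  It is not the
attached patch and not the real dynahash.c code: the struct and
function names are invented, and it only assumes a non-negative entry
count and a positive bucket count.  Under those assumptions the
division-based growth check and the plain comparison give the same
answer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a hash table header; not the dynahash.c struct. */
typedef struct ToyHashHeader
{
    long        nentries;   /* entries currently stored, >= 0 */
    long        nbuckets;   /* buckets currently in use, > 0 */
    long        ffactor;    /* target entries per bucket (the old tunable) */
} ToyHashHeader;

/* Growth check with a run-time fill factor: needs an integer division. */
static bool
needs_expand_div(const ToyHashHeader *h)
{
    assert(h->nbuckets > 0);
    return h->nentries / h->nbuckets >= h->ffactor;
}

/* The same check with the fill factor fixed at 1: a plain comparison. */
static bool
needs_expand_cmp(const ToyHashHeader *h)
{
    assert(h->nbuckets > 0);
    return h->nentries >= h->nbuckets;
}
```

For ffactor = 1 the two functions agree for every non-negative
nentries and positive nbuckets, because the truncated quotient is at
least 1 exactly when nentries >= nbuckets.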

> The 2nd most costly call to hash_search_with_hash_value() came in via
> hash_search() via smgropen(). That does use HASH_ENTER, which could
> have triggered the divide code. The main caller of smgropen() was
> XLogReadBufferExtended().
>
> So, it looks unlikely that any gains we are seeing are from improved
> buffer lookups. It's more likely they're coming from more optimal
> XLogReadBufferExtended()

I think we call smgropen() twice for every buffer referenced in the
WAL: once in XLogReadBufferExtended() and again in
ReadBufferWithoutRelcache().  We could reduce it to once with some
refactoring, but I am looking into whether I can reduce it to zero as
a side-effect of another change, more soon...
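
As a toy model of the "twice versus once" point, the sketch below
counts lookups when an outer and an inner function each open the same
relation, and again when the outer function passes its handle down.
Every name here is hypothetical; none of this is PostgreSQL code, and
it says nothing about how the real refactoring would look.

```c
#include <assert.h>

/* Hypothetical stand-ins; none of these are PostgreSQL functions. */
static int  toy_lookup_count = 0;

typedef struct ToyRelHandle
{
    int         relid;
} ToyRelHandle;

/* Pretend that every open costs one hash table lookup. */
static ToyRelHandle
toy_open(int relid)
{
    ToyRelHandle h = {relid};

    toy_lookup_count++;
    return h;
}

/* Inner reader that takes an already-open handle: no lookup of its own. */
static int
toy_read_with_handle(ToyRelHandle h, int blkno)
{
    return h.relid * 1000 + blkno;
}

/* Inner reader that opens the relation itself: one lookup. */
static int
toy_read_inner(int relid, int blkno)
{
    return toy_read_with_handle(toy_open(relid), blkno);
}

/* Outer opens for its own purposes, then the inner opens again. */
static int
toy_read_twice(int relid, int blkno)
{
    ToyRelHandle h = toy_open(relid);

    (void) h;
    return toy_read_inner(relid, blkno);
}

/* Refactored shape: the outer opens once and passes the handle down. */
static int
toy_read_once(int relid, int blkno)
{
    ToyRelHandle h = toy_open(relid);

    return toy_read_with_handle(h, blkno);
}
```

Both variants return the same value for the same arguments; only the
number of toy_open() calls per buffer differs (two versus one).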

