Re: [HACKERS] kqueue - Mailing list pgsql-hackers

From Mateusz Guzik
Subject Re: [HACKERS] kqueue
Date
Msg-id CAGudoHGc-7SnrsDRrV5yhzBhNVS=vmGjM4TrRQqAy8eUsqnRbQ@mail.gmail.com
In response to Re: [HACKERS] kqueue  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: [HACKERS] kqueue  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On Mon, May 21, 2018 at 9:03 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
On Wed, Apr 11, 2018 at 1:05 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> I heard through the grapevine of some people currently investigating
> performance problems on busy FreeBSD systems, possibly related to the
> postmaster pipe.  I suspect this patch might be a part of the solution
> (other patches probably needed to get maximum value out of this patch:
> reuse WaitEventSet objects in some key places, and get rid of high
> frequency PostmasterIsAlive() read() calls).  The autoconf-fu in the
> last version bit-rotted so it seemed like a good time to post a
> rebased patch.


Hi everyone,

I have benchmarked the change on a FreeBSD box and found a big
performance win once the number of clients goes beyond the number of
hardware threads on the target machine. For smaller numbers of clients
the win was very modest.

The test was performed a few weeks ago.

For convenience, PostgreSQL 10.3 as found in the ports tree was used.

3 variants were tested:
- stock 10.3
- stock 10.3 + pdeathsig
- stock 10.3 + pdeathsig + kqueue

Appropriate patches were provided by Thomas.

In order to keep this message PG-13 I'm not going to show the actual
script, but a mere outline:

for i in $(seq 1 10); do
        for t in vanilla pdeathsig pdeathsig_kqueue; do
                # start up the relevant version
                for c in 32 64 96; do
                        pgbench -j 96 -c $c -T 120 -M prepared -S -U bench -h 172.16.0.2 -P1 bench > ${t}-${c}-out-warmup 2>&1
                        pgbench -j 96 -c $c -T 120 -M prepared -S -U bench -h 172.16.0.2 -P1 bench > ${t}-${c}-out 2>&1
                done
                # shut down the relevant version
        done
done

Data from the warmup is not used. All the data was pre-read prior to the
test.

PostgreSQL was configured with 32GB of shared buffers and 200 max
connections; everything else was left at the defaults.
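
For reference, the two non-default settings described above would
correspond to a postgresql.conf fragment along these lines. This is a
sketch of the stated values only, not the actual file used for the test.

```
# Values as stated in this message; all other settings left at defaults.
shared_buffers = 32GB
max_connections = 200
```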

The server is:
Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz
2 package(s) x 8 core(s) x 2 hardware threads

i.e. 32 threads in total.

running FreeBSD -head with 'options NUMA' in kernel config and
sysctl net.inet.tcp.per_cpu_timers=1 on top of zfs.

The load was generated from a different box over a 100Gbit ethernet link.

x cumulative-tps-vanilla-32
+ cumulative-tps-pdeathsig-32
* cumulative-tps-pdeathsig_kqueue-32
+------------------------------------------------------------------------+
|+   + x+*     x+  *  x       *        + * *       * * **  *  **        *|
|   |_____|__M_A___M_A_____|____|             |________MA________|       |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10     442898.77     448476.81     444805.17     445062.08     1679.7169
+  10      442057.2     447835.46     443840.28     444235.01     1771.2254
No difference proven at 95.0% confidence
*  10     448138.07     452786.41     450274.56     450311.51     1387.2927
Difference at 95.0% confidence
        5249.43 +/- 1447.41
        1.17948% +/- 0.327501%
        (Student's t, pooled s = 1540.46)
x cumulative-tps-vanilla-64
+ cumulative-tps-pdeathsig-64
* cumulative-tps-pdeathsig_kqueue-64
+------------------------------------------------------------------------+
|                                                                     ** |
|                                                                     ** |
|  xx  x +                                                            ***|
|++**x *+*++                                                          ***|
|  ||_A|M_|                                                           |A |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10     411849.26      422145.5     416043.77      416061.9     3763.2545
+  10     407123.74     425727.84     419908.73      417480.7     6817.5549
No difference proven at 95.0% confidence
*  10     542032.71     546106.93     543948.05     543874.06     1234.1788
Difference at 95.0% confidence
        127812 +/- 2631.31
        30.7195% +/- 0.809892%
        (Student's t, pooled s = 2800.47)
x cumulative-tps-vanilla-96
+ cumulative-tps-pdeathsig-96
* cumulative-tps-pdeathsig_kqueue-96
+------------------------------------------------------------------------+
|                                                                      * |
|                                                                      * |
|                                                                      * |
|                                                                      * |
|  + x                                                                 * |
|  *xxx+                                                               **|
|+ *****+                                                            * **|
|  |MA||                                                              |A||
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10      325263.7        336338     332399.16     331321.82     3571.2478
+  10     321213.33     338669.66     329553.78     330903.58      5652.008
No difference proven at 95.0% confidence
*  10     503877.22     511449.96     508708.41     508808.51     2016.9483
Difference at 95.0% confidence
        177487 +/- 2724.98
        53.5693% +/- 1.17178%
        (Student's t, pooled s = 2900.16)
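
As a sanity check, the "Difference at 95.0% confidence" lines can be
recomputed from the summary rows alone. The sketch below assumes an
equal-N pooled-variance Student's t comparison (N = 10 per variant, so
18 degrees of freedom) and hard-codes the two-sided 95% critical value
as 2.101, taken from standard t tables. The function name summarize is
made up for this illustration.

```shell
#!/bin/sh
# Arguments: baseline avg, baseline stddev, patched avg, patched stddev.
# Prints: absolute difference +/- CI half-width (relative difference).
summarize() {
    awk -v a="$1" -v sa="$2" -v b="$3" -v sb="$4" 'BEGIN {
        n = 10                                   # runs per variant
        t95 = 2.101                              # assumed t value, 18 d.o.f.
        pooled = sqrt((sa * sa + sb * sb) / 2)   # pooled stddev for equal N
        half = t95 * pooled * sqrt(2 / n)        # confidence half-width
        printf "%.2f +/- %.2f (%.2f%%)\n", b - a, half, 100 * (b - a) / a
    }'
}

# vanilla vs. pdeathsig + kqueue, numbers copied from the tables above
summarize 445062.08 1679.7169 450311.51 1387.2927   # 32 clients
summarize 416061.9  3763.2545 543874.06 1234.1788   # 64 clients
summarize 331321.82 3571.2478 508808.51 2016.9483   # 96 clients
```

The three lines printed should agree with the ministat figures above to
within rounding, which also shows the percentages are relative to the
vanilla average.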


--
Mateusz Guzik <mjguzik gmail.com>
