Re: WIP: "More fair" LWLocks - Mailing list pgsql-hackers

From: Dmitry Dolgov
Subject: Re: WIP: "More fair" LWLocks
Date:
Msg-id: CA+q6zcUn8ezTSwqtTMawa0BxgXdwWVY8E2EBQ-4V1djmqpPORw@mail.gmail.com
In response to: Re: WIP: "More fair" LWLocks (Dmitry Dolgov <9erthalion6@gmail.com>)
Responses: Re: WIP: "More fair" LWLocks
List: pgsql-hackers
> On Mon, 13 Aug 2018 at 17:36, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
>
> 2) lwlock-fair-2.patch
> A new flag LW_FLAG_FAIR is introduced.  This flag is set when the first
> shared locker in a row releases the lock.  When LW_FLAG_FAIR is set
> and there is already somebody in the queue, then a new shared locker goes
> to the queue.  Basically it means that the first shared locker "holds the
> door" for other shared lockers to get in without queueing.
>
> I ran pgbench (read-write and read-only benchmarks) on an Amazon
> c5d.18xlarge virtual machine, which has 72 vCPUs (approximately the
> same power as 36 physical cores).  The results are attached
> (lwlock-fair-ro.png and lwlock-fair-rw.png).
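
Just to check that I'm reading the proposal correctly, here is the gating
condition for shared lockers as I understand it (a toy sketch, not the actual
patch; the bit chosen for LW_FLAG_FAIR below is my guess):

#include <stdbool.h>
#include <stdint.h>

/* Existing flag in lwlock.c (bit position as I remember it) */
#define LW_FLAG_HAS_WAITERS ((uint32_t) 1 << 30)
/* Hypothetical bit for the new flag; the patch may use a different one */
#define LW_FLAG_FAIR        ((uint32_t) 1 << 27)

/*
 * Once the first shared locker in a row has released the lock (setting
 * LW_FLAG_FAIR), any new shared locker that sees both the flag and a
 * non-empty wait list joins the queue instead of jumping ahead of the
 * already queued waiters.
 */
static inline bool
shared_locker_must_queue(uint32_t state)
{
    return (state & LW_FLAG_FAIR) != 0 &&
           (state & LW_FLAG_HAS_WAITERS) != 0;
}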

I've tested the second patch a bit, using my bpf scripts to measure the lock
contention. These scripts are still under development, so there may be some
rough edges, and of course they make things slower, but so far the
event-by-event tracing correlates quite well with perf script output. For the
highly contended case (I simulated it using random_zipfian) I've even got some
visible improvement in the time distribution, but in an interesting way: there
is almost no difference in the distribution of time spent waiting on
exclusive/shared locks, but the corresponding distribution for holding shared
locks somehow has a bigger portion of short time frames:

# without the patch

Shared lock holding time

     hold time (us)      : count     distribution
         0 -> 1          : 17897059 |**************************              |
         2 -> 3          : 27306589 |****************************************|
         4 -> 7          : 6386375  |*********                               |
         8 -> 15         : 5103653  |*******                                 |
        16 -> 31         : 3846960  |*****                                   |
        32 -> 63         : 118039   |                                        |
        64 -> 127        : 15588    |                                        |
       128 -> 255        : 2791     |                                        |
       256 -> 511        : 1037     |                                        |
       512 -> 1023       : 137      |                                        |
      1024 -> 2047       : 3        |                                        |

# with the patch

Shared lock holding time

     hold time (us)      : count     distribution
         0 -> 1          : 20909871 |********************************        |
         2 -> 3          : 25453610 |****************************************|
         4 -> 7          : 6012183  |*********                               |
         8 -> 15         : 5364837  |********                                |
        16 -> 31         : 3606992  |*****                                   |
        32 -> 63         : 112562   |                                        |
        64 -> 127        : 13483    |                                        |
       128 -> 255        : 2593     |                                        |
       256 -> 511        : 1029     |                                        |
       512 -> 1023       : 138      |                                        |
      1024 -> 2047       : 7        |                                        |
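
To be explicit about what these tables show: every acquire/release pair gives
one hold time in microseconds, and the buckets are log2 bins of that value
(0-1, 2-3, 4-7, ...). A simplified sketch of the binning (not the actual bpf
code, which is messier):

#include <stdint.h>

/*
 * Fold one hold time (in microseconds) into its log2 bucket:
 * 0-1 us -> 0, 2-3 us -> 1, 4-7 us -> 2, and so on.
 */
static inline int
log2_bucket(uint64_t hold_time_us)
{
    int bucket = 0;

    while (hold_time_us > 1)
    {
        hold_time_us >>= 1;
        bucket++;
    }
    return bucket;
}

The hold time itself is simply the release timestamp minus the timestamp at
which the corresponding acquire returned.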

So it looks like shared locks, when queued as implemented in this patch, are
released faster than without the queue (probably it reduces contention in a
less expected way). I've also tested it on c5d.18xlarge, although with
slightly different options (a larger pgbench scale factor and shared_buffers,
with the number of clients fixed at 72), and I'll try to do a few more rounds
with different options.

For the case of a uniform distribution (just a normal read-write workload) in
the same environment I don't yet see any significant difference in the time
distribution between the patched version and master, which is a bit
surprising to me. Can you point me to some analysis of why this kind of
"fairness" introduces a significant performance regression?
