Re: Wait free LW_SHARED acquisition - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Wait free LW_SHARED acquisition
Date
Msg-id 20130927213947.GC9819@awork2.anarazel.de
In response to Re: Wait free LW_SHARED acquisition  (Florian Pflug <fgp@phlo.org>)
List pgsql-hackers
On 2013-09-27 14:46:50 +0200, Florian Pflug wrote:
> On Sep27, 2013, at 00:55 , Andres Freund <andres@2ndquadrant.com> wrote:
> > So the goal is to have LWLockAcquire(LW_SHARED) never block unless
> > somebody else holds an exclusive lock. To produce enough appetite for
> > reading the rest of the long mail, here's some numbers on a
> > pgbench -j 90 -c 90 -T 60 -S (-i -s 10) on a 4xE5-4620
> >
> > master + padding: tps = 146904.451764
> > master + padding + lwlock: tps = 590445.927065
> >
> > That's roughly 400%.
>
> Interesting. I played with pretty much the same idea two years or so ago.
> At the time, I compared a few different LWLock implementations. Those
> were AFAIR
>
>   A) Vanilla LWLocks
>   B) A + an atomic-increment fast path, very similar to your proposal
>   C) B but with a partitioned atomic-increment counter to further
>      reduce cache-line contention
>   D) A with the spinlock-based queue replaced by a lockless queue
>
> At the time, the improvements seemed to be negligible - they looked great
> in synthetic benchmarks of just the locking code, but didn't translate
> to improved TPS numbers. Though I think the only version that ever got
> tested on more than a handful of cores was C…
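The atomic-increment fast path (variant B above, and roughly the proposal under discussion) can be sketched with C11 atomics. This is a minimal illustration, not PostgreSQL's actual LWLock code; the names, flag layout, and `lwlock_sketch` type are all hypothetical:

```c
/* Hypothetical sketch of an atomic-increment shared-lock fast path,
 * using C11 atomics rather than PostgreSQL's real s_lock/LWLock code.
 * All names are illustrative. */
#include <stdatomic.h>
#include <stdbool.h>

#define EXCLUSIVE_FLAG (1u << 31)   /* high bit: an exclusive holder is present */

typedef struct { _Atomic unsigned state; } lwlock_sketch;

/* Try to take the lock in shared mode without blocking. */
static bool shared_tryacquire(lwlock_sketch *lock)
{
    /* Optimistically add a shared reference in one atomic op. */
    unsigned old = atomic_fetch_add(&lock->state, 1);
    if (old & EXCLUSIVE_FLAG)
    {
        /* An exclusive holder is present: undo the increment and fail;
         * the caller would then fall back to the sleeping slow path. */
        atomic_fetch_sub(&lock->state, 1);
        return false;
    }
    return true;
}
```

The point of the single fetch-and-add is that uncontended shared acquisition never takes a spinlock and never loops, which is what makes it wait-free in the common case.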

I think you really need multi-socket systems to see the big benefits
from this. My laptop barely shows any improvements, while my older 2
socket workstation already shows some in workloads that have more
contention than pgbench -S.

From a quick look, at least one of the variants in there didn't have any
sleep-based queueing? In my tests, that was tremendously important for
improving scaling whenever there was contention. Which is not surprising
in the end, because otherwise you essentially have rw-spinlocks, which
really aren't suitable for many of the lwlocks we use.

Getting the queueing semantics, including releaseOK, right was what took
me a good amount of time; the atomic-ops part was pretty quick...

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


