Re: spinlock->pthread_mutex : first results with Jeff's pgbench+plsql - Mailing list pgsql-hackers

From Tom Lane
Subject Re: spinlock->pthread_mutex : first results with Jeff's pgbench+plsql
Date
Msg-id 18400.1341246002@sss.pgh.pa.us
In response to spinlock->pthread_mutex : first results with Jeff's pgbench+plsql  (Nils Goroll <slink@schokola.de>)
Responses Re: spinlock->pthread_mutex : first results with Jeff's pgbench+plsql
List pgsql-hackers
Nils Goroll <slink@schokola.de> writes:
> How I read this under the assumption that the test was correct and valid _and_
> can be reproduced independently:

> * for very low concurrency, the existing spinlock implementation is ideal -
>   we can't do any better both in terms of resulting sps and resource
>   consumption.

>   One path to explore here would be PTHREAD_MUTEX_ADAPTIVE_NP, which essentially
>   is the same as a spinlock for the contended case with very low lock acquisition
>   time. The code which I have tested uses PTHREAD_MUTEX_NORMAL, which, on Linux,
>   will always syscall for the contended case.

>   Quite clearly the overhead is with futexes syscalling, because kernel
>   resource consumption is 3x higher with the patch than without.

> * With this benchmark, for "half" concurrency in the order of 0.5 x #cores,
>   spinlocks still yield better tps, but resource overhead for spinlocks starts
>   to take off and futexes are already 40% more efficient, despite the fact that
>   spinlocks still have a 25% advantage in terms of sps.

> * At "full" concurrency (64 threads on 64 cores), resource consumption of
>   the spinlocks leads to almost doubled overall resource consumption and
>   the increased efficiency starts to pay off in terms of sps

> * and for the "quadruple overloaded" case (2x128 threads on 64 cores), spinlock
>   contention really brings the system down and sps drops to half.
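
For readers unfamiliar with the mutex types mentioned in the quoted text, here is a minimal sketch of how a mutex of type PTHREAD_MUTEX_ADAPTIVE_NP can be requested. It assumes glibc on Linux (the _NP suffix marks the type as non-portable); the helper name init_adaptive_mutex is hypothetical and is not taken from the patch under discussion.

```c
#define _GNU_SOURCE             /* assumption: glibc, which provides the _NP type */
#include <assert.h>
#include <pthread.h>

/*
 * Initialize *m as an adaptive mutex.  Returns 0 on success or a
 * pthread error number on failure.  Hypothetical helper, for
 * illustration only.
 */
static int
init_adaptive_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int         rc;

    assert(m != NULL);

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* Ask for a mutex that spins briefly before blocking in the kernel. */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

Swapping PTHREAD_MUTEX_ADAPTIVE_NP for PTHREAD_MUTEX_NORMAL in the settype call would give the variant the quoted test used. Whether the adaptive type changes the numbers above is not something this sketch can tell us.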

These conclusions seem plausible, though I agree we'd want to reproduce
similar behavior elsewhere before acting on the results.

What this seems to me to show, though, is that pthread mutexes are not
fundamentally a better technology than what we have now in spinlocks.
The problem is that the spinlock code is not adapting well to very high
levels of contention.  I wonder whether a better and less invasive fix
could be had by playing with the rules for adjustment of
spins_per_delay.  Right now, those are coded without any thought about
high-contention cases.  In particular I wonder whether we ought to
try to determine which individual locks are high-contention, and behave
differently when trying to acquire those.
        regards, tom lane
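
To make the idea of per-lock adaptation concrete, here is a rough sketch of a spin-then-sleep lock in which each lock carries its own spin budget. This is not the PostgreSQL spinlock code. The type demo_lock, the functions, the constants, and the adjustment rule (halve the budget after a sleep, add 100 after an uncontended acquire) are all assumptions chosen for illustration, and nothing here has been benchmarked.

```c
#define _POSIX_C_SOURCE 200809L /* for nanosleep() */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

#define MIN_SPINS 10            /* illustrative bounds, not tuned values */
#define MAX_SPINS 1000

/* Hypothetical lock type: the spin budget lives in the lock itself. */
typedef struct
{
    atomic_bool held;
    atomic_int  spins_per_delay;
} demo_lock;

static void
demo_lock_init(demo_lock *l)
{
    assert(l != NULL);
    atomic_init(&l->held, false);
    atomic_init(&l->spins_per_delay, 100);
}

/* Assumed rule: shrink fast if we had to sleep, grow if we did not. */
static int
adjust_spins(int spins, bool had_to_sleep)
{
    spins = had_to_sleep ? spins / 2 : spins + 100;
    if (spins < MIN_SPINS)
        spins = MIN_SPINS;
    if (spins > MAX_SPINS)
        spins = MAX_SPINS;
    return spins;
}

static void
demo_lock_acquire(demo_lock *l)
{
    int         budget = atomic_load_explicit(&l->spins_per_delay,
                                              memory_order_relaxed);
    int         spins = 0;
    bool        slept = false;
    struct timespec nap = {0, 1000000};     /* 1 ms */

    /* atomic_exchange returns the old value; true means someone holds it. */
    while (atomic_exchange_explicit(&l->held, true, memory_order_acquire))
    {
        if (++spins >= budget)
        {
            nanosleep(&nap, NULL);  /* budget used up: yield the CPU */
            slept = true;
            spins = 0;
        }
    }

    /* We hold the lock here, so budget updates are serialized. */
    budget = atomic_load_explicit(&l->spins_per_delay, memory_order_relaxed);
    atomic_store_explicit(&l->spins_per_delay,
                          adjust_spins(budget, slept),
                          memory_order_relaxed);
}

static void
demo_lock_release(demo_lock *l)
{
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

With this arrangement, a lock whose waiters keep having to sleep ends up with a small budget, so later waiters on that particular lock stop burning CPU sooner, while rarely contended locks keep a large budget. Whether that is the right behavior for the high-contention cases measured above would have to be shown by testing.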

