Re: spinlocks on powerpc - Mailing list pgsql-hackers

From Robert Haas
Subject Re: spinlocks on powerpc
Date
Msg-id CA+TgmoYJhfyRR28ASt-FgB7ubDrdWmZ+m722XcDYdv4mEwUEBQ@mail.gmail.com
In response to Re: spinlocks on powerpc  (Jeremy Harris <jgh@wizmail.org>)
Responses Re: spinlocks on powerpc  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Jan 3, 2012 at 3:05 PM, Jeremy Harris <jgh@wizmail.org> wrote:
> On 2012-01-03 04:44, Robert Haas wrote:
>> On read-only workloads, you get spinlock contention, because everyone
>> who wants a snapshot has to take the LWLock mutex to increment the
>> shared lock count and again (just a moment later) to decrement it.
>
> Does the LWLock protect anything but the shared lock count?  If not,
> then the quickest manipulation is usually along these lines:
>
> loop: lwarx  r5,0,r3   #load and reserve
>       add    r0,r4,r5  #increment word
>       stwcx. r0,0,r3   #store new value if still reserved
>       bne-   loop      #loop if lost reservation
>
> (per IBM's software ref manual,
>  https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF778525699600719DF2
> )
>
> The same sort of thing generally holds for other instruction sets as well.
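
For reference, the lwarx/stwcx. pair above is how PowerPC spells an
atomic fetch-and-add.  A minimal C sketch of the same operation, written
against GCC's __sync builtins purely for illustration -- an assumed
toolchain, not the primitives PostgreSQL actually uses (those live in
s_lock.h):

    #include <stdio.h>

    /* Shared lock count; in PostgreSQL this would live in shared memory. */
    static volatile int shared_count = 0;

    static void
    shared_count_increment(void)
    {
        /* On PowerPC, GCC compiles this to a lwarx/stwcx. retry loop
         * much like the one quoted above. */
        __sync_fetch_and_add(&shared_count, 1);
    }

    static void
    shared_count_decrement(void)
    {
        __sync_fetch_and_sub(&shared_count, 1);
    }

    int
    main(void)
    {
        shared_count_increment();
        shared_count_decrement();
        printf("count = %d\n", shared_count);
        return 0;
    }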

Sure, but the actual critical section is not that simple.  You might
look at the code for LWLockAcquire() if you're curious.
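
To give a feel for why it isn't that simple, here is a greatly
simplified sketch of the shape of an LWLock and its fast path in this
era.  The field names are approximate, the stand-in types are
hypothetical, and the wait-queue handling is stubbed out, so treat it as
illustration rather than the real source:

    #include <pthread.h>
    #include <stdbool.h>

    /* Stand-ins for PostgreSQL's slock_t and PGPROC, for illustration. */
    typedef pthread_spinlock_t slock_t;
    typedef struct PGPROC PGPROC;

    typedef struct LWLockSketch
    {
        slock_t     mutex;      /* spinlock guarding everything below */
        char        exclusive;  /* # of exclusive holders (0 or 1) */
        int         shared;     /* # of shared holders */
        PGPROC     *head;       /* head of the process wait queue */
        PGPROC     *tail;       /* tail of the process wait queue */
    } LWLockSketch;

    /* Fast path only: take the lock if nothing blocks us, otherwise
     * report that we would have to enqueue ourselves and sleep --
     * the part the real LWLockAcquire() spends most of its code on. */
    static bool
    lwlock_try_acquire(LWLockSketch *lock, bool exclusive_mode)
    {
        bool        mustwait;

        pthread_spin_lock(&lock->mutex);
        if (exclusive_mode)
            mustwait = (lock->exclusive != 0 || lock->shared != 0);
        else
            mustwait = (lock->exclusive != 0);
        if (!mustwait)
        {
            if (exclusive_mode)
                lock->exclusive++;
            else
                lock->shared++;
        }
        pthread_spin_unlock(&lock->mutex);
        return !mustwait;
    }

Note that even when nobody holds the lock, a read-only transaction pays
for two of these spinlocked sections per snapshot -- one to bump the
shared count and one to drop it -- which is exactly the contention
described at the top of the thread.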

> Also, heavy-contention locks should be placed in cache lines away from other
> data (to avoid thrashing the data cache lines when processors are fighting
> over the lock cache lines).

Yep.  That's possibly a problem, and it has been discussed before, but
I don't think we have any firm evidence that it's actually hurting us,
or of how much padding would help.  The heavily contended LWLocks are
mostly non-consecutive in memory, except perhaps for the buffer mapping
locks.
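
If padding did turn out to matter, the standard remedy is to round each
lock up to its own cache line so that two hot locks can never share one.
A minimal sketch, assuming a 64-byte line and a GCC-style alignment
attribute (the real line size is platform-specific; PostgreSQL's
lwlock.c wraps LWLock in a similar union, LWLockPadded, though with a
smaller pad):

    #include <pthread.h>

    #define CACHE_LINE_SIZE 64      /* assumed; varies by platform */

    /* Pad each lock out to a full cache line so that two heavily
     * contended locks never ping-pong the same line between CPUs. */
    typedef union PaddedLock
    {
        pthread_spinlock_t lock;
        char               pad[CACHE_LINE_SIZE];
    } PaddedLock;

    /* Provided the array itself is line-aligned, each element now
     * occupies its own cache line. */
    static PaddedLock locks[16] __attribute__((aligned(CACHE_LINE_SIZE)));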

It's been suggested to me that we should replace our existing LWLock
implementation with a CAS-based implementation that crams all the
relevant details into a single 8-byte word.  The pointer to the head
of the wait queue, for example, could be stored as an offset into the
allProcs array rather than a pointer value, which would allow it to be
stored in 24 bits rather than 8 bytes.  But there isn't quite enough
bit space to make it work without compromises -- most likely,
reducing the theoretical upper limit on MaxBackends from 2^24 to, say,
2^22.  Even if we were willing to do that, the performance benefits of
using atomics here are so far unproven... which doesn't mean they
don't exist, but someone's going to have to do some work to show that
they do.
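
To make the packing concrete, here is one way such an 8-byte lock word
might be laid out, with field widths invented for illustration rather
than taken from any actual patch:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical layout of a single-word LWLock:
     *   bit  63     : exclusive holder flag
     *   bits 39..24 : shared holder count
     *   bits 23..0  : wait-queue head, as an index into allProcs
     * Stealing two of the low 24 bits for flags is what would push
     * the MaxBackends ceiling down from 2^24 toward 2^22. */
    #define EXCL_BIT    ((uint64_t) 1 << 63)
    #define SHARED_LSB  ((uint64_t) 1 << 24)
    #define QUEUE_MASK  ((uint64_t) 0xFFFFFF)

    /* Shared-mode fast path: bump the shared count with a single
     * compare-and-swap, or fail if an exclusive holder is present. */
    static bool
    lwlock_cas_acquire_shared(volatile uint64_t *lockword)
    {
        for (;;)
        {
            uint64_t    old = *lockword;

            if (old & EXCL_BIT)
                return false;   /* must queue; slow path not shown */
            if (__sync_bool_compare_and_swap(lockword, old,
                                             old + SHARED_LSB))
                return true;    /* shared count incremented atomically */
        }
    }

Release would presumably be another compare-and-swap loop, and the hard
part is what it has to do when it sees a nonzero queue field: wake a
waiter without holding any lock.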

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

