Currently, the assembly for TAS() on x86 does a non-locking test before
using an atomic operation to attempt to acquire the spinlock:

    __asm__ __volatile__(
        "    cmpb    $0,%1    \n"
        "    jne     1f       \n"
        "    lock             \n"
        "    xchgb   %0,%1    \n"
        "1: \n"
    :   "+q"(_res), "+m"(*lock)
    :
    :   "memory", "cc");

The reason this is a good idea is that if we fail to immediately acquire
the spinlock, s_lock() will spin SPINS_PER_DELAY times in userspace
calling TAS() each time before going to sleep. If we do an atomic
operation for each spin, this generates a lot more bus traffic than is
necessary. Doing a non-locking test (followed by an atomic operation to
acquire the spinlock if appropriate) is therefore better on SMP systems.
Currently x86 is the only platform on which we do this -- ISTM that all
the other platforms that implement spinlocks via atomic operations could
benefit from this technique.
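
To illustrate what the technique looks like outside of x86 assembler, here
is a minimal sketch in portable C. It assumes a C11 compiler with
<stdatomic.h> rather than the s_lock.h macros, and the names ttas_lock_t,
ttas_try_acquire() and ttas_release() are made up for this example. It
follows the usual TAS() convention of returning nonzero when the lock is
already held.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock type for illustration; not PostgreSQL's slock_t. */
typedef atomic_uchar ttas_lock_t;

/*
 * Returns 0 if the lock was acquired, nonzero if it was already held.
 */
static inline int
ttas_try_acquire(ttas_lock_t *lock)
{
    /* Plain load first: no locked operation while the lock looks busy. */
    if (atomic_load_explicit(lock, memory_order_relaxed) != 0)
        return 1;

    /* The lock looked free, so now pay for the atomic exchange. */
    return atomic_exchange_explicit(lock, 1, memory_order_acquire);
}

static inline void
ttas_release(ttas_lock_t *lock)
{
    /* Releasing a lock that is not held would be a caller bug. */
    assert(atomic_load_explicit(lock, memory_order_relaxed) != 0);
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

While the lock is held by someone else, a spinning caller only performs
ordinary reads; the atomic exchange is attempted only once the lock has
been observed to be free.
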
We could fix this by tweaking each platform's assembler to add a
non-locking test, but there might be an easier way. Rather than
modifying platform-specific assembler, I believe this C sequence is
equivalent to the non-locking test:

    volatile slock_t *lock = ...;

    if (*lock == 0)
        TAS(lock);

Because the lock variable is volatile, the compiler should reload it
from memory for each loop iteration. (If this is actually not a
sufficient non-locking test, please let me know...)
We could add a new s_lock.h macro, TAS_INNER_LOOP(), whose default
implementation would be:

    #define TAS_INNER_LOOP(lock) \
        if ((*lock) == 0) \
            TAS(lock);

And then remove the x86-specific non-locking test from TAS.
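
To make the proposal a bit more concrete, here is a rough sketch of how a
spin loop might use such a macro. Several things in it are assumptions
rather than existing code: the macro is written as an expression, so the
loop can test its result, instead of the statement form above; TAS() is
stubbed out with GCC's __sync_lock_test_and_set builtin; slock_t is taken
to be unsigned char; the value 100 for SPINS_PER_DELAY is an arbitrary
placeholder; and spin_then_give_up() is a hypothetical helper that returns
instead of sleeping.

```c
#include <assert.h>

typedef unsigned char slock_t;      /* assumption: byte-sized lock */

#define SPINS_PER_DELAY 100         /* placeholder, not the real setting */

/* Stand-in for the platform TAS(): returns the previous lock value. */
#define TAS(lock)  __sync_lock_test_and_set((lock), 1)

/*
 * Expression form of the proposed macro: nonzero means "still busy".
 * The plain read short-circuits, so TAS() only runs when the lock
 * looks free.
 */
#define TAS_INNER_LOOP(lock)  (*(lock) != 0 || TAS(lock) != 0)

/* Hypothetical helper: spin count on success, -1 after giving up. */
static int
spin_then_give_up(volatile slock_t *lock)
{
    int spins;

    assert(lock != 0);              /* caller must pass a valid lock */

    for (spins = 0; spins < SPINS_PER_DELAY; spins++)
    {
        if (!TAS_INNER_LOOP(lock))
            return spins;           /* acquired the lock */
    }
    return -1;                      /* real code would sleep and retry */
}
```

Written this way, a platform could override TAS_INNER_LOOP() in s_lock.h
if the extra read turned out not to help there, and TAS() itself could
stay a bare atomic operation.
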
Comments?
-Neil