Let's think about this:
What is the advantage of spinning?
On a uniprocessor box, there is "never" any advantage because you must release
the CPU in order to allow the process which owns the lock to run long enough to
release it.
On an SMP box, this is a bit more complicated. If you have two CPUs, then
maybe one process can spin, but more than one spinner is just wasting CPU:
one of the spinners must release its time slice in order for the process
holding the lock to run and release the resource.
Is there a global area where a single count of all the processes spinning can
be kept? That way, when a process fails to acquire a lock, and there are
already (num_cpus - 1) processes spinning, it can call select() right away
instead of spinning.
Does this sound like an interesting approach?
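
To make the idea concrete, here is a rough sketch in C11 of the counting
scheme described above. It is an illustration only, not existing code: every
name in it (lock_word, spinner_count, begin_spin, SPIN_TRIES, the 10 ms sleep)
is made up for the example, and it assumes both variables live in memory that
all contending processes can see (plain globals only work for threads; real
processes would need a shared memory segment). It also assumes sysconf() with
_SC_NPROCESSORS_ONLN is available, which is common but not guaranteed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

#define SPIN_TRIES 1000             /* hypothetical spin budget per attempt */

/* Hypothetical shared state: 0 = free, 1 = held. */
static atomic_int lock_word = 0;
/* Hypothetical global count of processes currently spinning. */
static atomic_int spinner_count = 0;

/* Number of online CPUs; falls back to 1 if it cannot be determined. */
static int num_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? (int) n : 1;
}

/* Give up the CPU for about 10 ms using select() as a portable sleep. */
static void sleep_briefly(void)
{
    struct timeval tv = { 0, 10000 };
    select(0, NULL, NULL, NULL, &tv);
}

/* One attempt at the lock; true if we now hold it. */
static bool try_lock(void)
{
    return atomic_exchange(&lock_word, 1) == 0;
}

static void lock_release(void)
{
    atomic_store(&lock_word, 0);
}

/* Register as a spinner only if fewer than (ncpus - 1) are already spinning.
 * With ncpus == 1 this always refuses, so a uniprocessor never spins. */
static bool begin_spin(int ncpus)
{
    int cur = atomic_load(&spinner_count);
    while (cur < ncpus - 1) {
        /* On failure, cur is reloaded with the current count and we retry. */
        if (atomic_compare_exchange_weak(&spinner_count, &cur, cur + 1))
            return true;
    }
    return false;
}

static void end_spin(void)
{
    atomic_fetch_sub(&spinner_count, 1);
}

/* Acquire the lock: spin only when allowed, otherwise sleep right away. */
static void lock_acquire(int ncpus)
{
    while (!try_lock()) {
        if (begin_spin(ncpus)) {
            bool got = false;
            for (int i = 0; i < SPIN_TRIES && !got; i++)
                got = try_lock();
            end_spin();
            if (got)
                return;
        }
        /* Too many spinners already, or the spin budget ran out. */
        sleep_briefly();
    }
}
```

A caller would do lock_acquire(num_cpus()), touch the protected resource, then
lock_release(). The open question is whether maintaining the shared counter
costs more than the spinning it saves, since every failed acquire now touches
a second contended cache line.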