Manfred Spraul <manfred@colorfullife.com> writes:
> Intel recommends to add a special pause instruction into spinlock busy
> loops. It's necessary for hyperthreading - without it, the cpu can't
> figure out that a logical thread does no useful work and incorrectly
> awards lots of execution resources to that thread. Additionally, it's
> supposed to reduce the time the cpu needs to recover from the
> (mispredicted) branch after the spinlock was obtained.
Don't you have to put it in a specific place in the loop to make that
work? If not, why not? I doubt that rep;nop is magic enough to
recognize the loop that will be generated from s_lock()'s code.
My guess is that it'd be more useful to insert the rep;nop into the
failure branch of the TAS macro and forget about the separate CPU_DELAY
construct. This would allow you to control where exactly rep;nop
appears relative to the xchgb.
> Additionally I've increased the number of loops from 100 to 1000
I think this change is almost certainly counterproductive; for any
platform other than the Xeon, remove "almost".
> + #ifndef HAS_CPU_DELAY
> + #define CPU_DELAY() cpu_delay()
> +
> + static __inline__ void
> + cpu_delay(void)
> + {
> + }
> + #endif
This breaks every non-gcc compiler in the world (or at least all those
that don't recognize __inline__). If you really want to keep CPU_DELAY,
consider
#ifndef CPU_DELAY
#define CPU_DELAY()
#endif
but as stated above, I'm dubious that the bottom of the s_lock loop
is the place to be adding anything anyway.
regards, tom lane