Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-10-16 14:30:34 -0400, Robert Haas wrote:
>>> But _and, _or are really useful because they can be used to atomically
>>> clear and set flag bits.
>> Until we have some concrete application that requires this
>> functionality for adequate performance, I'd be inclined not to bother.
>> I think we have bigger fish to fry, and (to extend my nautical
>> metaphor) trying to get this working everywhere might amount to trying
>> to boil the ocean.
> Well, I actually have a very, very preliminary patch using them in the
> course of removing the spinlocks in PinBuffer() (via LockBufHdr()).
I think we need to be very, very wary of making our set of required
atomics any larger than it absolutely has to be. The more stuff we
require, the closer we're going to be to making PG a program that only
runs well on Intel. I am not comforted by the "gcc will provide good
implementations of the atomics everywhere" argument, because in fact
it won't. See for example the comment associated with our only existing
use of gcc atomics:
* On ARM, we use __sync_lock_test_and_set(int *, int) if available, and if* not fall back on the SWPB instruction.
SWPBdoes not work on ARMv6 or* later, so the compiler builtin is preferred if available. Note also that* the int-width
variantof the builtin works on more chips than other widths.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
That's not a track record that should inspire much faith that complete
sets of atomics will exist elsewhere. What's more, we don't just need
atomics that *work*, we need atomics that are *fast*, else the whole
exercise turns into pessimization not improvement. The more atomic ops
we depend on, the more likely it is that some of them will involve kernel
support on some chips, destroying whatever performance improvement we
hoped to get.
> ... The only thing I'd touch around platforms in that patch is
> adding a generic fallback pg_atomic_test_and_set() to s_lock.h and
> remove the special case usages of __sync_lock_test_and_set() (Arm64,
> some 32 bit arms).
Um, if we had a "generic" pg_atomic_test_and_set(), s_lock.h would be
about one line long. The above quote seems to me to be exactly the kind
of optimism that is not warranted in this area.
regards, tom lane