I wrote:
> Anyway, I don't have a big objection to applying this. My concern
> is more that we need to be taking a harder look at other parts of
> the atomics infrastructure, because tweaks there are likely to buy
> much more.
I went ahead and pushed these patches, adding __sync_fetch_and_sub
since gcc seems to provide that on the same footing as these other
functions.
Looking at these generic functions a bit closer, I notice that
most of them are coded like

    old = pg_atomic_read_u32_impl(ptr);
    while (!pg_atomic_compare_exchange_u32_impl(ptr, &old, old | or_))
        /* skip */;

but there's an exception: pg_atomic_exchange_u64_impl just does

    old = ptr->value;
    while (!pg_atomic_compare_exchange_u64_impl(ptr, &old, xchg_))
        /* skip */;

That's interesting. Why not pg_atomic_read_u64_impl there?
I think that the simple read is actually okay as it stands: if we
are unlucky enough to get a torn read, the first compare/exchange
will fail to compare equal, and it will replace "old" with a valid
atomically-read value, and then the next loop iteration has a chance
to succeed. Therefore there's no need to pay the extra cost of a
locked read instead of an unlocked one.
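
To make that argument concrete, here is a minimal standalone sketch. It uses C11 <stdatomic.h> rather than our pg_atomic layer, and exchange_from_guess is a made-up name for illustration, not anything in the tree. The "guess" parameter stands in for a possibly torn plain read: it can be any value at all, and the loop still converges because a failed compare/exchange overwrites "old" with the value it actually found.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a compare/exchange-based exchange.
 * "guess" models the result of an unlocked (possibly torn) read.
 * Returns the value that was actually replaced; *attempts counts
 * how many compare/exchange calls were needed.
 */
static uint64_t
exchange_from_guess(_Atomic uint64_t *ptr, uint64_t guess, uint64_t xchg,
                    int *attempts)
{
    uint64_t old = guess;

    *attempts = 1;
    /* On failure, the CAS stores the atomically read current value in "old". */
    while (!atomic_compare_exchange_strong(ptr, &old, xchg))
        (*attempts)++;
    return old;
}
```

With no concurrent writers, a wrong guess costs exactly one extra trip through the loop, and a correct guess costs none. This says nothing about how that compares with the cost of a locked read on any particular platform.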
However, if that's the reasoning, why don't we make all of these
use simple reads? It seems unlikely that a locked read is free.
If there's actually a reason for pg_atomic_exchange_u64_impl to be
different from the rest, it needs to have a comment explaining why.
regards, tom lane