Re: pg_atomic_compare_exchange_*() and memory barriers - Mailing list pgsql-hackers

From Pavel Borisov
Subject Re: pg_atomic_compare_exchange_*() and memory barriers
Date
Msg-id CALT9ZEHDC1jz3z54hJ2_PvL+6D+u4niXGVxcBvbhHaNeGOXOhw@mail.gmail.com
Whole thread Raw
In response to Re: pg_atomic_compare_exchange_*() and memory barriers  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: pg_atomic_compare_exchange_*() and memory barriers
List pgsql-hackers
Hi, Alexander and Andres!

On Fri, 7 Mar 2025 at 20:40, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> Hi, Andres.
>
> Thank you for reply.
>
> On Fri, Mar 7, 2025 at 6:02 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2025-03-07 17:47:08 +0200, Alexander Korotkov wrote:
> > > While investigating a bug in the patch to get rid of WALBufMappingLock, I
> > > found that the surrounding pg_atomic_compare_exchange_u64() fixes the
> > > problem for me.
> >
> > That sounds more likely to be due to slowing down things enough to make a race
> > less likely to be hit. Or a compiler bug.
> >
> >
> > > That doesn't feel right because, according to the
> > > comments, both pg_atomic_compare_exchange_u32() and
> > > pg_atomic_compare_exchange_u64() should provide full memory barrier
> > > semantics.  So, I decided to investigate this further.
> > >
> > > In my case, the implementation of pg_atomic_compare_exchange_u64() is based
> > > on __atomic_compare_exchange_n().
> > >
> > > static inline bool
> > > pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
> > >                                     uint64 *expected, uint64 newval)
> > > {
> > >     AssertPointerAlignment(expected, 8);
> > >     return __atomic_compare_exchange_n(&ptr->value, expected, newval, false,
> > >                                        __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
> > > }
> > >
> > > According to the docs __ATOMIC_SEQ_CST enforces total ordering with *all
> > > other __ATOMIC_SEQ_CST operations*.  It only says about other
> > > __ATOMIC_SEQ_CST operations; nothing is said about regular reads/writes.
> > > This sounds quite far from our comment, promising full barrier semantics,
> > > which, in my understanding, means the equivalent of
> > > both pg_read_barrier()/pg_write_barrier(), which should barrier *all other
> > > read/writes*.
> >
> > A memory barrier in one process/thread always needs be paired with a barrier
> > in another process/thread. It's not enough to have a memory barrier on one
> > side but not the other.
>
> Yes, there surely should be a memory barrier on another side.  But
> does __ATOMIC_SEQ_CST works as a barrier for the regular read/write
> operations on the same side?
According to a reference posted by Andres [1]:
"Within a thread of execution, accesses (reads and writes) through
volatile lvalues cannot be reordered past observable side-effects
(including other volatile accesses) that are separated by a sequence
point within the same thread, but this order is not guaranteed to be
observed by another thread, since volatile access does not establish
inter-thread synchronization."

Also: "as soon as atomic operations that are not tagged
memory_order_seq_cst enter the picture, the sequential consistency is
lost"

So I think Alexander is right in the original post. The order of
everything not marked as ATOMIC_SEQ_CST (atomic or non-atomic
operations) compared to atomic operations marked by  ATOMIC_SEQ_CST is
not warranted.

> > __ATOMIC_SEQ_CST is used to implement the C11/C++11 barriers, and for those
> > it's more clearly formulated that that include acquire/release semantics.  The
> > standard formulation is a bit more complicated IIRC, but here's the
> > cppreference.com simplification:
> >
> > https://en.cppreference.com/w/c/atomic/memory_order
> > >  A load operation with this memory order performs an acquire operation, a
> > >  store performs a release operation, and read-modify-write performs both an
> > >  acquire operation and a release operation, plus a single total order exists
> > >  in which all threads observe all modifications in the same order (see
> > >  Sequentially-consistent ordering below).
>
>
> Thank you.  This is not yet clear for me.  I probably need to meditate
> more on this :)

I think __ATOMIC_ACQ_REL is needed for success, and __ATOMIC_ACQUIRE
might be sufficient for failure as failure of
__atomic_compare_exchange_n contains no modification of atomic
variable, so we want to see all modifications from the concurrent
operations, but will not enforce them to see any "our" change. [2]
Maybe I'm too optimistic here and having a full memory barrier is
better.

Thank you very much for considering this issue! I think the right
operation of atomic primitives is both very very important and also
hard to check thoroughly. I think the test that could reproduce the
reordering of atomic and non-atomic operations could be very useful in
regression set.

[1] https://en.cppreference.com/w/c/atomic/memory_order#Sequentially-consistent_ordering
[2] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

Best regards,
Pavel Borisov
Supabase



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: pg_atomic_compare_exchange_*() and memory barriers
Next
From: Alexander Korotkov
Date:
Subject: Re: pg_atomic_compare_exchange_*() and memory barriers