SpinLockAcquire and SpinLockRelease is broken on ARM/ARM64? (Re: sinvaladt.c: remove msgnumLock, use atomic operations on maxMsgNum) - Mailing list pgsql-hackers

From Yura Sokolov
Subject SpinLockAcquire and SpinLockRelease is broken on ARM/ARM64? (Re: sinvaladt.c: remove msgnumLock, use atomic operations on maxMsgNum)
Date
Msg-id 37dbaeaa-8b14-40dc-96e0-5eba595f0f0a@postgrespro.ru
Whole thread Raw
In response to Re: sinvaladt.c: remove msgnumLock, use atomic operations on maxMsgNum  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
21.03.2025 19:33, Andres Freund пишет:
> Hi,
> 
> On 2025-03-21 14:35:16 +0300, Yura Sokolov wrote:
>> From 080c9e0de5e6e10751347e1ff50b65df424744cb Mon Sep 17 00:00:00 2001
>> From: Yura Sokolov <y.sokolov@postgrespro.ru>
>> Date: Mon, 3 Feb 2025 11:58:33 +0300
>> Subject: [PATCH v2] sinvaladt.c: use atomic operations on maxMsgNum
>>
>> msgnumLock spinlock could be highly contended.
>> Comment states it was used as memory barrier.
>> Lets use atomic ops with memory barriers directly instead.
>>
>> Note: patch uses pg_read_barrier()/pg_write_barrier() instead of
>> pg_atomic_read_membarrier_u32()/pg_atomic_write_membarrier_u32() since
>> no full barrier semantic is required, and explicit read/write barriers
>> are cheaper at least on x86_64.
> 
> Is it actually true that full barriers aren't required?  I think we might
> actually rely on a stronger ordering.
> 
> 
>> @@ -506,10 +493,9 @@ SIGetDataEntries(SharedInvalidationMessage *data, int datasize)
>>       */
>>      stateP->hasMessages = false;
>>
>> -    /* Fetch current value of maxMsgNum using spinlock */
>> -    SpinLockAcquire(&segP->msgnumLock);
>> -    max = segP->maxMsgNum;
>> -    SpinLockRelease(&segP->msgnumLock);
>> +    /* Fetch current value of maxMsgNum using atomic with memory barrier */
>> +    max = pg_atomic_read_u32(&segP->maxMsgNum);
>> +    pg_read_barrier();
>>
>>      if (stateP->resetState)
>>      {
>>         /*
>>          * Force reset.  We can say we have dealt with any messages added
>>          * since the reset, as well; and that means we should clear the
>>          * signaled flag, too.
>>          */
>>         stateP->nextMsgNum = max;
>>         stateP->resetState = false;
>>         stateP->signaled = false;
>>         LWLockRelease(SInvalReadLock);
>>         return -1;
>>     }
> 
> vs
> 
>> @@ -410,17 +398,16 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n)
>>          /*
>>           * Insert new message(s) into proper slot of circular buffer
>>           */
>> -        max = segP->maxMsgNum;
>> +        max = pg_atomic_read_u32(&segP->maxMsgNum);
>>          while (nthistime-- > 0)
>>          {
>>              segP->buffer[max % MAXNUMMESSAGES] = *data++;
>>              max++;
>>          }
>>
>> -        /* Update current value of maxMsgNum using spinlock */
>> -        SpinLockAcquire(&segP->msgnumLock);
>> -        segP->maxMsgNum = max;
>> -        SpinLockRelease(&segP->msgnumLock);
>> +        /* Update current value of maxMsgNum using atomic with memory barrier */
>> +        pg_write_barrier();
>> +        pg_atomic_write_u32(&segP->maxMsgNum, max);
>>
>>          /*
>>           * Now that the maxMsgNum change is globally visible, we give everyone
>>          * a swift kick to make sure they read the newly added messages.
>>          * Releasing SInvalWriteLock will enforce a full memory barrier, so
>>          * these (unlocked) changes will be committed to memory before we exit
>>          * the function.
>>          */
>>         for (i = 0; i < segP->numProcs; i++)
>>         {
>>             ProcState  *stateP = &segP->procState[segP->pgprocnos[i]];
>>
>>             stateP->hasMessages = true;
>>         }
> 
> On a loosely ordered architecture, the hasMessage=false in SIGetDataEntries()
> could be reordered with the read of maxMsgNum. Which, afaict, would lead to
> missing messages. That's not prevented by the pg_write_barrier() in
> SIInsertDataEntries().  I think there may be other similar dangers.
> 
> This could be solved by adding full memory barriers in a few places.

It is quite interesting remark, because SpinLockAcquire may be implemented
as `__sync_lock_test_and_set(lock, 1)` [1] on ARM/ARM64, and
`__sync_lock_test_and_set` has only "acquire barrier" semantic as GCC's
documentation says [2] . More over, `__sync_lock_release` used in this case
also provides only "release barrier" semantic.

Therefore, concerning these facts, current code state also doesn't give
guarantee `stateP->hasMessage = false` will not be reordered with `max =
segP->maxMsgNum`. Nor following read of messages are guaranteed to happen
after `max = segP->maxMsgNum`.

Given your expectations for SpinLockAcquire and SpinLockRelease to be full
memory barriers, and probably numerous usages of them as such, their
current implementation on ARM/ARM64 looks to be completely broken, imho.

[1]

https://github.com/postgres/postgres/blob/b3754dcc9ff7aba2268e98ecf4b5546353305cd2/src/include/storage/s_lock.h#L244-L264

[2]

https://gcc.gnu.org/onlinedocs/gcc-15.1.0/gcc/_005f_005fsync-Builtins.html#index-_005f_005fsync_005flock_005ftest_005fand_005fset

-- 
regards
Yura Sokolov aka funny-falcon



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Add an option to skip loading missing publication to avoid logical replication failure
Next
From: Laurenz Albe
Date:
Subject: Re: PATCH: pg_dump to support "on conflict do update"