On Nov 20 2025, at 7:03 pm, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2025-11-20 15:45:22 -0500, Greg Burd wrote:
>> Dave and I have been working together to get ARM64 with MSVC functional.
>> The attached patches accomplish that. Dave is the author of the first
>> which addresses some build issues and fixes the spin_delay() semantics,
>> I did the second which fixes some atomics in this combination.
>
> Thanks for working on this!
You're welcome, thanks for reviewing it. :)
>>
>> MSVC's _InterlockedCompareExchange() intrinsic on ARM64 performs the
>> atomic operation but does NOT emit the necessary Data Memory Barrier
>> (DMB) instructions [4][5].
>
> I couldn't reproduce this result when playing around on godbolt. By specifying
> /arch:armv9.4 msvc can be convinced to emit the code for the intrinsics inline
> (at least for most of them). And that makes it visible that
> _InterlockedCompareExchange() results in a "casal" instruction. Looking that
> up shows:
>
> https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/CASA--CASAL--CAS--CASL--CASAL--CAS--CASL--A64-
> which includes these two statements:
> "CASA and CASAL load from memory with acquire semantics."
> "CASL and CASAL store to memory with release semantics."
I didn't even think to check for a compiler flag for the architecture,
nice call! If this emits the correct instructions it is a much better
approach. I'll give it a try, thanks for the nudge.
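
For reference, a minimal sketch of the kind of snippet that could be pasted
into godbolt to look at the generated code with and without the /arch flag.
The wrapper name cas_u32 is made up for illustration (it is not a PostgreSQL
function), and the non-MSVC branch is an assumption that exists only so the
file also compiles with GCC or Clang:

```c
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>
#endif

/*
 * Hypothetical wrapper, for inspection only: compare-and-swap a 32-bit
 * value and return whatever was stored at *ptr before the operation.
 */
uint32_t
cas_u32(volatile uint32_t *ptr, uint32_t expected, uint32_t newval)
{
#if defined(_MSC_VER)
	/* MSVC intrinsic: arguments are (destination, exchange, comparand). */
	return (uint32_t) _InterlockedCompareExchange((volatile long *) ptr,
												  (long) newval,
												  (long) expected);
#else
	/* GCC/Clang fallback (assumption): sequentially consistent CAS. */
	uint32_t	old = expected;

	__atomic_compare_exchange_n(ptr, &old, newval, 0,
								__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return old;
#endif
}
```

If the flag works as described above, the MSVC ARM64 output for this function
should contain a "casal" rather than a call into a helper routine.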
>> Issue 2: S_UNLOCK() uses only a compiler barrier
>>
>> _ReadWriteBarrier() is a compiler barrier, NOT a hardware memory
>> barrier [6]. It prevents the compiler from reordering operations, but
>> the CPU can still reorder memory operations. This is fundamentally
>> insufficient for ARM64's weaker memory model.
>
> Yea, that seems broken on a non-TSO architecture. Is the problem
> fixed if you change just this to include a proper barrier?
With the flag from above, _ReadWriteBarrier() does (in godbolt) turn
into a casal, which (AFAIK) is going to do the trick. I'll see if I can
update meson.build and get this to work as intended.
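
If the flag alone turns out not to cover S_UNLOCK(), a rough sketch of the
"proper barrier" alternative could look like the following. The names
release_barrier, slock_sketch_t and unlock_sketch are hypothetical and not
taken from s_lock.h, and the non-MSVC branch is an assumption added only so
the sketch builds with GCC or Clang:

```c
#if defined(_MSC_VER)
#include <intrin.h>
#endif

#if defined(_MSC_VER) && defined(_M_ARM64)
/* Hardware barrier (DMB ISH) on ARM64; a full barrier, so stronger than a
 * pure release but sufficient for unlocking. */
#define release_barrier() __dmb(_ARM64_BARRIER_ISH)
#elif defined(_MSC_VER)
/* x86/x64 are TSO, so a compiler barrier is enough there. */
#define release_barrier() _ReadWriteBarrier()
#else
/* GCC/Clang fallback (assumption): release fence. */
#define release_barrier() __atomic_thread_fence(__ATOMIC_RELEASE)
#endif

/* Hypothetical lock type for this sketch only. */
typedef volatile int slock_sketch_t;

/* Order all earlier loads and stores before the store that frees the lock. */
static inline void
unlock_sketch(slock_sketch_t *lock)
{
	release_barrier();
	*lock = 0;
}
```

The barrier has to come before the store that clears the lock, so that
writes made inside the critical section are visible to the next holder.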
> Greetings,
>
> Andres Freund
best.
-greg