Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers
Date
Msg-id qu3uuja2u3uerwmbhvz6pscdhlhdm3xa3d4nc7lr7vvpowarzl@yn45ne6635ml
Whole thread Raw
In response to Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers
Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers
List pgsql-hackers
Hi,

On 2025-12-15 15:38:49 -0600, Nathan Bossart wrote:
> +++ b/meson.build
> @@ -2523,7 +2523,8 @@ int main(void)
>  }
>  '''
>  
> -  if cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd without -march=armv8-a+crc',
> +  if (host_cpu == 'aarch64' and cc.get_id() == 'msvc') or \
> +        cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd without -march=armv8-a+crc',
>        args: test_c_args)
>      # Use ARM CRC Extension unconditionally
>      cdata.set('USE_ARMV8_CRC32C', 1)

I still think this should have a comment explaining that we can
unconditionally rely on crc32 support due to window's baseline requirements.


> diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
> index 7f8f566bd40..e62141abf0a 100644
> --- a/src/include/storage/s_lock.h
> +++ b/src/include/storage/s_lock.h
> @@ -602,13 +602,21 @@ typedef LONG slock_t;
>  
>  #define SPIN_DELAY() spin_delay()
>  
> -/* If using Visual C++ on Win64, inline assembly is unavailable.
> - * Use a _mm_pause intrinsic instead of rep nop.
> - */
> -#if defined(_WIN64)
> +#ifdef _M_ARM64
> +static __forceinline void
> +spin_delay(void)
> +{
> +    /* Research indicates ISB is better than __yield() on AArch64. */
> +    __isb(_ARM64_BARRIER_SY);

It'd be good to link the research in some form or another. Otherwise it's
harder to evolve the code in the future, because we don't know if the research
was "I liked the color better" or "one is catastrophically slower than the
other".

> +}
> +#elif defined(_WIN64)
>  static __forceinline void
>  spin_delay(void)
>  {
> +    /*
> +     * If using Visual C++ on Win64, inline assembly is unavailable.
> +     * Use a _mm_pause intrinsic instead of rep nop.
> +     */
>      _mm_pause();
>  }
>  #else
> @@ -621,12 +629,19 @@ spin_delay(void)
>  #endif
>  
>  #include <intrin.h>
> +#ifdef _M_ARM64
> +#pragma intrinsic(_InterlockedExchange)
> +
> +/* _ReadWriteBarrier() is insufficient on non-TSO architectures. */
> +#define S_UNLOCK(lock) _InterlockedExchange(lock, 0)
> +#else
>  #pragma intrinsic(_ReadWriteBarrier)
>  
>  #define S_UNLOCK(lock)    \
>      do { _ReadWriteBarrier(); (*(lock)) = 0; } while (0)
>  
>  #endif
> +#endif

The newline placement looks odd here. I'd add newlines around the #else.



Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Changing the state of data checksums in a running cluster
Next
From: Nathan Bossart
Date:
Subject: Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers