Re: Trying out - Mailing list pgsql-hackers

From Greg Burd
Subject Re: Trying out
Date
Msg-id CC76554F-41AC-45AA-AF10-370FEC416498@greg.burd.me
Whole thread Raw
In response to Re: Trying out  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Nov 23 2025, at 4:08 pm, Thomas Munro <thomas.munro@gmail.com> wrote:

> On Mon, Nov 24, 2025 at 4:23 AM Greg Burd <greg@burd.me> wrote:
>> As mentioned on a separate thread about fixing ARM64 support when
>> building with MSVC on Win11 [1] I tried out this patch.  The reply on
>> that thread had an issue with _mm_pause() in spin_delay(), it turns
>> out we need to use __yield() [2].  I went ahead and fixed that, so
>> ignore that patch on the other thread [1].  The new patch attached
>> that layers on top of your work and supports that platform, there was
>> one minor change that was required:
>>
>>
>> #ifdef _MSC_VER
>>
>>         /*
>>          * If using Visual C++ on Win64, inline assembly is
>> unavailable.  Use a
>>          * _mm_pause intrinsic instead of rep nop. For ARM64, use the __yield()
>>          * intrinsic which emits the YIELD instruction as a hint to
>> the processor.
>>          */
>> #if defined(_M_ARM64)
>>         __yield();
>> #elif defined(_WIN64)
>>         _mm_pause();
>> #else
>>         /* See comment for gcc code. Same code, MASM syntax */
>>         __asm           rep nop;
>> #endif
>> #endif                                                  /* _MSC_VER */
>
> That makes more intuitive sense... but I didn't know that people *do*
> sometimes prefer instruction synchronisation barriers for spinlock
> delays:
>
> https://stackoverflow.com/questions/70810121/why-does-hintspin-loop-use-isb-on-aarch64
>
> When reading your patch I was pretty confused by that, because it said
> it was fixing a barrier problem and apparently doing so in an
> unprincipled place.  I guess we really need to research the best delay
> mechanism for our needs on this architecture, independently of the
> compiler being used, and then write matching GCC and Visual Studio
> versions of that?  I think there were some threads about spinlock
> performance on Linux + Graviton with graphs and theories...

Interesting, I think I was rushing to get past that compile issue rather
than optimizing.  This sounds like yet another place where we should
choose based on arch and it seems hint::spin_loop() does.

>> tests are passing, best.
>
> Great news!  Thanks.  It sounds like if I could supply the missing
> credible evidence of codegen quality on... all the computers, then I
> think we'd be down to just: when can we pull the trigger and require
> Visual Studio 2022 and do we trust /experimental:c11atomics?

I'm in favor of the stdatomic approach.  I can't speak to codegen
quality on *all the platforms* or how *experimental* c11 atomics are
when using MSVC.

> FTR I had earlier shared some version of this patch with Dave when he
> was trying to get his Windows/ARM system going, but I think my earlier
> version was probably too broken.  Sorry Dave.  At that stage I was
> also trying to do it as an option but keeping the existing stuff
> around.  Since then we adopted C11, so this is the all-in version.  I
> also hadn't understood a key part of the C11 memory model that your
> RISC-V animal taught me and that c5d34f4a fixed, and you can see in
> this patch set too, and I'm not sure if Visual Studio is like GCC or
> Clang in that respect.

Thanks for that work on RISC-V, I appreciate that!  Much more digging to
be done to answer those questions for sure.

> It crossed my mind that this might even be
> related to the problem you've noticed with barriers being missing, but
> I haven't looked into that.  BTW I believe we could actually change
> our code NOT to rely on that, ie to follow the C11 memory model better
> and declare eg PgAioHandle::status as atomic_uint8 or whatever (other
> non-atomic access would be considered dependent and do the right thing
> IIUC), but I'm not sure if it's necessary and that research project
> can wait.

Interesting.  Yeah, let's do one thing and then move to the next, but I
do like the idea.

best.

-greg



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Trying out
Next
From: David Rowley
Date:
Subject: Re: Adjust comments for `IndexOptInfo` to accurately reflect indexcollations's length