Re: Use POPCNT on MSVC - Mailing list pgsql-hackers

From John Naylor
Subject Re: Use POPCNT on MSVC
Date
Msg-id CAFBsxsHfGai31H2_ha2ZAMmGEZbkDgCop8OzdR05AX6CQkWvtQ@mail.gmail.com
Whole thread Raw
In response to Re: Use POPCNT on MSVC  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Use POPCNT on MSVC  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Tue, Aug 3, 2021 at 11:36 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Tue, 3 Aug 2021 at 22:43, John Naylor <john.naylor@enterprisedb.com> wrote:
> > 1. the __popcnt64() intrinsic is put inside pg_popcount64_asm(), which is a bit of a misnomer since it's not assembly. Renaming s/_asm/_fast/ would help it look better. But then looking around, other platforms have intrinsics coded, but for some reason they're put in pg_popcount64_slow(), where the compiler will emit instructions we could easily write ourselves in C (and without #ifdefs) since without the right CFLAGS these intrinsics won't emit a popcnt instruction. Is MSVC different in that regard? If so, it might be worth a comment.
>
> Yeah, the names no longer fit so well after stuffing the MSVC
> intrinsic into the asm function.  The reason I did it that way is down
> to what I read in the docs. Namely:
>
> "If you run code that uses these intrinsics on hardware that doesn't
> support the popcnt instruction, the results are unpredictable."
>
> So, it has to go somewhere where we're sure the CPU supports POPCNT
> and that seemed like the correct place.
>
> From what I understand of GCC and __builtin_popcountl(), the code
> it'll output will depend on the -mpopcnt flag.  So having
> __builtin_popcountl() does not mean the popcnt instruction will be
> used. The Microsoft documentation indicates that's not the case for
> __popcnt().

Okay, "unpredictable" sounds bad.

> > 2. (defined(_MSC_VER) && defined(_WIN64)  lead to a runtime check for the CPUID, which is fine, but now next to it HAVE_X86_64_POPCNTQ looks strange because the latter symbol comes from a specific configure test -- the two don't seem equivalent, but maybe they are because of what MSVC does? That would be nice to spell out here.
>
> The problem there is that we define HAVE_X86_64_POPCNTQ based on the
> outcome of configure so it does not get set for MSVC.  Maybe it's
> worth coming up with some other constant that can be defined or we
> could just do:
>
> #if defined(_MSC_VER) && defined(_WIN64)
> #define HAVE_X86_64_POPCNTQ
> #endif

That seems fine. I don't know PG can build with Arm on Windows, but for the cpuid to work, it seems safer to also check for __x86_64?

> I think the reason for the asm is that __builtin_popcountl does not
> mean popcnt will be used. Maybe we could have done something like
> compile pg_bitutils.c with -mpopcnt, and have kept the run-time check,
> but the problem there is that means that the compiler might end up
> using that instruction in some other function in that file that we
> don't want it to. It looks like my patch in [1] did pass the -mpopcnt
> flag which Tom fixed.

Ah, that makes sense. (If we someday offer a configure option for x86-64-v2, we can use intrinsics in the asm functions, and call them directly. Yet another different thread.)

--
John Naylor
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: A varint implementation for PG?
Next
From: Simon Riggs
Date:
Subject: Re: Lowering the ever-growing heap->pd_lower