Re: Popcount optimization using AVX512 - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Popcount optimization using AVX512
Date
Msg-id 20240313163904.GA2560056@nathanxps13
Whole thread Raw
In response to Re: Popcount optimization using AVX512  (Nathan Bossart <nathandbossart@gmail.com>)
Responses RE: Popcount optimization using AVX512
List pgsql-hackers
A couple of thoughts on v7-0001:

+extern int  pg_popcount32_slow(uint32 word);
+extern int  pg_popcount64_slow(uint64 word);

+/* In pg_popcnt_*_accel source file. */
+extern int  pg_popcount32_fast(uint32 word);
+extern int  pg_popcount64_fast(uint64 word);

Can these prototypes be moved to a header file (maybe pg_bitutils.h)?  It
looks like these are defined twice in the patch, and while I'm not positive
that it's against project policy to declare extern function prototypes in
.c files, it appears to be pretty rare.

+  'pg_popcnt_choose.c',
+  'pg_popcnt_x86_64_accel.c',

I think we want these to be architecture-specific, i.e., only built for
x86_64 if the compiler knows how to use the relevant instructions.  There
is a good chance that we'll want to add similar support for other systems.
The CRC32C files are probably a good reference point for how to do this.

+#ifdef TRY_POPCNT_FAST

IIUC this macro can be set if either 1) the popcntq test in the
autoconf/meson scripts passes or 2) we're building with MSVC on x86_64.  I
wonder if it would be better to move the MSVC/x86_64 check to the
autoconf/meson scripts so that we could avoid surrounding large portions of
the popcount code with this macro.  This might even be a necessary step
towards building these files in an architecture-specific fashion.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Next
From: Bharath Rupireddy
Date:
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation