> Hm. Any idea why that is? I wonder if the compiler isn't using as many
> SVE registers as it could for this.
Not sure. We tried forcing loop unrolling with the following line in the
Makefile, but the results are the same.
pg_popcount_sve.o: CFLAGS += ${CFLAGS_UNROLL_LOOPS} -march=native
> I've also noticed that the latest patch doesn't compile on my M3 macOS
> machine. After a quick glance, I think the problem is that the
> TRY_POPCNT_FAST macro is set, so it's trying to compile the assembly
> versions.
Fixed. The previous patch tried to reuse the existing "choose" logic, which is
guarded by TRY_POPCNT_FAST. The latest patch bypasses TRY_POPCNT_FAST and uses
separate choose logic for aarch64.
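
For illustration only, here is a rough sketch of what a separate aarch64
choose path could look like. This is not the code from the patch: all
function names are hypothetical, the SVE routine is a stand-in that just
calls the portable loop, and the runtime check assumes Linux, where
getauxval(AT_HWCAP) reports HWCAP_SVE. On other platforms (macOS included)
the sketch falls through to the portable version.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>
#ifndef HWCAP_SVE
#define HWCAP_SVE (1 << 22)		/* value from the Linux arm64 hwcap ABI */
#endif
#endif

/* Portable fallback: clear the lowest set bit until the byte is zero. */
static uint64_t
popcount_portable(const char *buf, size_t bytes)
{
	uint64_t	count = 0;

	while (bytes--)
	{
		unsigned char b = (unsigned char) *buf++;

		while (b)
		{
			b &= b - 1;
			count++;
		}
	}
	return count;
}

#if defined(__aarch64__) && defined(__linux__)
/* Hypothetical stand-in: a real version would use SVE intrinsics. */
static uint64_t
popcount_sve_stub(const char *buf, size_t bytes)
{
	return popcount_portable(buf, bytes);
}
#endif

static uint64_t popcount_choose(const char *buf, size_t bytes);

/* Starts at the chooser; the first call replaces it with an implementation. */
static uint64_t (*popcount_impl) (const char *, size_t) = popcount_choose;

static uint64_t
popcount_choose(const char *buf, size_t bytes)
{
#if defined(__aarch64__) && defined(__linux__)
	if (getauxval(AT_HWCAP) & HWCAP_SVE)
		popcount_impl = popcount_sve_stub;
	else
#endif
		popcount_impl = popcount_portable;

	return popcount_impl(buf, bytes);
}

/* Hypothetical public entry point. */
uint64_t
popcount_dispatch(const char *buf, size_t bytes)
{
	return popcount_impl(buf, bytes);
}
```

Because the selection happens inside the aarch64/Linux guard, nothing here
depends on TRY_POPCNT_FAST being defined or not.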
-Chiranmoy