Re: [PATCH] SVE popcount support - Mailing list pgsql-hackers
From | Nathan Bossart |
---|---|
Subject | Re: [PATCH] SVE popcount support |
Date | |
Msg-id | Z1B-JzGuEuov8yj4@nathan Whole thread Raw |
In response to | [PATCH] SVE popcount support ("Malladi, Rama" <rvmallad@amazon.com>) |
List | pgsql-hackers |
On Wed, Dec 04, 2024 at 08:51:39AM -0600, Malladi, Rama wrote: > Thank you, Kirill, for the review and the feedback. Please find inline my > reply and an updated patch. Thanks for the updated patch. I have a couple of high-level comments. Would you mind adding this to the commitfest system (https://commitfest.postgresql.org/) so that it is picked up by our automated patch testing tools? > +# Check for ARMv8 SVE popcount intrinsics > +# > +CFLAGS_POPCNT="" > +PG_POPCNT_OBJS="" > +PGAC_SVE_POPCNT_INTRINSICS([]) > +if test x"$pgac_sve_popcnt_intrinsics" != x"yes"; then > + PGAC_SVE_POPCNT_INTRINSICS([-march=armv8-a+sve]) > +fi > +if test x"$pgac_sve_popcnt_intrinsics" = x"yes"; then > + PG_POPCNT_OBJS="pg_popcount_sve.o" > + AC_DEFINE(USE_SVE_POPCNT_WITH_RUNTIME_CHECK, 1, [Define to 1 to use SVE popcount instructions with a runtime check.]) > +fi > +AC_SUBST(CFLAGS_POPCNT) > +AC_SUBST(PG_POPCNT_OBJS) We recently switched some intrinsics support in PostgreSQL to use __attribute__((target("..."))) instead of applying special compiler flags to specific files (e.g., commits f78667b and 4b03a27). The hope is that this approach will be a little more sustainable as we add more architecture-specific code. IMHO we should do something similar here. While this means that older versions of clang might not pick up this optimization (see the commit message for 4b03a27 for details), I think that's okay because 1) this patch is intended for the next major version of Postgres, which will take some time for significant adoption, and 2) this is brand new code, so we aren't introducing any regressions for current users. > +#ifdef USE_SVE_POPCNT_WITH_RUNTIME_CHECK > +extern PGDLLIMPORT uint64 (*pg_popcount_optimized) (const char *buf, int bytes); Could we combine this with the existing copy above this line? I'm thinking of something like #if defined(TRY_POPCNT_FAST) || defined(USE_SVE_POPCNT_WITH_RUNTIME_CHECK) extern PGDLLIMPORT uint64 (*pg_popcount_optimized) (...) #endif #ifdef TRY_POPCNT_FAST ... > +#ifdef USE_SVE_POPCNT_WITH_RUNTIME_CHECK > +extern uint64 pg_popcount_sve(const char *buf, int bytes); > +extern int check_sve_support(void); > +#endif Are we able to use SVE instructions for pg_popcount32(), pg_popcount64(), and pg_popcount_masked(), too? > +static inline uint64 > +pg_popcount_choose(const char *buf, int bytes) > +{ > + if (check_sve_support()) > + pg_popcount_optimized = pg_popcount_sve; > + else > + pg_popcount_optimized = pg_popcount_slow; > + return pg_popcount_optimized(buf, bytes); > +} > + > +#endif /* USE_SVE_POPCNT_WITH_RUNTIME_CHECK */ Can we put this code in the existing choose_popcount_functions() function in pg_bitutils.c? > +// check if sve supported > +int check_sve_support(void) > +{ > + // Read ID_AA64PFR0_EL1 register > + uint64_t pfr0; > + __asm__ __volatile__( > + "mrs %0, ID_AA64PFR0_EL1" > + : "=r" (pfr0)); > + > + // SVE bits are 32-35 > + return (pfr0 >> 32) & 0xf; > +} Is this based on some reference code from a manual that we could cite here? Or better yet, is it possible to do this without inline assembly (e.g., with another intrinsic function)? > +/* > + * pg_popcount_sve > + * Returns the number of 1-bits in buf > + */ > +uint64 > +pg_popcount_sve(const char *buf, int bytes) I think this function could benefit from some additional comments to explain what is happening at each step. -- nathan
pgsql-hackers by date: