Thread: pgsql: Optimize popcount functions with ARM SVE intrinsics.
Optimize popcount functions with ARM SVE intrinsics.

This commit introduces SVE implementations of pg_popcount{32,64}.
Unlike the Neon versions, we need an additional configure-time
check to determine if the compiler supports SVE intrinsics, and we
need a runtime check to determine if the current CPU supports SVE
instructions. Our testing showed that the SVE implementations are
much faster for larger inputs and are comparable to the status quo
for smaller inputs.

Author: "Devanga.Susmitha@fujitsu.com" <Devanga.Susmitha@fujitsu.com>
Co-authored-by: "Chiranmoy.Bhattacharya@fujitsu.com" <Chiranmoy.Bhattacharya@fujitsu.com>
Co-authored-by: "Malladi, Rama" <ramamalladi@hotmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/010101936e4aaa70-b474ab9e-b9ce-474d-a3ba-a3dc223d295c-000000%40us-west-2.amazonses.com
Discussion: https://postgr.es/m/OSZPR01MB84990A9A02A3515C6E85A65B8B2A2%40OSZPR01MB8499.jpnprd01.prod.outlook.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/519338ace410d9b1ffb13176b8802b0307ff0531

Modified Files
--------------
config/c-compiler.m4           |  52 ++++++++
configure                      |  71 +++++++++++
configure.ac                   |   9 ++
meson.build                    |  48 +++++++
src/include/pg_config.h.in     |   3 +
src/include/port/pg_bitutils.h |  17 +++
src/port/pg_popcount_aarch64.c | 281 ++++++++++++++++++++++++++++++++++++++++-
7 files changed, 475 insertions(+), 6 deletions(-)
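
For readers who want a picture of the two checks the commit message mentions, here is a minimal sketch of how a build-time guard and a runtime CPU check might combine into a function-pointer dispatch. It is not the code from src/port/pg_popcount_aarch64.c. Every name starting with sketch_ or SKETCH_ is hypothetical: SKETCH_USE_SVE stands in for whatever macro the configure-time compiler check defines, the Linux getauxval()/HWCAP_SVE probe is an assumption about the platform, and the byte-at-a-time SVE loop is a deliberately simple illustration rather than a tuned kernel.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * SKETCH_USE_SVE is a hypothetical macro. It plays the role of the
 * configure-time check: the build defines it only when the compiler is
 * known to accept SVE intrinsics. Without it, only the fallback is built.
 */
#ifdef SKETCH_USE_SVE
#include <arm_sve.h>
#include <sys/auxv.h>		/* getauxval(); Linux is assumed here */
#include <asm/hwcap.h>		/* HWCAP_SVE */

/* Runtime check: does the CPU we are running on advertise SVE? */
static int
sketch_sve_available(void)
{
	return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
}

/* Count set bits one SVE vector of bytes at a time. */
__attribute__((target("arch=armv8-a+sve")))
static uint64_t
sketch_popcount_sve(const char *buf, size_t bytes)
{
	uint64_t	total = 0;

	for (uint64_t i = 0; i < bytes; i += svcntb())
	{
		/* Predicate covers only the bytes that remain in the buffer. */
		svbool_t	pg = svwhilelt_b8(i, (uint64_t) bytes);
		svuint8_t	v = svld1_u8(pg, (const uint8_t *) buf + i);

		/* Per-byte bit counts, then a horizontal sum of active lanes. */
		total += svaddv_u8(pg, svcnt_u8_x(pg, v));
	}
	return total;
}
#endif

/* Portable fallback: clear the lowest set bit until none remain. */
static uint64_t
sketch_popcount_portable(const char *buf, size_t bytes)
{
	uint64_t	total = 0;

	for (size_t i = 0; i < bytes; i++)
	{
		unsigned char b = (unsigned char) buf[i];

		while (b)
		{
			b &= (unsigned char) (b - 1);
			total++;
		}
	}
	return total;
}

static uint64_t sketch_popcount_choose(const char *buf, size_t bytes);

/* Starts at the chooser; the first call swaps in a real implementation. */
static uint64_t (*sketch_popcount) (const char *, size_t) = sketch_popcount_choose;

static uint64_t
sketch_popcount_choose(const char *buf, size_t bytes)
{
	sketch_popcount = sketch_popcount_portable;
#ifdef SKETCH_USE_SVE
	if (sketch_sve_available())
		sketch_popcount = sketch_popcount_sve;
#endif
	return sketch_popcount(buf, bytes);
}
```

Callers would always go through sketch_popcount(buf, bytes). The first call pays for the CPU probe once, and later calls jump straight to whichever implementation was selected.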