Re: [PATCH] CRC32C optimizations using SVE2 on ARM. - Mailing list pgsql-hackers
| From | Devanga.Susmitha@fujitsu.com |
|---|---|
| Subject | Re: [PATCH] CRC32C optimizations using SVE2 on ARM. |
| Date | |
| Msg-id | OSZPR01MB84994FEF43BA834F706221A78BB0A@OSZPR01MB8499.jpnprd01.prod.outlook.com |
| In response to | Re: [PATCH] CRC32C optimizations using SVE2 on ARM. (John Naylor <johncnaylorls@gmail.com>) |
| Responses | Re: [PATCH] CRC32C optimizations using SVE2 on ARM. |
| List | pgsql-hackers |
> There was already a proposal to use armv8-a+crypto, which is more
> widely available and works on smaller inputs.
Our SVE2 implementation outperforms the armv8-a+crypto proposal in
https://www.postgresql.org/message-id/CANWCAZaKhE%2BRD5KKouUFoxx1EbUNrNhcduM1VQ%3DDkSDadNEFng%40mail.gmail.com
I benchmarked our SVE2 implementation against armv8-a+crypto; SVE2 was 28-39% faster across the buffer sizes tested:
| Buffer size (bytes) | armv8-a+crypto (ms) | armv9-a+SVE2 (ms) | Improvement |
|---|---|---|---|
| 512 | 28.491 | 19.37 | 32.0% faster |
| 1024 | 47.145 | 29.962 | 36.5% faster |
| 2048 | 86.717 | 52.841 | 39.1% faster |
| 4096 | 165.205 | 105.626 | 36.1% faster |
| 8192 | 318.103 | 226.437 | 28.8% faster |
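For readers checking the numbers: the Improvement column is consistent with the relative reduction in elapsed time, (baseline - optimized) / baseline. The sketch below shows that arithmetic together with the equivalent speedup factor. The function names are made up for illustration and are not part of the patch.

```c
/*
 * Illustrative helpers only; the names are hypothetical and do not
 * come from the patch or from PostgreSQL.
 */

/* Percentage of elapsed time saved relative to the baseline. */
static double
time_reduction_pct(double baseline_ms, double optimized_ms)
{
    return 100.0 * (baseline_ms - optimized_ms) / baseline_ms;
}

/* How many times faster the optimized run is (baseline / optimized). */
static double
speedup_factor(double baseline_ms, double optimized_ms)
{
    return baseline_ms / optimized_ms;
}
```

With the 512-byte row, time_reduction_pct(28.491, 19.37) gives about 32.0, which corresponds to a speedup factor of roughly 1.47.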
These buffer sizes are particularly relevant for PostgreSQL workloads:
- 8KB: Default page size (28.8% faster checksumming)
- 4KB: Alternative page size configuration (36.1% faster)
- 512B-2KB: Typical WAL record sizes (32-39% faster)
- 2KB: TOAST chunk size (39% faster)
While armv8-a+crypto has broader current deployment, SVE2 is already available in production cloud infrastructure: AWS Graviton 4, Ampere AmpereOne, and NVIDIA Grace (all released 2023). As ARMv9 adoption continues, these gains become increasingly relevant.
Rather than choosing one approach over the other, perhaps we could implement both with runtime CPU detection? Since we already perform runtime detection for crypto extension availability, adding an additional check for SVE2 introduces no performance degradation on systems without SVE2, while providing significant performance gains (28-39%) on systems that do support it. This would provide optimal performance on capable hardware while maintaining broad compatibility. Please let me know your thoughts.
static pg_crc32c (*pg_comp_crc32c_armv8)(pg_crc32c crc, const void *data, size_t len);
void
pg_comp_crc32c_choose_armv8(void)
{
    if (pg_cpu_has_sve2())
        pg_comp_crc32c_armv8 = pg_comp_crc32c_armv8_sve2;
    else if (pg_cpu_has_crypto())
        pg_comp_crc32c_armv8 = pg_comp_crc32c_armv8_crypto;
    else
        pg_comp_crc32c_armv8 = pg_comp_crc32c_sb8;  /* scalar fallback */
}
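To make the idea concrete, here is a self-contained sketch of how such a dispatch could behave, assuming Linux on AArch64 where getauxval() exposes the HWCAP bits. Everything in it is illustrative: the crc32c_* names are hypothetical, the two "accelerated" kernels are stubs that simply call a portable bitwise CRC-32C, and on other platforms the detection helpers just report no support. It is not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>           /* getauxval() */
#include <asm/hwcap.h>          /* HWCAP_PMULL, HWCAP2_SVE2 */
#endif

typedef uint32_t crc32c_t;
typedef crc32c_t (*crc32c_fn) (crc32c_t crc, const void *data, size_t len);

/* Portable bit-at-a-time CRC-32C (reflected polynomial 0x82F63B78). */
static crc32c_t
crc32c_portable(crc32c_t crc, const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *) data;

    while (len-- > 0)
    {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* ASSUMPTION: stand-ins for the real SVE2 and PMULL kernels. */
static crc32c_t
crc32c_sve2_stub(crc32c_t crc, const void *data, size_t len)
{
    return crc32c_portable(crc, data, len);
}

static crc32c_t
crc32c_pmull_stub(crc32c_t crc, const void *data, size_t len)
{
    return crc32c_portable(crc, data, len);
}

/* Returns nonzero if the kernel reports SVE2; 0 where we cannot ask. */
static int
cpu_has_sve2(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP2_SVE2)
    return (getauxval(AT_HWCAP2) & HWCAP2_SVE2) != 0;
#else
    return 0;
#endif
}

/* Returns nonzero if the kernel reports PMULL; 0 where we cannot ask. */
static int
cpu_has_pmull(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_PMULL)
    return (getauxval(AT_HWCAP) & HWCAP_PMULL) != 0;
#else
    return 0;
#endif
}

static crc32c_t crc32c_choose(crc32c_t crc, const void *data, size_t len);

/* Starts out pointing at the chooser; the first call replaces it. */
static crc32c_fn crc32c_dispatch = crc32c_choose;

/* Probe the CPU once, install the best kernel, then run it. */
static crc32c_t
crc32c_choose(crc32c_t crc, const void *data, size_t len)
{
    if (cpu_has_sve2())
        crc32c_dispatch = crc32c_sve2_stub;
    else if (cpu_has_pmull())
        crc32c_dispatch = crc32c_pmull_stub;
    else
        crc32c_dispatch = crc32c_portable;

    return crc32c_dispatch(crc, data, len);
}
```

Callers always go through crc32c_dispatch, so the feature probes run only on the first call and later calls cost a single indirect jump. On hardware without SVE2 the added cost is one extra branch during that first call.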
Thanks,
Susmitha Devanga.
From: John Naylor <johncnaylorls@gmail.com>
Sent: Friday, December 19, 2025 08:27
To: Susmitha, Devanga <Devanga.Susmitha@fujitsu.com>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; Hajela, Ragesh <Ragesh.Hajela@fujitsu.com>; Bhattacharya, Chiranmoy <Chiranmoy.Bhattacharya@fujitsu.com>
Subject: Re: [PATCH] CRC32C optimizations using SVE2 on ARM.
On Fri, Dec 19, 2025 at 4:20 AM Devanga.Susmitha@fujitsu.com
<Devanga.Susmitha@fujitsu.com> wrote:
> For architecture-specific functions, we use pg_attribute_target("arch=armv9-a+sve2-aes")
There was already a proposal to use armv8-a+crypto, which is more
widely available and works on smaller inputs. Perhaps you'd be
interested in reviewing and testing?
https://www.postgresql.org/message-id/CANWCAZaKhE%2BRD5KKouUFoxx1EbUNrNhcduM1VQ%3DDkSDadNEFng%40mail.gmail.com
> to ensure precise compilation control without modifying global CFLAGS, enabling a clean integration within PostgreSQL’s build system.
I think the reason we continue to use CFLAGS here was that clang
support for target attributes on Arm is fairly recent. It's probably
too soon to reconsider that.
--
John Naylor
Amazon Web Services