On Wed, Apr 30, 2025 at 4:53 AM Salvatore Dipietro
<dipietro.salvatore@gmail.com> wrote:
> we would like to propose the removal of the Instruction
> Synchronization Barrier (isb) for aarch64 architectures. Based on our
> testing on Graviton instances (m7g.16xlarge), we can see on average
> over multiple iterations up to 12% better performance using PGBench
> select-only and up to 9% with Sysbench oltp_read_only workloads. On
> Graviton4 (m8g.24xlarge) results are up to 8% better using PGBench
> select-only and up to 6% with Sysbench oltp_read_only workloads.
> We have also tested it putting more pressure on the spin_delay
> function, enabling pg_stat_statements.track_planning with PGBench
> read-only [0] and, on average, the patch shows up to 27% better
> performance on m6g.16xlarge and up to 37% on m7g.16xlarge.

Hmm. This was added only 3 years ago, supposedly because it made
performance better:

commit a82a5eee314df52f3183cedc0ecbcac7369243b1
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Wed Apr 6 18:57:57 2022 -0400

    Use ISB as a spin-delay instruction on ARM64.

    This seems beneficial on high-core-count machines, and not harmful
    on lesser hardware.  However, older ARM32 gear doesn't have this
    instruction, so restrict the patch to ARM64.

    Geoffrey Blake

    Discussion: https://postgr.es/m/78338F29-9D7F-4DC8-BD71-E9674CE71425@amazon.com

I think you should make some kind of argument about why the previous
conclusion was wrong, or why something's changed between then and now.
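
For reference, the delay in question has roughly the shape below. This
is a sketch from memory, not copied from the tree: the names
spin_delay_sketch and spin_until_free are made up for illustration, and
the retry loop is a simplified stand-in for the real spinlock code,
which also does backoff and sleeping.

```c
/*
 * Sketch only. spin_delay_sketch is a hypothetical name; the assumption
 * is that the ARM64 delay amounts to a single ISB instruction.
 */
static inline void
spin_delay_sketch(void)
{
#if defined(__aarch64__)
	/* ISB flushes the pipeline, which acts as a short pause here. */
	__asm__ __volatile__(" isb; \n");
#else
	/* Other architectures: deliberately a no-op in this sketch. */
#endif
}

/*
 * Hypothetical helper: spin while *lock is nonzero, at most max_spins
 * times, and return how many delays were executed.
 */
static inline int
spin_until_free(volatile int *lock, int max_spins)
{
	int			spins = 0;

	while (*lock != 0 && spins < max_spins)
	{
		spin_delay_sketch();
		spins++;
	}
	return spins;
}
```

Dropping the ISB would turn the ARM64 branch into a plain busy loop, so
the question is really whether that pause still pays for itself on
current hardware.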
--
Robert Haas
EDB: http://www.enterprisedb.com