Remove Instruction Synchronization Barrier in spin_delay() for ARM64 architecture - Mailing list pgsql-hackers

From: Salvatore Dipietro
Subject: Remove Instruction Synchronization Barrier in spin_delay() for ARM64 architecture
Msg-id: CAGnuAhW+b3smF7jGP7ijhdoUc3w1tw462OqTyEpwJtpmjaAtwA@mail.gmail.com
List: pgsql-hackers

Hi,
We would like to propose removing the Instruction
Synchronization Barrier (isb) in spin_delay() on aarch64. In our
testing on Graviton instances (m7g.16xlarge), averaged over multiple
iterations, we see up to 12% better performance with pgbench
select-only and up to 9% with Sysbench oltp_read_only workloads. On
Graviton4 (m8g.24xlarge), results are up to 8% better with pgbench
select-only and up to 6% with Sysbench oltp_read_only workloads.
We also tested with more pressure on the spin_delay() function by
enabling pg_stat_statements.track_planning with pgbench
read-only [0]. On average, the patch shows up to 27% better
performance on m6g.16xlarge and up to 37% on m7g.16xlarge.

Testing environment:
- PostgreSQL version: 17.2
- Operating System: Ubuntu 22.04
- Test Platform: AWS Graviton instances (m6g.16xlarge, m7g.16xlarge
and m8g.24xlarge)

Our pgbench select-only results without
pg_stat_statements.track_planning:
```
# Load DB on m7g.16xlarge
$ pgbench -i --fillfactor=90 --scale=5644 --host=172.31.32.85 \
    --username=postgres pgtest

# Without patch
$ pgbench --host 172.31.32.85 --username=postgres --protocol=prepared \
    -P 10 -b select-only --time=600 --client=256 --jobs=96 pgtest
...
    "transaction type: <builtin: select only>",
    "scaling factor: 5644",
    "query mode: prepared",
    "number of clients: 256",
    "number of threads: 96",
    "duration: 600 s",
    "number of transactions actually processed: 359864937",
    "latency average = 0.420 ms",
    "latency stddev = 1.755 ms",
    "tps = 599770.727316 (including connections establishing)",
    "tps = 599826.788919 (excluding connections establishing)"


# With patch
$ pgbench --host 172.31.32.85 --username=postgres --protocol=prepared \
    -P 10 -b select-only --time=600 --client=256 --jobs=96 pgtest
...
    "transaction type: <builtin: select only>",
    "scaling factor: 5644",
    "query mode: prepared",
    "number of clients: 256",
    "number of threads: 96",
    "duration: 600 s",
    "number of transactions actually processed: 405891881",
    "latency average = 0.371 ms",
    "latency stddev = 0.569 ms",
    "tps = 676480.900049 (including connections establishing)",
    "tps = 676523.557293 (excluding connections establishing)"
```
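
As a quick sanity check, the relative change between the two runs above can
be computed directly from the reported numbers. This is a minimal sketch in
Python. The helper name pct_change and the variable names are illustrative
only and are not part of the patch or of pgbench:

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Values copied from the pgbench output above
# ("excluding connections establishing" lines).
tps_without_patch = 599826.788919
tps_with_patch = 676523.557293

# Average latency in milliseconds, from the same two runs.
latency_without_patch = 0.420
latency_with_patch = 0.371

tps_gain = pct_change(tps_without_patch, tps_with_patch)
latency_change = pct_change(latency_without_patch, latency_with_patch)

print(f"tps change:     {tps_gain:+.1f}%")
print(f"latency change: {latency_change:+.1f}%")
```

For this single pair of runs the throughput gain comes out at roughly 12.8%
and the average latency drops by roughly 11.7%. This is one run only, not the
multi-iteration average.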

[0] https://www.postgresql.org/message-id/ZxgDEb_VpWyNZKB_%40nathan
