I wrote:
> A more useful test would be to directly experiment with contended
> spinlocks. As I recall, we had some test cases lying about when
> we were fooling with the spin delay stuff on Intel --- maybe
> resurrecting one of those would be useful?
The last really significant performance testing we did in this area
seems to have been in this thread:
https://www.postgresql.org/message-id/flat/CA%2BTgmoZvATZV%2BeLh3U35jaNnwwzLL5ewUU_-t0X%3DT0Qwas%2BZdA%40mail.gmail.com
A relevant point from that is Haas' comment:
> I think optimizing spinlocks for machines with only a few CPUs is
> probably pointless. Based on what I've seen so far, spinlock
> contention even at 16 CPUs is negligible pretty much no matter what
> you do. Whether your implementation is fast or slow isn't going to
> matter, because even an inefficient implementation will account for
> only a negligible percentage of the total CPU time - much less than 1%
> - as opposed to a 64-core machine, where it's not that hard to find
> cases where spin-waits consume the *majority* of available CPU time
> (recall previous discussion of lseek).
So I wonder whether this patch is getting ahead of the game. It does
seem that ARM systems with a couple dozen cores exist, but are they
common enough to optimize for yet? Can we even find *one* to test on
and verify that this is a win and not a loss? (Also, seeing that
there are so many different ARM vendors, results from just one
chipset might not be too trustworthy ...)
regards, tom lane