Re: Improving spin-lock implementation on ARM. - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Improving spin-lock implementation on ARM.
Date
Msg-id CAPpHfdsKRz7=yUoyrM5Dw5QVgN3NXzoF-mgeWqbJiv6NiRNSZw@mail.gmail.com
Whole thread Raw
In response to Re: Improving spin-lock implementation on ARM.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Improving spin-lock implementation on ARM.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, Nov 27, 2020 at 1:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Krunal Bauskar <krunalbauskar@gmail.com> writes:
> > On Thu, 26 Nov 2020 at 10:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Also, exactly what hardware/software platform were these curves
> >> obtained on?
>
> > Hardware: ARM Kunpeng 920 BareMetal Server 2.6 GHz. 64 cores (56 cores for
> > server and 8 for client) [2 numa nodes]
> > Storage: 3.2 TB NVMe SSD
> > OS: CentOS Linux release 7.6
> > PGSQL: baseline = Release Tag 13.1
>
> Hmm, might not be the sort of hardware ordinary mortals can get their
> hands on.  What's likely to be far more common ARM64 hardware in the
> near future is Apple's new gear.  So I thought I'd try this on the new
> M1 mini I just got.
>
> ... and, after retrieving my jaw from the floor, I present the
> attached.  Apple's chips evidently like this style of spinlock a LOT
> better.  The difference is so remarkable that I wonder if I made a
> mistake somewhere.  Can anyone else replicate these results?

Results look very surprising to me.  I didn't expect there would be
any very busy spin-lock when the number of clients is as low as 4.
Especially in read-only pgbench.

I don't have an M1 at hand.  Could you do some profiling to identify
the source of such a huge difference.

------
Regards,
Alexander Korotkov



pgsql-hackers by date:

Previous
From: "tsunakawa.takay@fujitsu.com"
Date:
Subject: RE: Disable WAL logging to speed up data loading
Next
From: Alexander Korotkov
Date:
Subject: Re: Improving spin-lock implementation on ARM.