Re: [PATCH] SVE popcount support - Mailing list pgsql-hackers

From Chiranmoy.Bhattacharya@fujitsu.com
Subject Re: [PATCH] SVE popcount support
Msg-id OSBPR01MB2664639C190F433EFFF65ED397D02@OSBPR01MB2664.jpnprd01.prod.outlook.com
In response to Re: [PATCH] SVE popcount support  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
On Wed, Mar 12, 2025 at 02:41:18AM +0000, nathandbossart@gmail.com wrote:

> v5-no-sve is the result of using a function pointer, but pointing to the
> "slow" versions instead of the SVE version.  v5-sve is the result of the
> latest patch in this thread on a machine with SVE support, and v5-4reg is
> the result of the latest patch in this thread modified to process 4
> register's worth of data at a time.

Nice. I wonder why I did not observe any performance gain with the 4reg
version. Did you modify the 4reg code?

One possible explanation is that you used Graviton4-based instances,
whereas I used Graviton3 instances.
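
For anyone following along, here is a rough scalar analogue of what
"processing 4 registers' worth of data at a time" means. It is only a
sketch and not the code from the patch: popcount_unrolled4 and load64 are
hypothetical names, and it assumes a GCC- or Clang-compatible compiler for
__builtin_popcountll. The point is the four independent accumulators, which
let the CPU overlap the counts instead of serializing them on one sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load 8 bytes without assuming the buffer is aligned. */
static inline uint64_t
load64(const unsigned char *p)
{
	uint64_t	w;

	memcpy(&w, p, sizeof(w));
	return w;
}

/*
 * Hypothetical illustration: count set bits in buf[0..len), consuming four
 * 64-bit words (32 bytes) per loop iteration with separate accumulators.
 */
static uint64_t
popcount_unrolled4(const unsigned char *buf, size_t len)
{
	uint64_t	c0 = 0, c1 = 0, c2 = 0, c3 = 0;
	uint64_t	count;

	assert(buf != NULL || len == 0);

	while (len >= 32)
	{
		c0 += (uint64_t) __builtin_popcountll(load64(buf));
		c1 += (uint64_t) __builtin_popcountll(load64(buf + 8));
		c2 += (uint64_t) __builtin_popcountll(load64(buf + 16));
		c3 += (uint64_t) __builtin_popcountll(load64(buf + 24));
		buf += 32;
		len -= 32;
	}

	count = c0 + c1 + c2 + c3;

	/* Handle the remaining 0..31 bytes one at a time. */
	while (len-- > 0)
		count += (uint64_t) __builtin_popcount(*buf++);

	return count;
}
```

Whether this kind of unrolling helps depends on how many popcount operations
the core can issue per cycle, which could differ between CPU generations.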

> For the latter point, I think we should consider trying to add a separate
> Neon implementation that we use as a fallback for machines that don't have
> SVE.  My understanding is that Neon is virtually universally supported on
> 64-bit Arm gear, so that will not only help offset the function pointer
> overhead but may even improve performance for a much wider set of machines.

I have added a Neon implementation in the latest patch.
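
To give an idea of the general shape of a Neon popcount, here is a minimal
sketch. It is not the code from the patch: popcount_neon_sketch and
popcount_scalar are hypothetical names, and the HAVE_NEON_SKETCH guard is an
assumption made so the example still compiles on non-Arm machines by falling
back to the scalar loop. It uses the AArch64 intrinsics vcntq_u8 (per-byte
bit count) and vaddlvq_u8 (widening sum across the 16 lanes).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON_SKETCH 1
#endif

/* Portable reference: clear the lowest set bit until the byte is zero. */
static uint64_t
popcount_scalar(const uint8_t *buf, size_t len)
{
	uint64_t	count = 0;

	while (len-- > 0)
	{
		uint8_t		b = *buf++;

		while (b)
		{
			b &= (uint8_t) (b - 1);
			count++;
		}
	}
	return count;
}

/*
 * Hypothetical illustration: count set bits 16 bytes at a time with Neon
 * when available, then finish the tail with the scalar loop.
 */
static uint64_t
popcount_neon_sketch(const uint8_t *buf, size_t len)
{
	uint64_t	count = 0;

	assert(buf != NULL || len == 0);

#ifdef HAVE_NEON_SKETCH
	while (len >= 16)
	{
		uint8x16_t	v = vld1q_u8(buf);

		/* Each lane holds 0..8, so the 16-lane sum is at most 128. */
		count += vaddlvq_u8(vcntq_u8(v));
		buf += 16;
		len -= 16;
	}
#endif

	/* Tail bytes, or the whole buffer when Neon is not available. */
	return count + popcount_scalar(buf, len);
}
```

A production version would likely unroll the loop and accumulate in vector
registers before reducing, rather than doing a horizontal add per iteration.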

Here are the numbers for drive_popcount(1000000, 1024) on m7g.8xlarge:
Scalar - 692 ms
Neon - 298 ms
SVE - 112 ms

-Chiranmoy
