RE: SIMD optimization for list_sort - Mailing list pgsql-hackers

From R, Rakshit
Subject RE: SIMD optimization for list_sort
Date
Msg-id CO1PR11MB5060E34D88194334B7F708E9EDCC2@CO1PR11MB5060.namprd11.prod.outlook.com
Whole thread Raw
In response to RE: SIMD optimization for list_sort  ("Devulapalli, Raghuveer" <raghuveer.devulapalli@intel.com>)
Responses Re: SIMD optimization for list_sort
List pgsql-hackers
Hi All,

Thank you so much for the feedback. 

> I don't think "another extension might use it someday" makes a very strong case, 
> particularly for something that requires a new dependency.

The x86-simdsort library is an optional dependency in Postgres. Also the new list sort implementation which uses the
x86-simdsortlibrary does not impact any of the existing workflows in Postgres. We have at least one use case where
Pgvectorgets a gain of 7 - 10 % in HNSW index build time using the SIMD sort implementation. Hence we feel that this
patchcould be up streamed further. Open to further discussion on the same.
 


> Note that our current implemention is highly optimized for low-cardinality inputs.
> This is needed for aggregate queries. I found this write-up of a 
> couple scalar and vectorized sorts, and they show this library doing 
> very poorly on very-low cardinality inputs. I would look into that 
> before trying to get something in shape to share.
> 
> https://github.com/Voultapher/sort-research-
> rs/blob/main/writeup/intel_avx512/text.md

We ran our extension to stress list sort with low cardinality inputs. For eg, for an array of size 100k having repeated
valuesin the range 1-10 we still see a gain of around 20% in throughput.
 
We will collect more data for low cardinality inputs and with AVX2 too.

Thanks and regards
Rakshit Rajeev

pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: ReplicationSlotRelease() crashes when the instance is in the single user mode
Next
From: Tom Lane
Date:
Subject: Re: Remove extra Sort node above a btree-compatible Index Scan