Re: SIMD optimization for list_sort - Mailing list pgsql-hackers

From John Naylor
Subject Re: SIMD optimization for list_sort
Date
Msg-id CANWCAZbppFDkjg_zzjW=2JFyecOFVO+1ACy3TV_UDwEF-ygL4g@mail.gmail.com
Whole thread Raw
In response to SIMD optimization for list_sort  ("Giacchino, Luca" <luca.giacchino@intel.com>)
Responses RE: SIMD optimization for list_sort
List pgsql-hackers
On Tue, Feb 18, 2025 at 1:49 PM R, Rakshit <rakshit.r@intel.com> wrote:
>
> Hi All,
>
> Thank you very much for your feedback! We investigated the recommendations as we developed the current patch. Please
referto the comments below. 
>
> > I'd suggest targeting pg_qsort() directly, instead of list_sort(). list_sort() is not used in very performance
criticalparts. 
> Using x86-simd-sort would require a new pg_qsort variant taking an extractor function. The new API would still have
tobubble up to functions like list_sort, which internally call pg_qsort. We targeted list_sort first, as we were aware
ofat least one use case that benefits from SIMD sorting (pgvector). As we target more use cases (such as tuple sort),
wemay need to move the SIMD-enabled implementation to a common location, and a new variant of pg_qsort may be a good
candidate.
>
> > I'd suggest proposing this to the pgvector project directly, since pgvector would immediately benefit.
> Yes, we’ll share the performance gains on HNSW index build time with the pgvector community. However, other users of
list_sort(e.g., other extensions) may benefit from the simd-sort version of list_sort as well. As it is not a
pgvector-specificoptimization, we propose it for PostgreSQL. 

I don't think "another extension might use it someday" makes a very
strong case, particularly for something that requires a new
dependency.

> > If you could use this to speed up tuple sorting, that would be much more interesting for PostgreSQL itself.
> Enabling x86-simd-sort for tuple sorting is in development. We observed promising results, but there is still work to
do(e.g., to ensure there are no regressions in different conditions). We are planning to share another patch for tuple
sortbuilding on this one. 

Note that our current implemention is highly optimized for
low-cardinality inputs. This is needed for aggregate queries. I found
this write-up of a couple scalar and vectorized sorts, and they show
this library doing very poorly on very-low cardinality inputs. I would
look into that before trying to get something in shape to share.

https://github.com/Voultapher/sort-research-rs/blob/main/writeup/intel_avx512/text.md

There is also the question of hardware support. It seems AVX-512 is
not supported well on client side, where most developers work. And
availability of any flavor is not guaranteed on server either.
Something to keep in mind.

--
John Naylor
Amazon Web Services



pgsql-hackers by date:

Previous
From: Dagfinn Ilmari Mannsåker
Date:
Subject: Re: test_escape: invalid option -- 'c'
Next
From: Laurenz Albe
Date:
Subject: Re: Bug in nbtree SAOP scans with non-required arrays, truncated high key