Re: Popcount optimization using AVX512 - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Popcount optimization using AVX512
Date
Msg-id 20240401221512.GA2511488@nathanxps13
Whole thread Raw
In response to Re: Popcount optimization using AVX512  (Ants Aasma <ants.aasma@cybertec.at>)
List pgsql-hackers
On Tue, Apr 02, 2024 at 01:09:57AM +0300, Ants Aasma wrote:
> On Tue, 2 Apr 2024 at 00:31, Nathan Bossart <nathandbossart@gmail.com> wrote:
>> On Tue, Apr 02, 2024 at 12:11:59AM +0300, Ants Aasma wrote:
>> > What about using the masking capabilities of AVX-512 to handle the
>> > tail in the same code path? Masked out portions of a load instruction
>> > will not generate an exception. To allow byte level granularity
>> > masking, -mavx512bw is needed. Based on wikipedia this will only
>> > disable this fast path on Knights Mill (Xeon Phi), in all other cases
>> > VPOPCNTQ implies availability of BW.
>>
>> Sounds promising.  IMHO we should really be sure that these kinds of loads
>> won't generate segfaults and the like due to the masked-out portions.  I
>> searched around a little bit but haven't found anything that seemed
>> definitive.
> 
> Interestingly the Intel software developer manual is not exactly
> crystal clear on how memory faults with masks work, but volume 2A
> chapter 2.8 [1] does specify that MOVDQU8 is of exception class E4.nb
> that supports memory fault suppression on page fault.

Perhaps Paul or Akash could chime in here...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: Popcount optimization using AVX512
Next
From: Peter Geoghegan
Date:
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan