Re: Popcount optimization using AVX512 - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Popcount optimization using AVX512
Msg-id 20240406194101.GA533391@nathanxps13
In response to Re: Popcount optimization using AVX512  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Sat, Apr 06, 2024 at 02:51:39PM +1300, David Rowley wrote:
> On Sat, 6 Apr 2024 at 14:17, Nathan Bossart <nathandbossart@gmail.com> wrote:
>> On Sat, Apr 06, 2024 at 12:08:14PM +1300, David Rowley wrote:
>> > Won't Valgrind complain about this?
>> >
>> > +pg_popcount_avx512(const char *buf, int bytes)
>> >
>> > + buf = (const char *) TYPEALIGN_DOWN(sizeof(__m512i), buf);
>> >
>> > + val = _mm512_maskz_loadu_epi8(mask, (const __m512i *) buf);
>>
>> I haven't been able to generate any complaints, at least with some simple
>> tests.  But I see your point.  If this did cause such complaints, ISTM we'd
>> just want to add it to the suppression file.  Otherwise, I think we'd have
>> to go back to the non-maskz approach (which I really wanted to avoid
>> because of the weird function overhead juggling) or find another way to do
>> a partial load into an __m512i.
> 
> [1] seems to think it's ok.  If this is true then the following
> shouldn't segfault:
> 
> The following seems to run without any issue and if I change the mask
> to 1 it crashes, as you'd expect.

Cool.
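
For readers following along, here is a rough scalar sketch of the partial-load idea discussed above: round the pointer down to a 64-byte boundary, then build a mask whose bits select only the bytes that really belong to the buffer. It is not the code from the patch. ALIGN_DOWN, block_mask and popcount_masked_block are hypothetical names for illustration, and the loop only emulates what a masked load such as _mm512_maskz_loadu_epi8 would do in hardware, so it runs without AVX-512. It assumes the buffer fits inside a single 64-byte block.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK 64				/* stands in for sizeof(__m512i) */

/* Round an address down to a multiple of BLOCK (same idea as TYPEALIGN_DOWN). */
#define ALIGN_DOWN(p) ((uintptr_t) (p) & ~((uintptr_t) (BLOCK - 1)))

/*
 * Hypothetical helper: bit i of the result is set iff byte i of the aligned
 * 64-byte block containing buf lies inside [buf, buf + bytes).
 * Assumes bytes > 0 and that the buffer does not cross the block boundary.
 */
static uint64_t
block_mask(const char *buf, int bytes)
{
	unsigned	start = (unsigned) ((uintptr_t) buf - ALIGN_DOWN(buf));
	unsigned	end = start + (unsigned) bytes;
	uint64_t	mask;

	assert(bytes > 0 && end <= BLOCK);

	mask = ~UINT64_C(0) << start;	/* clear bits below the buffer */
	if (end < BLOCK)
		mask &= ~UINT64_C(0) >> (BLOCK - end);	/* clear bits above it */
	return mask;
}

/*
 * Scalar emulation of "masked load, then popcount": only bytes whose mask
 * bit is set are ever read, so nothing outside the buffer is touched.
 */
static int
popcount_masked_block(const char *buf, int bytes)
{
	unsigned	start = (unsigned) ((uintptr_t) buf - ALIGN_DOWN(buf));
	uint64_t	mask = block_mask(buf, bytes);
	int			count = 0;

	for (unsigned i = 0; i < BLOCK; i++)
	{
		if (mask & (UINT64_C(1) << i))
		{
			/* byte i of the aligned block is byte (i - start) of buf */
			unsigned char b = (unsigned char) buf[i - start];

			while (b)
			{
				count += b & 1;
				b >>= 1;
			}
		}
	}
	return count;
}
```

For example, a 5-byte buffer starting 3 bytes past an aligned address gets the mask 0xF8 (bits 3 through 7), so only those five bytes contribute to the count.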

Here is what I have staged for commit, which I intend to do shortly.  At
some point, I'd like to revisit converting TRY_POPCNT_FAST to a
configure-time check and maybe even moving the "fast" and "slow"
implementations to their own files, but since that's mostly for code
neatness and we are rapidly approaching the v17 deadline, I'm content to
leave that for v18.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment
