Re: refactor architecture-specific popcount code - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: refactor architecture-specific popcount code
Date
Msg-id aX0jn2bo-xDaSlH5@nathan
In response to Re: refactor architecture-specific popcount code  (John Naylor <johncnaylorls@gmail.com>)
List pgsql-hackers
On Fri, Jan 30, 2026 at 03:22:45PM +0700, John Naylor wrote:
> 0001 - I'm pretty sure this is comparable to HEAD if the optimized
> function is pg_popcount_sse42(). Has the AVX512 version been tested
> with 8-byte inputs? That seems to have a lot of pre- and
> post-processing involved. The inline wrapper only bypasses for 7 or
> less bytes.

Here [0] is the latest perf data I see for the AVX-512 popcount patch,
although that's comparing to v16, which IIRC lacks a few other inlining
tricks.  There's a chance the SSE4.2 version is faster at that particular
length.  I'm not sure we need to worry about that, but I can do a bit of
testing if you'd like.
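
For readers following along, here is a minimal sketch of the kind of inline wrapper being discussed: inputs shorter than 8 bytes are counted inline, and everything else goes through a function pointer to the optimized routine. All names here (sketch_popcount, sketch_popcount_impl, and so on) are hypothetical, and the stand-in "optimized" routine is just a byte loop, so this illustrates only the dispatch shape, not the actual patch. It also shows why an 8-byte input is the first length that pays the full call overhead.

```c
#include <stddef.h>
#include <stdint.h>

/* Count set bits in one byte by clearing the lowest set bit each iteration. */
static inline unsigned
sketch_byte_popcount(unsigned char b)
{
	unsigned	n = 0;

	while (b != 0)
	{
		b = (unsigned char) (b & (b - 1));
		n++;
	}
	return n;
}

/* Hypothetical stand-in for the optimized out-of-line routine. */
static uint64_t
sketch_popcount_optimized(const char *buf, size_t bytes)
{
	uint64_t	popcnt = 0;

	while (bytes--)
		popcnt += sketch_byte_popcount((unsigned char) *buf++);
	return popcnt;
}

/* Hypothetical pointer to whichever implementation was selected at runtime. */
static uint64_t (*sketch_popcount_impl) (const char *buf, size_t bytes) =
	sketch_popcount_optimized;

/* Inline wrapper: 7 or fewer bytes never reach the function pointer. */
static inline uint64_t
sketch_popcount(const char *buf, size_t bytes)
{
	if (bytes < 8)
	{
		uint64_t	popcnt = 0;

		while (bytes--)
			popcnt += sketch_byte_popcount((unsigned char) *buf++);
		return popcnt;
	}

	/* 8 bytes and up pay for the indirect call plus any setup inside it. */
	return sketch_popcount_impl(buf, bytes);
}
```

With this shape, moving the threshold is a one-line change, which is what makes the 8-byte case cheap to benchmark either way.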

> 0002
> - I tried running this on x86-64 with alignment sanitizer and no
> alarms went off during "make check", but adding
> pg_attribute_no_sanitize_alignment() would prevent surprises in the
> future.

Done.

> - I imagine that the old SIZEOF_VOID_P check is superfluous now, since
> the whole file is gated by HAVE_X86_64_POPCNTQ.

I think you're right.  There was some concern about this when I was first
adding the SSE4.2-specific pg_popcount() [1], but all the configure-time
checks for HAVE_X86_64_POPCNTQ are restricted to 64-bit x86, so I bet we
could safely assume SIZEOF_VOID_P == 8 in that file.

> - Maybe we can remove the aligned 32-bit path in
> pg_popcount_(masked_)portable(), since that's on-topic for this patch
> and would simplify things further.

IMHO that's a reasonable thing for us to do.
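
As a rough illustration of what a portable routine looks like with only a 64-bit word loop and a byte tail (no separate 32-bit path), here is a self-contained sketch. The names are hypothetical, and it is not the code in the tree: it loads words with memcpy() instead of checking pointer alignment, and it uses a generic bit-twiddling popcount rather than any compiler builtin.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Classic SWAR popcount of a 64-bit word; no special instructions needed. */
static inline uint64_t
sketch_popcount64(uint64_t x)
{
	x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
	x = (x & UINT64_C(0x3333333333333333)) +
		((x >> 2) & UINT64_C(0x3333333333333333));
	x = (x + (x >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
	return (x * UINT64_C(0x0101010101010101)) >> 56;
}

/* Hypothetical portable routine: 8-byte words first, then leftover bytes. */
static uint64_t
sketch_popcount_portable(const char *buf, size_t bytes)
{
	uint64_t	popcnt = 0;

	while (bytes >= sizeof(uint64_t))
	{
		uint64_t	word;

		/* memcpy() keeps the load well-defined for any alignment. */
		memcpy(&word, buf, sizeof(word));
		popcnt += sketch_popcount64(word);
		buf += sizeof(word);
		bytes -= sizeof(word);
	}

	/* At most 7 bytes remain. */
	while (bytes--)
		popcnt += sketch_popcount64((unsigned char) *buf++);

	return popcnt;
}
```

A masked variant would follow the same two loops, applying the mask to each word and each tail byte before counting.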

[0] https://postgr.es/m/20240404171828.GA3866970%40nathanxps13
[1] https://postgr.es/m/CAApHDvojPyh6dLKooqjXSZE%3D0Ed590Lq1BxF7WQ9knSggyuJEA%40mail.gmail.com

-- 
nathan

