On Mon, Nov 06, 2023 at 07:15:01PM -0800, Noah Misch wrote:
> On Mon, Nov 06, 2023 at 09:52:58PM -0500, Tom Lane wrote:
>> Nathan Bossart <nathandbossart@gmail.com> writes:
>> > Like I said, I don't have any proposals yet, but assuming we do want to
>> > support newer intrinsics, either open-coded or via auto-vectorization, I
>> > suspect we'll need to gather consensus for a new policy/strategy.
>>
>> Yeah. The function-pointer solution kind of sucks, because for the
>> sort of operation we're considering here, adding a call and return
>> is probably order-of-100% overhead. Worse, it adds similar overhead
>> for everyone who doesn't get the benefit of the optimization.
>
> The glibc/gcc "ifunc" mechanism was designed to solve this problem of choosing
> a function implementation based on the runtime CPU, without incurring function
> pointer overhead. I would not attempt to use AVX512 on non-glibc systems, and
> I would use ifunc to select the desired popcount implementation on glibc:
> https://gcc.gnu.org/onlinedocs/gcc-4.8.5/gcc/Function-Attributes.html

Thanks, that seems promising for the function pointer cases. I'll plan on
trying to convert one of the existing ones to use it. BTW it looks like
LLVM has something similar [0].
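
For illustration, here is a minimal sketch of what ifunc-based dispatch could
look like for a popcount routine. It assumes GCC (or clang) on x86-64 Linux
with glibc. All names here (demo_popcount64, popcount64_portable,
popcount64_hw, resolve_popcount64) are hypothetical and not taken from the
PostgreSQL tree, and it selects plain POPCNT rather than AVX512 to keep the
example short.

```c
#include <stdint.h>

/* Portable fallback: clears the lowest set bit until none remain. */
static int
popcount64_portable(uint64_t word)
{
	int			count = 0;

	while (word)
	{
		word &= word - 1;
		count++;
	}
	return count;
}

/* Variant compiled with POPCNT enabled for this function only. */
__attribute__((target("popcnt")))
static int
popcount64_hw(uint64_t word)
{
	return __builtin_popcountll(word);
}

/*
 * Resolver: the dynamic loader runs this once, at symbol resolution time,
 * and binds demo_popcount64 directly to whichever function it returns.
 * __builtin_cpu_init() is needed because the resolver may run before any
 * constructors have initialized the CPU feature data.
 */
static int	(*resolve_popcount64(void)) (uint64_t)
{
	__builtin_cpu_init();
	if (__builtin_cpu_supports("popcnt"))
		return popcount64_hw;
	return popcount64_portable;
}

/* Hypothetical public entry point; callers just call it like any function. */
int			demo_popcount64(uint64_t word)
			__attribute__((ifunc("resolve_popcount64")));
```

Callers pay only for an ordinary call through the PLT rather than an explicit
function pointer load, but the call itself is still not inlined.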

IIUC this unfortunately wouldn't help for cases where we wanted to keep
stuff inlined, such as is_valid_ascii() and the functions in pg_lfind.h,
unless we applied it to the calling functions, but that doesn't sound
particularly maintainable.

[0] https://llvm.org/docs/LangRef.html#ifuncs

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com