On Fri, Jul 29, 2022 at 4:34 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> * I briefly looked into seeing whether auto-vectorization was viable and
> concluded it was not for these loops.
>
> * I borrowed USE_SSE2 from one of John Naylor's patches [0]. I'm not sure
> whether this is committable,
I'll be the first to say it's not committable and needs some thought. Since there are several recently proposed patches that take advantage of SSE2, it seems time for me to open a new thread and get that prerequisite settled. I'll do that next week.
> so I would welcome thoughts on the proper
> form. Given the comment says that SSE2 is supported by all x86-64
> hardware, I'm not seeing why we need the SSE 4.2 checks. Is it not
> enough to check for __x86_64__ and _M_AMD64?
That's enough for emitting instructions that the target CPU can run, but it says nothing (I think) about whether the host compiler understands the intrinsics and associated headers. The architecture is old enough that perhaps no compiler in the buildfarm targeting AMD64 fails to understand SSE2 intrinsics, but I haven't looked into it. The SSE 4.2 intrinsics check is not necessary, but it was sufficient and already present, so I borrowed it for the PoC.
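
To make the distinction concrete, here is a rough sketch (not taken from either patch) of what an architecture-only guard could look like. The USE_SSE2 definition shown here and the helper buf_contains_byte() are purely illustrative assumptions. The guard only tells us the target supports SSE2; it still assumes, without testing, that the compiler provides <emmintrin.h>.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical guard: checks only the target architecture.  It does not
 * verify that the compiler actually ships <emmintrin.h>; a configure-time
 * probe would be needed for that.
 */
#if defined(__x86_64__) || defined(_M_AMD64)
#include <emmintrin.h>
#define USE_SSE2
#endif

/* Illustrative helper (an assumption, not from the patch). */
static bool
buf_contains_byte(const uint8_t *buf, size_t len, uint8_t key)
{
	size_t		i = 0;

#ifdef USE_SSE2
	/* Broadcast the key into all 16 byte lanes. */
	const __m128i keys = _mm_set1_epi8((char) key);

	/* Compare 16 bytes per iteration using unaligned loads. */
	for (; i + sizeof(__m128i) <= len; i += sizeof(__m128i))
	{
		__m128i		chunk = _mm_loadu_si128((const __m128i *) &buf[i]);

		/* Any set bit in the mask means at least one lane matched. */
		if (_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, keys)) != 0)
			return true;
	}
#endif

	/* Scalar path: handles the tail, or everything on non-x86-64 targets. */
	for (; i < len; i++)
	{
		if (buf[i] == key)
			return true;
	}
	return false;
}
```

On a non-x86-64 target the same function compiles down to the scalar loop alone, so callers would not need their own #ifdefs.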
--
John Naylor
EDB: http://www.enterprisedb.com