On Tue, Jul 26, 2022 at 11:19:06AM -0700, Andres Freund wrote:
> On 2022-07-25 12:04:19 -0700, Nathan Bossart wrote:
>> From the discussion thus far, it seems there is interest in optimizing
>> [sub]xip lookups, so I'd like to spend some time moving it forward. I
>> think the biggest open question is which approach to take. Both the SIMD
>> and hash table approaches seem viable, but I think I prefer the SIMD
>> approach at the moment (see the last paragraph of quoted text for the
>> reasons). What do folks think?
>
> Agreed on all points.

Great! Here is a new patch. A couple of notes:
* I briefly looked into whether auto-vectorization was viable and
  concluded that it is not for these loops.
* I borrowed USE_SSE2 from one of John Naylor's patches [0]. I'm not sure
  whether this is committable, so I would welcome thoughts on the proper
  form. Given that the comment says SSE2 is supported by all x86-64
  hardware, I don't see why we need the SSE 4.2 checks. Is it not
  enough to check for __x86_64__ and _M_AMD64?
* I haven't looked into adding an ARM implementation yet.
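For anyone following along without the patch open, here is a minimal
sketch of the general idea, assuming a plain array of 32-bit values and
an unordered membership test. This is not the code in the patch: the
names sketch_lfind32 and SKETCH_USE_SSE2 are made up for illustration,
and the guard simply tests the two compiler macros mentioned above
rather than reproducing the USE_SSE2 definition from [0].

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical guard: SSE2 is part of the x86-64 baseline, so testing the
 * architecture macros (GCC/Clang and MSVC spellings) is assumed sufficient.
 */
#if defined(__x86_64__) || defined(_M_AMD64)
#include <emmintrin.h>
#define SKETCH_USE_SSE2
#endif

/*
 * Return true if "key" appears anywhere in base[0 .. nelem - 1].
 * Hypothetical helper, not part of PostgreSQL.
 */
static bool
sketch_lfind32(uint32_t key, const uint32_t *base, size_t nelem)
{
	size_t		i = 0;

#ifdef SKETCH_USE_SSE2
	/* Broadcast the key into all four 32-bit lanes. */
	const __m128i keys = _mm_set1_epi32((int) key);

	/* Largest multiple of four that is <= nelem. */
	const size_t nvec = nelem & ~(size_t) 3;

	for (; i < nvec; i += 4)
	{
		/* Unaligned load of four elements, then lane-wise equality. */
		const __m128i vals = _mm_loadu_si128((const __m128i *) &base[i]);
		const __m128i cmp = _mm_cmpeq_epi32(vals, keys);

		/* Any set bit in the mask means at least one lane matched. */
		if (_mm_movemask_epi8(cmp) != 0)
			return true;
	}
#endif

	/* Scalar loop: handles the tail, or everything without SSE2. */
	for (; i < nelem; i++)
	{
		if (base[i] == key)
			return true;
	}

	return false;
}
```

A caller would use it as sketch_lfind32(xid, xip, xcnt), where xid, xip,
and xcnt stand in for the value being searched for, the array, and its
length. The scalar loop doubles as the fallback on platforms where the
guard is not defined, which is roughly where an ARM implementation would
slot in later.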
[0] https://postgr.es/m/CAFBsxsHko7yc8A-2PpjQ%3D2StomXF%2BT2jgKF%3DWaMFZWi8CvV7hA%40mail.gmail.com
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com