I noticed that several of the List functions do simple linear searches that
can be optimized with SIMD intrinsics (as was done for XidInMVCCSnapshot in
37a6e5d). The following table shows the time spent iterating over a list
of n elements (via list_member_int) one billion times on my x86 laptop.
n | head (ms) | patched (ms)
------+-----------+--------------
2 | 3884 | 3001
4 | 5506 | 4092
8 | 6209 | 3026
16 | 8797 | 4458
32 | 25051 | 7032
64 | 37611 | 12763
128 | 61886 | 22770
256 | 111170 | 59885
512 | 209612 | 103378
1024 | 407462 | 189484
I've attached a work-in-progress patch that implements these optimizations
for both x86 and arm, and I will register this in the July commitfest. I'm
posting this a little early in order to gauge interest. Presumably you
shouldn't be using a List if you have many elements that must be routinely
searched, but it might be nice to speed up these functions, anyway. This
was mostly a fun weekend project, and I don't presently have any concrete
examples of workloads where this might help.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com