On Fri, Aug 5, 2022 at 5:15 AM Nathan Bossart <
nathandbossart@gmail.com> wrote:
>
> On Thu, Aug 04, 2022 at 02:58:14PM +0700, John Naylor wrote:
> > Were you considering adding the new function to simd.h now that that's
> > committed? It's a bit up in the air what should go in there, but this new
> > function is low-level and generic enough to be a candidate...
>
> I don't have a strong opinion. I went with a separate file because I
> envisioned a variety of possible linear search functions (e.g., char,
> uint16, uint32), and some might not use SIMD instructions. Futhermore, it
> seemed less obvious to look in simd.h for linear search functions.
That is a good point. Maybe potential helpers in simd.h should only deal specifically with vector registers, with it's users providing C fallbacks. I don't have any good ideas of where to put the new function, though.
> > I wonder if the "pg_" prefix is appropriate here, as that is most often
> > used for things that hide specific details *and* where the base name would
> > clash, like OS calls or C library functions. I'm not quite sure where the
> > line is drawn, but I mean that "linearsearch" is a generic algorithm and
> > not a specific API we are implementing, if that makes sense.
>
> Yeah, I was concerned about clashing with lsearch() and lfind(). I will
> drop the prefix.
Hmm, I didn't know about those. lfind() is similar enough that it would make sense to have pg_lfind32() etc in src/include/port/pg_lsearch.h, at least for the v4 version that returns the pointer. We already declare bsearch_arg() in src/include/port.h and that's another kind of array search. Returning bool is different enough to have a different name. pg_lfind32_ispresent()? *_noptr()? Meh.
Having said all that, the man page under BUGS [1] says "The naming is unfortunate."
> > Out of curiosity do we know how much we get by loading four registers
> > rather than two?
>
> The small program I've been using for testing takes about 40% more time
> with the two register approach. The majority of this test involves
> searching for elements that either don't exist in the array or that live
> near the end of the array, so this is probably close to the worst case.
Ok, sounds good.
[1]
https://man7.org/linux/man-pages/man3/lsearch.3.html#BUGS--
John Naylor
EDB:
http://www.enterprisedb.com