Re: add AVX2 support to simd.h - Mailing list pgsql-hackers

From John Naylor
Subject Re: add AVX2 support to simd.h
Msg-id CANWCAZafKPUBYdNdtqZLVxVJhSn-ONeo_tp1FsODcn7udjKwRQ@mail.gmail.com
In response to Re: add AVX2 support to simd.h  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: add AVX2 support to simd.h
List pgsql-hackers
On Tue, Mar 19, 2024 at 10:16 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
>
> On Tue, Mar 19, 2024 at 10:03:36AM +0700, John Naylor wrote:
> > I took a brief look, and 0001 isn't quite what I had in mind. I can't
> > quite tell what it's doing with the additional branches and "goto
> > retry", but I meant something pretty simple:
>
> Do you mean 0002?  0001 just adds a 2-register loop for remaining elements
> once we've exhausted what can be processed with the 4-register loop.

Sorry, I was looking at v2 at the time.

> > - if short, do one element at a time and return
>
> 0002 does this.

That part looks fine.

> > - if long, do one block unconditionally, then round the start pointer
> > up so that "end - start" is an exact multiple of blocks, and loop over
> > them
>
> 0002 does the opposite of this.  That is, after we've completed as many
> blocks as possible, we move the iterator variable back to "end -
> block_size" and do one final iteration to cover all the remaining elements.

Sounds similar in principle, but it looks really complicated. I don't
think the additional loops and branches are a good way to go, either
for readability or for branch prediction. My sketch has one branch for
which loop to do, and then performs only one loop. Let's do the
simplest thing that could work. (I think we might need a helper
function to do the block, but the rest should be easy.)
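
For illustration, the control flow described in this message (one branch to
pick the path, one unconditional block, then a single loop over whole blocks)
could look roughly like the C below. This is a sketch under stated
assumptions, not code from any version of the patch: BLOCK_ELEMS,
block_contains() and lfind32_sketch() are hypothetical names, the block size
of 16 is arbitrary, and a plain scalar loop stands in for the vector
comparisons a real helper would use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block size: e.g. four registers of four 32-bit lanes each. */
#define BLOCK_ELEMS 16

/*
 * Hypothetical helper: report whether any of the BLOCK_ELEMS values starting
 * at p equals key.  A real version would use SIMD compares; the scalar loop
 * here only keeps the sketch portable.
 */
static inline bool
block_contains(const uint32_t *p, uint32_t key)
{
	bool		found = false;

	for (size_t i = 0; i < BLOCK_ELEMS; i++)
		found |= (p[i] == key);
	return found;
}

/* Return true if key occurs in base[0 .. nelem - 1]. */
static bool
lfind32_sketch(uint32_t key, const uint32_t *base, size_t nelem)
{
	const uint32_t *start = base;
	const uint32_t *end = base + nelem;
	size_t		rem;

	/* Short input: one element at a time, then return. */
	if (nelem < BLOCK_ELEMS)
	{
		for (; start < end; start++)
		{
			if (*start == key)
				return true;
		}
		return false;
	}

	/* Long input: do one block unconditionally. */
	if (block_contains(start, key))
		return true;

	/*
	 * Advance start so that (end - start) is an exact multiple of
	 * BLOCK_ELEMS.  When nelem is not a multiple, the next block overlaps
	 * the one just checked; rechecking a few elements is harmless.
	 */
	rem = nelem % BLOCK_ELEMS;
	start += (rem != 0) ? rem : BLOCK_ELEMS;

	/* Single loop over the remaining whole blocks. */
	for (; start < end; start += BLOCK_ELEMS)
	{
		if (block_contains(start, key))
			return true;
	}
	return false;
}
```

The overlap is what removes the need for a separate tail loop: the first
block covers elements [0, BLOCK_ELEMS) and the loop covers everything from
offset rem (or BLOCK_ELEMS) to the end, so every element is visited at least
once with no per-element cleanup pass.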


