Thread: Improve performance of pg_strtointNN functions

Improve performance of pg_strtointNN functions

From
David Rowley
Date:
Over on [1], Dean and I were discussing why both gcc and clang don't
seem to want to optimize the multiplication that we're doing in
pg_strtoint16, pg_strtoint32 and pg_strtoint64 into something more
efficient.  It seems that this is down to the overflow checks.
Removing seems to allow the compiler to better optimize the compiled
code.  See the use of LEA instead of IMUL in [2].

Instead of using the pg_mul_sNN_overflow functions, we can just
accumulate the number in an unsigned version of the type and do an
overflow check by checking if the accumulated value has gone beyond a
10th of the maximum *signed* value for the type.  We then just need to
do some final overflow checks after the accumulation is done.  What
those depend on if it's a negative or positive number.

I ran a few microbenchmarks with the attached str2int.c file and I see
about a 10-20% performance increase:

$ ./str2int -100000000 100000000
n = -100000000, e = 100000000
pg_strtoint32 in 3.207926 seconds
pg_strtoint32_v2 in 2.763062 seconds (16.100399% faster)

v2 is the updated version of the function

On Windows, the gains are generally a bit more. I think this must be
due to the lack of overflow intrinsic functions.

>str2int.exe -100000000 100000000
n = -100000000, e = 100000000
pg_strtoint32 in 9.416000 seconds
pg_strtoint32_v2 in 8.099000 seconds (16.261267% faster)

I was thinking that we should likely apply this before doing the hex
literals, which is the main focus of [1].  The reason being is so that
that patch can immediately have faster conversions by allowing the
compiler to use bit shifting instead of other means of multiplying by
a power-of-2 number. I'm hoping this removes a barrier for Peter from
the small gripe I raised on that thread about the patch having slower
than required hex, octal and binary string parsing.

David

[1] https://postgr.es/m/CAApHDvrL6_+wKgPqRHr7gH_6xy3hXM6a3QCsZ5ForurjDFfenA@mail.gmail.com
[2] https://godbolt.org/z/7YoMT63q1

Attachment

Re: Improve performance of pg_strtointNN functions

From
John Naylor
Date:

On Thu, Dec 1, 2022 at 6:42 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> I was thinking that we should likely apply this before doing the hex
> literals, which is the main focus of [1].  The reason being is so that
> that patch can immediately have faster conversions by allowing the
> compiler to use bit shifting instead of other means of multiplying by
> a power-of-2 number. I'm hoping this removes a barrier for Peter from
> the small gripe I raised on that thread about the patch having slower
> than required hex, octal and binary string parsing.

I don't see why the non-decimal literal patch needs to be "immediately" faster? If doing this first leads to less code churn, that's another consideration, but you haven't made that argument.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: Improve performance of pg_strtointNN functions

From
David Rowley
Date:
On Thu, 1 Dec 2022 at 18:27, John Naylor <john.naylor@enterprisedb.com> wrote:
> I don't see why the non-decimal literal patch needs to be "immediately" faster? If doing this first leads to less
codechurn, that's another consideration, but you haven't made that argument.
 

My view is that Peter wants to keep the code he's adding for the hex,
octal and binary parsing as similar to the existing code as possible.
I very much understand Peter's point of view on that. Consistency is
good. However, if we commit the hex literals patch first, people might
ask "why don't we use bit-wise operators to make the power-of-2 bases
faster?", which seems like a very legitimate question. I asked it,
anyway...  On the other hand, if Peter adds the bit-wise operators
then the problem of code inconsistency remains.

As an alternative to those 2 options, I'm proposing we commit this
first then the above dilemma disappears completely.

If this was going to cause huge conflicts with Peter's patch then I
might think differently. I feel like it's a fairly trivial task to
rebase.

If the consensus is that we should fix this afterwards, then I'm happy to delay.

David



Re: Improve performance of pg_strtointNN functions

From
Dean Rasheed
Date:
On Thu, 1 Dec 2022 at 05:38, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Thu, 1 Dec 2022 at 18:27, John Naylor <john.naylor@enterprisedb.com> wrote:
> > I don't see why the non-decimal literal patch needs to be "immediately" faster? If doing this first leads to less
codechurn, that's another consideration, but you haven't made that argument.
 
>
> My view is that Peter wants to keep the code he's adding for the hex,
> octal and binary parsing as similar to the existing code as possible.
> I very much understand Peter's point of view on that. Consistency is
> good. However, if we commit the hex literals patch first, people might
> ask "why don't we use bit-wise operators to make the power-of-2 bases
> faster?", which seems like a very legitimate question. I asked it,
> anyway...  On the other hand, if Peter adds the bit-wise operators
> then the problem of code inconsistency remains.
>
> As an alternative to those 2 options, I'm proposing we commit this
> first then the above dilemma disappears completely.
>
> If this was going to cause huge conflicts with Peter's patch then I
> might think differently. I feel like it's a fairly trivial task to
> rebase.
>
> If the consensus is that we should fix this afterwards, then I'm happy to delay.
>

I feel like it should be done afterwards, so that any performance
gains can be measured for all bases. Otherwise, we won't really know,
or have any record of, how much faster this was for other bases, or be
able to go back and test that.

Regards,
Dean



Re: Improve performance of pg_strtointNN functions

From
Peter Eisentraut
Date:
On 01.12.22 06:38, David Rowley wrote:
> If this was going to cause huge conflicts with Peter's patch then I
> might think differently. I feel like it's a fairly trivial task to
> rebase.
> 
> If the consensus is that we should fix this afterwards, then I'm happy to delay.

If we are happy with this patch, then it's okay with me if you push this 
first.  I'll probably need to do another pass over my patch anyway, so a 
bit more work isn't a problem.




Re: Improve performance of pg_strtointNN functions

From
David Rowley
Date:
On Fri, 2 Dec 2022 at 20:35, Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> If we are happy with this patch, then it's okay with me if you push this
> first.  I'll probably need to do another pass over my patch anyway, so a
> bit more work isn't a problem.

Thanks. I'll start looking at the patch again now. If I don't find any
problems then I'll push it.

I just did some performance tests by COPYing in 40 million INTs
ranging from -/+ 20 million.

create table ab(a int, b int);
insert into ab select x,x from generate_series(-20000000,20000000)x;
copy ab to '/tmp/ab.csv';

-- master
truncate ab; copy ab from '/tmp/ab.csv';
Time: 10219.386 ms (00:10.219)
Time: 10252.572 ms (00:10.253)
Time: 10202.940 ms (00:10.203)

-- patched
truncate ab; copy ab from '/tmp/ab.csv';
Time: 9522.020 ms (00:09.522)
Time: 9441.294 ms (00:09.441)
Time: 9432.834 ms (00:09.433)

About ~8% faster

David



Re: Improve performance of pg_strtointNN functions

From
David Rowley
Date:
On Sun, 4 Dec 2022 at 13:52, David Rowley <dgrowleyml@gmail.com> wrote:
> Thanks. I'll start looking at the patch again now. If I don't find any
> problems then I'll push it.

Pushed with some small adjustments.

David



Re: Improve performance of pg_strtointNN functions

From
Dean Rasheed
Date:
On Sun, 4 Dec 2022 at 03:19, David Rowley <dgrowleyml@gmail.com> wrote:
>
> Pushed with some small adjustments.
>

Ah, I see that you changed the overflow test, and I realise that I
forgot to answer your question about why I wrote that as 1 - INT_MIN /
10 over on the other thread.

The reason is that we need to detect whether tmp * base will exceed
-INT_MIN, not INT_MAX, since we're accumulating the absolute value of
a signed integer. So the right test is

    tmp >= 1 - INT_MIN / base

or equivalently

    tmp > -(INT_MIN / base)

I used the first form, because it didn't require extra parentheses,
but that doesn't really matter. The point is that, in general, that's
not the same as

    tmp > INT_MAX / base

though it happens to be the same for base = 10, because INT_MIN and
INT_MAX aren't divisible by 10. It will break when base is a power of
2 though, so although it's not broken now, it's morally wrong, and it
risks breaking when Peter commits his patch.

Regards,
Dean



Re: Improve performance of pg_strtointNN functions

From
David Rowley
Date:
On Sun, 4 Dec 2022 at 22:53, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> Ah, I see that you changed the overflow test, and I realise that I
> forgot to answer your question about why I wrote that as 1 - INT_MIN /
> 10 over on the other thread.
>
> The reason is that we need to detect whether tmp * base will exceed
> -INT_MIN, not INT_MAX, since we're accumulating the absolute value of
> a signed integer.

I think I'd been too focused on the simplicity of that expression and
also the base 10 part. I saw that everything worked in base 10 and
failed to give enough forward thought to other bases.

I now see that it was wrong-headed to code it the way I had it.
Thanks for pointing this out. I've just pushed a fix.

David