Re: Optimize numeric.c mul_var() using the Karatsuba algorithm - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: Optimize numeric.c mul_var() using the Karatsuba algorithm
Date
Msg-id CAEZATCX1TZjodeVgeeeZ4AXLcvJYL3VOBqp9MJDChZP6YJ0EWg@mail.gmail.com
Whole thread Raw
In response to Re: Optimize numeric.c mul_var() using the Karatsuba algorithm  ("Joel Jacobson" <joel@compiler.org>)
List pgsql-hackers
On Sun, 30 Jun 2024 at 11:22, Joel Jacobson <joel@compiler.org> wrote:
>
> The surprising realization here is that there are actually (var1ndigits, var2ndigits)
> combinations where *only* doing mul_var_karatsuba_half() recursively
> all the way down to schoolbook *is* a performance win,
> even though we don't do any mul_var_karatsuba_full().
>
> Indeed only mul_var_karatsuba_half() will be called with the inputs:
> var1ndigits=1000 var2ndigits=10000
> var1ndigits=1000 var2ndigits=5000
> It will never call mul_var_karatsuba_full().
>
> Surprisingly, this still gives a 13% speed-up on a Intel Core i9-14900K.
>

Hmm, I don't see any gains in that example. Logically, it is doing
slightly more work splitting up var2 and adding partial products.
However, it's possible that what you're seeing is a memory access
problem where, if var2 is too large, it won't fit in cache close to
the CPU, and since the schoolbook algorithm traverses var2 multiple
times, it ends up being quicker to split up var2. Since I have a
slower CPU, I'm more likely to be CPU-limited, while you might be
memory-limited.

This makes me think that this is always going to be very
hardware-dependent, and we shouldn't presume what will work best on
the user's hardware.

However, if a 1000x1000 ndigit product is known to be faster using
Karatsuba on some particular hardware (possibly nearly all hardware),
then why wouldn't it make sense to do the above as 10 invocations of
Karatsuba?

Regards,
Dean



pgsql-hackers by date:

Previous
From: Jelte Fennema-Nio
Date:
Subject: Re: [PATCH] Handle SK_SEARCHNULL and SK_SEARCHNOTNULL in HeapKeyTest
Next
From: Erik Wienhold
Date:
Subject: Re: Underscore in positional parameters?