Re: Optimize numeric.c mul_var() using the Karatsuba algorithm - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: Optimize numeric.c mul_var() using the Karatsuba algorithm
Date
Msg-id 56c8f87c-0ab1-49dd-8445-94e10fde5f31@app.fastmail.com
Whole thread Raw
In response to Re: Optimize numeric.c mul_var() using the Karatsuba algorithm  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Jun 30, 2024, at 17:44, Tom Lane wrote:
> "Joel Jacobson" <joel@compiler.org> writes:
>> On Sat, Jun 29, 2024, at 17:25, Tom Lane wrote:
>>> (In general I find this patch seriously undercommented.)
>
>> However, I think the comments above split_var_at(),
>> mul_var_karatsuba_full() and mul_var_karatsuba_half()
>> are quite good already, what do you think?
>
> Not remarkably so.  For starters, heaven help the reader who has
> no idea what "the Karatsuba algorithm" refers to.  Nor is there
> any mention of why (or when) it's better than the traditional
> algorithm.  You could at least do people the courtesy of providing
> a link to the wikipedia article that you're assuming they've
> memorized.

Thanks for guidance. New patch attached.

I've added as an introduction to Karatsuba:

+/*
+ * mul_var_karatsuba_full() -
+ *
+ *    Multiplication using the Karatsuba algorithm.
+ *
+ *    The Karatsuba algorithm is a divide-and-conquer algorithm that reduces
+ *    the complexity of large number multiplication. It splits each number
+ *    into two halves and performs three multiplications on the parts,
+ *    rather than four as in the traditional method. This results in
+ *    a significant performance improvement for sufficiently large numbers.
...
+ *    For more details on the Karatsuba algorithm, see the Wikipedia article:
+ *    https://en.wikipedia.org/wiki/Karatsuba_algorithm

> There's also a discussion to be had about why Karatsuba is
> a better choice than other divide-and-conquer multiplication
> methods.  Why not Toom-Cook, for example, which the aforesaid
> wikipedia page says is faster yet?  I suppose you concluded
> that the extra complexity is unwarranted, but this is the
> sort of thing I'd expect to see explained in the comments.

I've added this to the end of the comment on mul_var_karatsuba_full():

+ *    The Karatsuba algorithm is preferred over other divide-and-conquer methods
+ *    like Toom-Cook for this implementation due to its balance of complexity and
+ *    performance gains given Numeric's constraints.
+ *
+ *    For Toom-Cook to be worth the added complexity, the factors would need to
+ *    be much larger than supported by Numeric, making Karatsuba a more
+ *    appropriate choice.

Also, I added this comment on the #define's at the beginning:

+/*
+ * Constants used to determine when the Karatsuba algorithm should be used
+ * for multiplication. These thresholds were determined empirically through
+ * benchmarking across various architectures, aiming to avoid performance
+ * regressions while capturing potential gains. The choice of these values
+ * involves trade-offs and balances simplicity and performance.
+ */

As well as this comment on KARATSUBA_CONDITION:

+/*
+ * KARATSUBA_CONDITION() -
+ *
+ * This macro determines if the Karatsuba algorithm should be applied
+ * based on the number of digits in the multiplicands. It checks if
+ * the number of digits in the larger multiplicand exceeds a base limit
+ * and if the sizes of the multiplicands fall within specific ranges
+ * where Karatsuba multiplication is usually beneficial.
+ *
+ * The conditions encapsulated by KARATSUBA_CONDITION are:
+ * 1. The larger multiplicand has more digits than the base limit.
+ * 2. The sizes of the multiplicands fall within low, middle, or high range
+ *    conditions which were identified as performance beneficial regions during
+ *    benchmarks.
+ *
+ * The macro ensures that the algorithm is applied only when it is likely
+ * to provide performance benefits, considering the size and ratio of the
+ * factors.
+ */

What do you think?

Regards,
Joel
Attachment

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible
Next
From: Tom Lane
Date:
Subject: Re: speed up a logical replica setup