Re: Optimize numeric.c mul_var() using the Karatsuba algorithm - Mailing list pgsql-hackers
From | Joel Jacobson |
---|---|
Subject | Re: Optimize numeric.c mul_var() using the Karatsuba algorithm |
Date | |
Msg-id | 56c8f87c-0ab1-49dd-8445-94e10fde5f31@app.fastmail.com Whole thread Raw |
In response to | Re: Optimize numeric.c mul_var() using the Karatsuba algorithm (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Sun, Jun 30, 2024, at 17:44, Tom Lane wrote: > "Joel Jacobson" <joel@compiler.org> writes: >> On Sat, Jun 29, 2024, at 17:25, Tom Lane wrote: >>> (In general I find this patch seriously undercommented.) > >> However, I think the comments above split_var_at(), >> mul_var_karatsuba_full() and mul_var_karatsuba_half() >> are quite good already, what do you think? > > Not remarkably so. For starters, heaven help the reader who has > no idea what "the Karatsuba algorithm" refers to. Nor is there > any mention of why (or when) it's better than the traditional > algorithm. You could at least do people the courtesy of providing > a link to the wikipedia article that you're assuming they've > memorized. Thanks for guidance. New patch attached. I've added as an introduction to Karatsuba: +/* + * mul_var_karatsuba_full() - + * + * Multiplication using the Karatsuba algorithm. + * + * The Karatsuba algorithm is a divide-and-conquer algorithm that reduces + * the complexity of large number multiplication. It splits each number + * into two halves and performs three multiplications on the parts, + * rather than four as in the traditional method. This results in + * a significant performance improvement for sufficiently large numbers. ... + * For more details on the Karatsuba algorithm, see the Wikipedia article: + * https://en.wikipedia.org/wiki/Karatsuba_algorithm > There's also a discussion to be had about why Karatsuba is > a better choice than other divide-and-conquer multiplication > methods. Why not Toom-Cook, for example, which the aforesaid > wikipedia page says is faster yet? I suppose you concluded > that the extra complexity is unwarranted, but this is the > sort of thing I'd expect to see explained in the comments. I've added this to the end of the comment on mul_var_karatsuba_full(): + * The Karatsuba algorithm is preferred over other divide-and-conquer methods + * like Toom-Cook for this implementation due to its balance of complexity and + * performance gains given Numeric's constraints. + * + * For Toom-Cook to be worth the added complexity, the factors would need to + * be much larger than supported by Numeric, making Karatsuba a more + * appropriate choice. Also, I added this comment on the #define's at the beginning: +/* + * Constants used to determine when the Karatsuba algorithm should be used + * for multiplication. These thresholds were determined empirically through + * benchmarking across various architectures, aiming to avoid performance + * regressions while capturing potential gains. The choice of these values + * involves trade-offs and balances simplicity and performance. + */ As well as this comment on KARATSUBA_CONDITION: +/* + * KARATSUBA_CONDITION() - + * + * This macro determines if the Karatsuba algorithm should be applied + * based on the number of digits in the multiplicands. It checks if + * the number of digits in the larger multiplicand exceeds a base limit + * and if the sizes of the multiplicands fall within specific ranges + * where Karatsuba multiplication is usually beneficial. + * + * The conditions encapsulated by KARATSUBA_CONDITION are: + * 1. The larger multiplicand has more digits than the base limit. + * 2. The sizes of the multiplicands fall within low, middle, or high range + * conditions which were identified as performance beneficial regions during + * benchmarks. + * + * The macro ensures that the algorithm is applied only when it is likely + * to provide performance benefits, considering the size and ratio of the + * factors. + */ What do you think? Regards, Joel
Attachment
pgsql-hackers by date: