On Tue, Jul 9, 2024, at 14:01, Dean Rasheed wrote:
> One thing I noticed while testing the earlier patches on this thread
> was that they were significantly faster if they used unsigned integers
> rather than signed integers. I think the reason is that operations
> like "x / 10000" and "x % 10000" use fewer CPU instructions (on every
> platform, according to godbolt.org) if x is unsigned.
>
> In addition, this reduces the number of times the digit array needs to
> be renormalised, which seems to be the biggest factor.
>
> Another small optimisation that seems to be just about worthwhile is
> to pull the first digit of var1 out of the main loop, so that its
> contributions can be set directly in dig[], rather than being added to
> it. This allows palloc() to be used to allocate dig[], rather than
> palloc0(), and only requires the upper part of dig[] to be initialised
> to zeros, rather than all of it.
>
> Together, these seem to give a decent speed-up:
..
> Attachments:
> * optimise-mul_var.patch

I've reviewed the patch now.
The code is straightforward, and the comments are easy to understand.
LGTM.
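
For anyone skimming the thread, here is a rough standalone sketch of the two ideas quoted above, as I understand them. It is not the code from the patch: mul_digits(), the uint16_t/uint32_t types, the n1 <= 40 limit and the use of malloc() in place of palloc() are all assumptions made for illustration. The sketch handles one digit of the first operand (here the least significant one) before the main loop, so its products are stored with "=" rather than "+=", and only the upper part of dig[] needs zeroing. With unsigned 32-bit accumulators, up to 42 products of two base-10000 digits fit before overflow, versus 21 for signed, which is why fewer carry-propagation passes would be needed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NBASE 10000

/*
 * Hypothetical helper (not the patch's mul_var()): multiply two numbers
 * stored as base-NBASE digit arrays, most significant digit first.
 * Returns a malloc'd array of n1 + n2 normalised digits, or NULL on OOM.
 */
static uint32_t *
mul_digits(const uint16_t *a, int n1, const uint16_t *b, int n2)
{
    int         ndig = n1 + n2;
    uint32_t   *dig;
    uint32_t    carry = 0;
    int         i, j, k;

    /* Assumed limit: at most 42 products fit in a uint32_t accumulator. */
    assert(n1 > 0 && n2 > 0 && n1 <= 40);

    /* Deliberately not calloc(): most of dig[] is overwritten below. */
    dig = malloc((size_t) ndig * sizeof(uint32_t));
    if (dig == NULL)
        return NULL;

    /* Least significant digit of a: store its products directly. */
    i = n1 - 1;
    for (j = 0; j < n2; j++)
        dig[i + j + 1] = (uint32_t) a[i] * b[j];

    /* Only the upper part, which was not written above, needs zeroing. */
    for (k = 0; k < n1; k++)
        dig[k] = 0;

    /* Remaining digits of a: accumulate without normalising. */
    for (i = n1 - 2; i >= 0; i--)
        for (j = 0; j < n2; j++)
            dig[i + j + 1] += (uint32_t) a[i] * b[j];

    /* One carry-propagation pass; unsigned "/" and "%" by a constant. */
    for (k = ndig - 1; k >= 0; k--)
    {
        uint32_t    v = dig[k] + carry;

        dig[k] = v % NBASE;
        carry = v / NBASE;
    }
    assert(carry == 0);         /* the product always fits in n1 + n2 digits */

    return dig;
}
```

As a usage example, multiplying {1, 2345} (12345) by {6789} gives the digits {0, 8381, 205}, i.e. 83810205.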
Regards,
Joel