Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands. - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.
Date
Msg-id CAEZATCUnn2JK4gSrymvmkhy_qsGybKv0-n0CTnLcfwfVNfTE7g@mail.gmail.com
Whole thread Raw
In response to Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.  ("Joel Jacobson" <joel@compiler.org>)
Responses Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.
List pgsql-hackers
On Wed, 3 Jul 2024 at 14:49, Joel Jacobson <joel@compiler.org> wrote:
>
> Hmm, I don't see how the case 2 code can be correct?
> If, like you say, res_ndigits can't be less than 3, that means it can be 3, right?
> And if res_ndigits=3 then `var2digits[res_ndigits - 4]` would try to access `var2digits[-1]`.
>

Ah yes, I think I was looking at a newer version of the code where I'd
already fixed that bug. Unless you think there are still bugs in any
of the boundary checks, which is entirely possible.

> I've tested both patches, and they produces the same output given the
> same input as HEAD, when rscale is unmodified (full precision).
>
> However, for a reduced rscale, there are some differences:
>
> mul_var_small() seems more resilient to rscale reductions than mul_var_int().
>

Ah, I can see what's going on. It's perhaps best illustrated with a
simple example. Suppose you are multiplying a 4-digit integer x by a
2-digit integer y (both with dscale=0). Then the terms of the full
product computed by mul_var() or mul_var_small() would look something
like this:

            x0     x1     x2     x3
                        * y0     y1
  ---------------------------------
  x0*y0  x1*y0  x2*y0  x3*y0
         x0*y1  x1*y1  x2*y1  x3*y1

In the reduced-rscale case, it might perform a truncated computation,
computing just the first 3 columns (say), and discarding the last two
columns. Therefore it would skip the 3 rightmost digit products.

However, in mul_var_int(), y0 and y1 have been combined into a single
integer equal to y0*NBASE+y1, and the terms of full product are
computed as follows:

  x0*(y0*NBASE+y1)  x1*(y0*NBASE+y1)  x2*(y0*NBASE+y1)  x3*(y0*NBASE+y1)

In the full product, that gives the same result, but if you follow the
same rule in the reduced-rscale case, skipping the last two terms, it
would actually discard 4 digit products, making it less accurate.

That could be avoided by increasing maxdigits by 1 in mul_var_int() so
it would always be at least as accurate as it was before, but that
might not really be necessary. However, if we implemented
mul_var_int64() in the same way, it would be discarding much
higher-order digit products, and so we probably would have to increase
maxdigits to get sufficiently accurate results. But there's an even
bigger problem: the results would be different between platforms that
did and didn't have 128-bit integers, which I really don't like. We
could avoid that by not using it in reduced-rscale cases, but that
would involve another test condition.

By contrast, mul_var_small() is intended to replicate the arithmetic
in mul_var() exactly (but in a different order) for all rscales. So if
you've found any cases where they give different results, that's a
bug.

In light of that, it might be that mul_var_small() is the better
option, rather than mul_var_int(), but it'd be interesting to see how
they compare in terms of performance first.

> What can be said about mul_var()'s contract with regards to rscale?
> It's the number of decimal digits requested by the caller, and if not
> requesting full precision, then the decimal digits might not be accurate,
> but can something be said about how far off they can be?
>

I wouldn't expect it to ever be off by more than 1, given that
MUL_GUARD_DIGITS = 2, which corresponds to 8 decimal digits, and the
number of digits in the smaller input (and hence the number of digit
products in each column) is limited to something like 16,000 NBASE
digits.

Regards,
Dean



pgsql-hackers by date:

Previous
From: Harjyot Bagga
Date:
Subject: Grammar guidelines in Postgres
Next
From: "Joel Jacobson"
Date:
Subject: Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.