Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt(). - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().
Date
Msg-id CAEZATCXo0PL_WF1Wsk++p_2TYWte6bftNSvrQ1VyVff+u55S3w@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().  ("Joel Jacobson" <joel@compiler.org>)
Responses Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().
List pgsql-hackers
On Tue, 31 Jan 2023 at 08:00, Joel Jacobson <joel@compiler.org> wrote:
>
> I think this is what we want:
>
>         if (arg.weight < 0)
>                 sweight = (arg.weight + 1) * DEC_DIGITS / 2 - 1;
>         else
>                 sweight = arg.weight * DEC_DIGITS / 2 + 1;
>

That's still not right. If you want a proper mathematically justified
formula, it's fairly easy to derive.

Let "n" be the decimal weight of the input, taken to be the number of
decimal digits before the decimal point (or minus the number of zeros
after the decimal point, for inputs with no digits before the decimal
point).

Similarly, let "sweight" be the decimal weight of the square root.
Then the relationship between sweight and n can be seen from a few
simple examples (to 4 significant digits):

n     arg                    sqrt(arg)             sweight
-3    0.0001 .. 0.0009999    0.01 .. 0.03162       -1
-2    0.001 .. 0.009999      0.03162 .. 0.09999    -1
-1    0.01 .. 0.09999        0.1 .. 0.3162         0
0     0.1 .. 0.9999          0.3162 .. 0.9999      0
1     1 .. 9.999             1 .. 3.162            1
2     10 .. 99.99            3.16 .. 9.999         1
3     100 .. 999.9           10 .. 31.62           2
4     1000 ... 9999          31.62 .. 99.99        2

and the general formula is:

    sweight = floor((n+1) / 2)

In our case, since the base is NBASE, not 10, and since we only
require an approximation, we don't take the trouble to compute n
exactly, we just use the fact that it lies in the range

    arg.weight * DEC_DIGITS + 1 <= n <= (arg.weight + 1) * DEC_DIGITS

Since we want to ensure at least a certain number of significant
digits in the result, we're only interested in the lower bound.
Plugging that into the formula above gives:

    sweight >= floor(arg.weight * DEC_DIGITS / 2 + 1)

or equivalently, in code with truncated integer division:

    if (arg.weight >= 0)
        sweight = arg.weight * DEC_DIGITS / 2 + 1;
    else
        sweight = 1 - (1 - arg.weight * DEC_DIGITS) / 2;

This is not the same as your formula. For example, when DEC_DIGITS = 1
and arg.weight = -1, yours gives sweight = -1 which isn't right, it
should be 0.

When DEC_DIGITS = 4, this formula also reduces to sweight = 2 *
arg.weight + 1, but neither gcc nor clang is smart enough to spot that
(clang doesn't simplify your formula either, BTW). So even though I
believe that the above is mathematically correct, and won't change any
results for DEC_DIGITS = 4, I'm still hesitant to use it, because it
will have a (small) performance impact, and I don't believe it does
anything to improve code readability (and certainly not without an
explanatory comment).

When DEC_DIGITS = 1, it does guarantee that the result has exactly 16
significant digits (or more if the input scale is larger), but that's
only really of theoretical interest to anyone.

As I noted above, when DEC_DIGITS > 1, this formula is only an
approximation, since it's not using the exact input decimal weight. So
my inclination is to leave the code as-is. It does guarantee that the
result has at least 16 significant digits, which is the intention.

Regards,
Dean



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Transparent column encryption
Next
From: Robert Haas
Date:
Subject: Re: HOT chain validation in verify_heapam()