More inaccurate results from numeric pow() - Mailing list pgsql-hackers

From Dean Rasheed
Subject More inaccurate results from numeric pow()
Date
Msg-id CAEZATCUj3U-cQj0jjoia=qgs0SjE3auroxh8swvNKvZWUqegrg@mail.gmail.com
Whole thread Raw
Responses Re: More inaccurate results from numeric pow()
Re: More inaccurate results from numeric pow()
List pgsql-hackers
Doing some more testing of the numeric code patched in [1] I noticed
another case where the result is inaccurate -- computing 0.12 ^
-2345.6 gives a very large number containing 2162 digits, but only the
first 2006 correct, while the last 156 digits are wrong.

The reason is this code in power_var():

        /* limit to something that won't cause integer overflow */
        val = Max(val, -NUMERIC_MAX_RESULT_SCALE);
        val = Min(val, NUMERIC_MAX_RESULT_SCALE);

where "val" is the approximate decimal result weight. Here
NUMERIC_MAX_RESULT_SCALE is 2000, so it's clamping the estimated
result weight to 2000, and therefore reducing the rscale in the
subsequent calculations, causing the loss of precision at around 2000
digits.

In fact it's possible to predict exactly how large we need to allow
"val" to become, since the final result is computed using exp_var(),
which accepts inputs up to 6000, so the result weight "val" can be up
to around log10(exp(6000)) ~= 2606 before the final result causes an
overflow.

The obvious fix would be to modify the clamping limits. I think a
better answer though is to replace the clamping code with an overflow
test, immediately throwing an error if "val" is outside the allowed
range, per the attached patch.

This has the advantage that it avoids some expensive computations in
the case where the result will end up overflowing, but more
importantly it means that power_var() isn't so critically dependent on
the limits of exp_var() -- if someone in the future increased the
limits of exp_var() without touching power_var(), and power_var()
clamped to the old range, the problem would resurface. But doing an
overflow test in power_var() instead of clamping "val", it would
either compute an accurate result, or throw an overflow error early
on. There should be no possibility of it returning an inaccurate
result.

Regards,
Dean

[1] http://www.postgresql.org/message-id/CAEZATCV7w+8iB=07dJ8Q0zihXQT1semcQuTeK+4_rogC_zq5Hw@mail.gmail.com

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [BUGS] Breakage with VACUUM ANALYSE + partitions
Next
From: Tom Lane
Date:
Subject: Naming of new tsvector functions