Thread: BUG #16281: LN() function inaccurate at 1000th fractional digit

BUG #16281: LN() function inaccurate at 1000th fractional digit

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16281
Logged by:          Justin
Email address:      anyhowstep@hotmail.com
PostgreSQL version: 12.2
Operating system:   Ubuntu 18.04.2
Description:

Given this,

```
SELECT

LN(484990182159328900690402236933516249572671683638747490717351807610531884491845416923860371219625151551889257298200816555016472471293780254009492949585031653913930735918829139712249577547959394351523545901788627247613322896296041868431769047433229466634098452564756860190085118463828382895145244362033728480588969626012192733802377468089120757046364393407262957242230928854711898925295251902007136232994524624903257456111389508582206404271734668422903183500589303866613158037169610592539145461637447957948521714058034772237111009429638870236361143304703683377693378577075353794118557951847394763531830696578809001981568860219578880229402696449243344235099860421846016326538272155937175661905904288335499593232232926636205909086901191153907183842087577811871344870731324067822883041265129394268082883745408414994.8967939438561591657171240282983703914075472645212002662497023142663831371447287624846942598424990784971781730103682951722370983277124599054059027055336437808366784501932987082321905202623642371063626378290734289114618092750984153422293450048717769065428713836637664433167768445609659527458911187829232316677137895259433038764404970599325009178297626038331436654541552998098529141205301472138026818453893127265938030066392881979113522757891639646670670272542401773230506961559808927249585675430838495658225557294666522469887436551840596777627408780618586500922973500018513068499587683746133637919751545157547095670767246977244726331271787622126889459658539988980096764323712767863722912919120929339399753431689512753214200090670880647731689804555417871258907716687575767185444541243606329768784843125926070743277339790277626515824924290352180761378846035233155198504033292692893297993698953705472933411199778880561376633444249703838589180474329586470353212010427945060694794274109764269805332803290229)
```

Pg's result,


1864.3702986939570026328504202935192533137907736189919154633800554877738455118081651650863235106905871352085850240570561347180517240105510505203972860921397909573687877993477806728098306202020229409548306695695574102950949468160529713610952021974630774784174851619325758380143625473386495586347322798415543385655090746985183329114860118551572428921774322172798724455202876781611633419444058398798142214904998877857425038669920064728855823072107227506485770367799671977282350083029452784747350395161797215115525867416898416360638482342253129160308632504217096916335590470843180746834864303790913372081974355613359678634194879425862536147988835528973291020680020540866655622823550861337486588647231688134992810403147262346312159819432914207194632564009749236609081399504118359354620598232725290537215007867979331582119891661859015726276335168158288396939655310210558566592649049602925182137256134162660116182293851038854455437841571331011002023088829768308520393956515509475418031437505751407687618234418263

The last digit should be a 2, but pg thinks it's a 3,

If pg supported more than 1000 decimal places, the sequence should continue
...823441826 2 4968...

The 1000th fractional digit is 2.
The 1001st fractional digit is 4.
It should round the 1000th fractional digit to 2.

I'm not sure how much people care about results at 1000 decimal places,
though.
I'm not affected by this minor rounding error, I only discovered this while
working on something else.

-----

Other values that produce behaviour,


87190145885430429849953615409019208993240447426362428988181639909267773304254748257120061524000254226856815085523676417146197197996896030672521334101413071112068202429835905642444187493717977611730127126387257253646790849384975208460867137315507010888782632024640844766297185244443116696943912406389670302370461137850160539373600494054874979342373255280815156048999900951842673141766630630919020492255966628630634124452614590400422133958133100159154995520080124736657520969784129924799670552560034302960877087853678350801769339861812435411200669026902417951572668727488315537985378304242438181615160041688723201917323705450185975141141262578884689500612295576288125956289035673242989906973367691922065122033180281670221390667818909912035903387888639331486823729897326624516015340.0330856710565117793999512551468220085711713631167607285185762046751452975325645379302403715842570486302993296501788672462090620871511446272026693318239212657949496275318383141403236705902077406660768573015707706831878445598837931116223956945944726162551477136715847593742032488181481888084716920605114101902724395659898621880016853548602514706686907951229872573180602614761229992106144727082722940736406782659562775289407005631298246624198606031298081220736931229256511054595028182057216042683060059115371651410352645266000330509331097811566633211452233019461903115970558624057877018778178814946285827512359903934291318219271464841957435711594154280905473802599888081783098187210283997106131616471807951265003903143099667366508222327805543948921694362089860577380749774036318574113007382111997454202845559941557812813566442364810680529092880773126707073967537693927177460459341763934709686530005721141046645111784404932103241501569571235364365556796422998363930810983452790309019295181282099408260156


93936642222690597390233191619858485419795942047468396309991947772747208870873993801669373075421461116465960407843923269693395211616591453397070258466704654943689268224479477016161636938138334729982904232438440955361656138189836032891825113139184685132178764873033678116450665758561650355252211196676137179184043639278410827092182700922151290703747496962700158844772453483316974221113826173404445159281421213715669245417896170368554410830320000019029956317336703559699859949692222685614036912057150632902650913831404804982509990655560731349634628713944739168096272097122388116038119844786988276635032016787352796502360718569977397214936366251320294621522016.6483354941025384161536675750898007896744690911429670830432784905421638721478353275821072200938900938046264210604940707974410950770029535636602548377806284157951164875821446035013896786653932045182167021839184824627082391478016195098055107001433336586881395912782883663046617432598969149948351689103230162742769845955320418573803127107923535948653168889411316007796459064267436246637115946581149511513369842911210359447262641996566147462977170742544980481275049898092152042927981394239266559286915303786701737610786594006685748456635797125029722684151298695274097006242412384086302106763844070230264910503179385988626477852818174114043927841085089058972074427820150462261941575665882880501074676800316585217150509780489224388148722603385921057007086785238310735038314861960410473809826927329368597558806004392175746233568789445929554890241140656324160187253042639339549705859147930476532359840809944163908006480881926041259363654863689570520534301207043189181147254153307163555433328278834311658232337


Re: BUG #16281: LN() function inaccurate at 1000th fractional digit

From
Dean Rasheed
Date:
On Fri, 28 Feb 2020 at 05:43, PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> Given this, ...
>
> The last digit should be a 2, but pg thinks it's a 3,
>

Yeah, that's not at all surprising. All the transcendental numeric
functions are basically +/-1 in the final digit. Guaranteeing the
correct answer in the final digit is a hard problem, that involves
tracking errors accurately as the computation proceeds, and then
possible re-doing the whole thing if the final digit of the result is
on that rounding boundary that makes it impossible to determine which
way to round without using a higher internal precision.

I don't think anyone has the appetite to put in the effort to do that,
and even if they did, the patch might still be rejected if it hurt
performance too much -- it seems inevitable that there would be some
performance hit, but I don't know how much it'd be.

Regards,
Dean



Re: BUG #16281: LN() function inaccurate at 1000th fractional digit

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On Fri, 28 Feb 2020 at 05:43, PG Bug reporting form
> <noreply@postgresql.org> wrote:
>> The last digit should be a 2, but pg thinks it's a 3,

> Yeah, that's not at all surprising. All the transcendental numeric
> functions are basically +/-1 in the final digit.

TBH, I'd be ecstatic if I thought that was the maximum possible error ;-).
I don't think anyone's really done any error propagation analysis on
the numeric transcendentals.

            regards, tom lane



Re: BUG #16281: LN() function inaccurate at 1000th fractional digit

From
Dean Rasheed
Date:
On Fri, 28 Feb 2020 at 15:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>
> > Yeah, that's not at all surprising. All the transcendental numeric
> > functions are basically +/-1 in the final digit.
>
> TBH, I'd be ecstatic if I thought that was the maximum possible error ;-).
> I don't think anyone's really done any error propagation analysis on
> the numeric transcendentals.
>

There's a good paper [1] on that subject by the authors of MPFR. A
prerequisite for keeping errors under control is having appropriately
chosen rscales throughout, so that errors grow in a controlled way
with each operation, which [2] went a long way towards ensuring.
Barring oversights in [2], I think it's entirely plausible that the we
do have that kind of accuracy, given the fairly generous local rscales
chosen in these functions, compared with the number of internal
operations they perform (e.g., I think that ln_var() never needs more
than around 400 terms in its Taylor series), but it would take a much
more thorough analysis to prove it. Speaking of oversights though...

Looking more closely at ln_var(), it seems that there was an oversight
related to the way that it computes the local_rscale for the Taylor
series expansion --- it fails to account for the fact that the result
is multiplied by fact (2^(nsqrt+1), where nsqrt is the number of
square roots performed in the range reduction phase, which in practice
is at most 22).

Since 2^22 has 7 decimal digits, multiplying by 2^22 almost entirely
wipes out the 8-digit safety margin used in the Taylor series
expansion. The attached patch corrects that.

With this patch, all the examples originally posted return the correct
results (calculated with bc). I'd be interested to know how the OP
constructed these examples, and whether there were any that were off
by more than 1 ULP.

Regards,
Dean

[1] https://www.mpfr.org/algorithms.pdf
[2] https://www.postgresql.org/message-id/E1Zxgva-0000tn-Ox%40gemulon.postgresql.org

Attachment

Re: BUG #16281: LN() function inaccurate at 1000th fractional digit

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> Looking more closely at ln_var(), it seems that there was an oversight
> related to the way that it computes the local_rscale for the Taylor
> series expansion --- it fails to account for the fact that the result
> is multiplied by fact (2^(nsqrt+1), where nsqrt is the number of
> square roots performed in the range reduction phase, which in practice
> is at most 22).
> Since 2^22 has 7 decimal digits, multiplying by 2^22 almost entirely
> wipes out the 8-digit safety margin used in the Taylor series
> expansion. The attached patch corrects that.

Looks sane to me.

> With this patch, all the examples originally posted return the correct
> results (calculated with bc). I'd be interested to know how the OP
> constructed these examples, and whether there were any that were off
> by more than 1 ULP.

Yeah, that would be interesting.

            regards, tom lane



Re: BUG #16281: LN() function inaccurate at 1000th fractional digit

From
Justin AnyhowStep
Date:

> With this patch, all the examples originally posted return the correct
> results (calculated with bc). I'd be interested to know how the OP
> constructed these examples, and whether there were any that were off
> by more than 1 ULP.

> Yeah, that would be interesting.

I didn't do anything fancy. I'm learning how to write an arbitrary precision math library.
I couldn't come up with good test cases since so many numbers exist.
So, I sanity-checked my code by testing many random inputs against another library and pg, and compared the results.

Most of the time, the problem would be with my code.
But I found a few cases where it looked like pg was off.

I don't think I found any that were off by more than 1.

Re: BUG #16281: LN() function inaccurate at 1000th fractional digit

From
Dean Rasheed
Date:
On Sun, 1 Mar 2020 at 02:59, Justin AnyhowStep <anyhowstep@hotmail.com> wrote:
>
> > With this patch, all the examples originally posted return the correct
> > results (calculated with bc). I'd be interested to know how the OP
> > constructed these examples, and whether there were any that were off
> > by more than 1 ULP.
>
> > Yeah, that would be interesting.
>
> I didn't do anything fancy. I'm learning how to write an arbitrary precision math library.
> I couldn't come up with good test cases since so many numbers exist.
> So, I sanity-checked my code by testing many random inputs against another library and pg, and compared the results.
>
> Most of the time, the problem would be with my code.
> But I found a few cases where it looked like pg was off.
>
> I don't think I found any that were off by more than 1.
>

OK, thanks for looking, and thanks for the report.

I did some random testing of my own using 100 random numbers each with
100,000 digits to the left of the decimal point and 900 to the right,
and I was able to find one case where the error was 2 ULP, without the
patch. The overall results were as follows:

 error | count
-------+-------
     2 |     1
     1 |    16
    -1 |    20
     0 |    63
(4 rows)

which is quite a high probability of being off by one. With the patch,
the results for the same 100 random numbers became:

 error | count
-------+-------
     1 |     1
    -1 |     1
     0 |    98
(3 rows)

So the off-by-two case was fixed (actually becoming zero), and the
probability of off-by-one errors was greatly reduced (although, as
expected, not entirely removed).

I think this is pretty much the worst case for our ln()
implementation, since it has both a large number of digits before the
decimal point, causing a lot of square roots, and a large number after
the decimal point, causing a large number of Taylor series terms to be
computed. Such cases are really only of academic interest, and for
most other cases, the probability of an off-by-one result should be
much lower.

Regards,
Dean