Re: pgsql: Fix numeric_mul() overflow due to too many digits after decimal - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: pgsql: Fix numeric_mul() overflow due to too many digits after decimal
Date
Msg-id CAEZATCXAHZBr8hbTAyCdtAmuSHh5CporSn-mDGad34T=xn-JXQ@mail.gmail.com
Whole thread Raw
In response to Re: pgsql: Fix numeric_mul() overflow due to too many digits after decimal  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, 10 Jul 2021 at 18:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> [ moving to pghackers for wider visibility ]
>
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> > On Sat, 10 Jul 2021 at 16:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> In general, I'm disturbed that we just threw away the previous
> >> promise that numeric multiplication results were exact.  That
> >> seems like a pretty fundamental property --- which is stated
> >> in so many words in the manual, btw --- and I'm not sure I want
> >> to give it up.
>
> > Perhaps we should amend the statement about numeric multiplication to
> > say that it's exact within the limits of the numeric type's supported
> > scale, which we also document in the manual as 16383.
> > That seems a lot better than throwing an overflow error for a result
> > that isn't very big, which limits what's possible with numeric
> > multiplication to much less than 16383 digits.
>
> TBH, I don't agree.  I think this is strictly worse than what we
> did before, and we should just revert it.  It's no longer possible
> to reason about what numeric multiplication will do.  I think
> throwing an error if we can't represent the result exactly is a
> preferable behavior.  If you don't want exact results, use float8.

Actually, I think it makes it a lot easier to reason about what it
will do -- it'll return the exact result, or the exact result
correctly rounded to 16383 digits.

That seems perfectly reasonable to me, since almost every numeric
operation ends up having to round at some point, and almost all of
them don't produce exact results.

The previous behaviour might seem like a principled stance to take,
from a particular perspective, but it's pretty hard to work with in
practice.

For example, say you had 2 numbers, each with 10000 digits after the
decimal point, that you wanted to multiply. If it throws an overflow
error for results with more than 16383 digits after the decimal point,
then you'd have to round one or both input numbers before multiplying.
To get the most accurate result possible, you'd actually have to round
each number to have the same number of *significant digits*, rather
than the same number of digits after the decimal point. In general,
that's quite complicated -- suppose, after rounding, you had

  x = [x1 digits] . [x2 digits]
  y = [y1 digits] . [y2 digits]

where x1+x2 = y1+y2 to minimise the error in the final result, and
x2+y2 = 16383 to maximise the result scale. x1 and y1 are known
(though we don't have a convenient way to obtain them), so that's a
pair of simultaneous equations to solve to decide the optimal amount
of rounding to apply before multiplying. And after all that, the
result will only be accurate to around x1+x2 (or y1+y2) significant
digits, and around x1+y1 of those will be before the decimal point, so
even though it will return a result with 16383 digits after the
decimal point, lots of those digits will be completely wrong.

With the new code, you just multiply the numbers and the result is
correctly rounded to 16383 digits.

In general, I'd argue that using numeric isn't about getting exact
results, since nearly all real computations don't have an exact
result. Really, it's about getting high-precision results that would
be impossible with float8. (And if you don't care about numbers with
16383 digits after the decimal point, this change won't affect you.)

Regards,
Dean



pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: [EXTERNAL] Re: Crash in record_type_typmod_compare
Next
From: Tomas Vondra
Date:
Subject: Re: Add ZSON extension to /contrib/