Re: numeric/decimal docs bug? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: numeric/decimal docs bug?
Date
Msg-id 27815.1018629679@sss.pgh.pa.us
Whole thread Raw
In response to Re: numeric/decimal docs bug?  (Thomas Lockhart <lockhart@fourpalms.org>)
List pgsql-hackers
Thomas Lockhart <lockhart@fourpalms.org> writes:
>> All of these seem to have good precision/range arguments for using
>> numeric accumulators, or to be enough off the beaten track that it's
>> not worth much angst to optimize them.

> Well, they *are* on the beaten track for someone, just not you! ;)

> I'd think that things like stddev might be OK with 52 bits of
> accumulation, so could be done with doubles.

ISTM that people who are willing to have it done in a double can simply
write stddev(x::float8).  Of course you will rejoin that if they want
it done in a numeric, they can write stddev(x::numeric) ... but since
we are talking about exact inputs, I would prefer that the default
behavior be to carry out the summation without loss of precision.
The stddev calculation *is* subject to problems if you don't do the
summation as accurately as you can.

> Do we have a need to provide precision greater than
> that, or to guard against the (unlikely) case of having so many values
> that a double-based accumulator overflows its ability to see the next
> value?

You don't see the cancellation problems inherent in N*sum(x^2) - sum(x)^2?
You're likely to be subtracting bignums even with not all that many
input values; they just have to be large input values.

> But in any case, I can ask the same question, only reversed:

> We now have some aggregate functions which use, say, int4 to accumulate
> int4 values, if the target platform does *not* support int8. What would
> it take to make the catalogs configurable or able to respond to
> configuration results so that, for example, platforms without int8
> support could instead use numeric or double values as a substitute?

Haven't thought hard about it.  I will say that I don't like the idea
of changing the declared output type of the aggregates across platforms.
Changing the internal implementation (ie, transtype) would be acceptable
--- but I doubt it's worth the trouble.  In most other arguments that
touch on this point, I seem to be one of the few holdouts for insisting
that we worry about int8-less platforms anymore at all ;-).  For those
few old platforms, the 7.2 behavior of avg(int) and sum(int) is no worse
than it was for everyone in all pre-7.1 versions; I am not excited about
expending significant effort to make it better.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Barry Lind
Date:
Subject: Re: 7.3 schedule
Next
From: Bruce Momjian
Date:
Subject: Re: numeric/decimal docs bug?