Thread: numeric/decimal docs bug?
In datatype.sgml: The type numeric can store numbers of practically unlimited size and precision,... I think this is simply wrong since the current implementation of numeric and decimal data types limit the precision up to 1000. #define NUMERIC_MAX_PRECISION 1000 Comments? -- Tatsuo Ishii
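As an illustration of the limit Tatsuo quotes, a minimal sketch of how it surfaces at the SQL level (the table names are made up and the exact error wording is taken from a later PostgreSQL release, so it may differ on a 7.2-era server):

    -- a declared precision above NUMERIC_MAX_PRECISION is rejected outright
    CREATE TABLE too_precise (x numeric(1001, 0));
    -- ERROR:  NUMERIC precision 1001 must be between 1 and 1000

    -- the largest declarable precision is accepted
    CREATE TABLE just_fits (x numeric(1000, 0));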
Tatsuo Ishii <t-ishii@sra.co.jp> writes: > In datatype.sgml: > The type numeric can store numbers of practically > unlimited size and precision,... > I think this is simply wrong since the current implementation of > numeric and decimal data types limit the precision up to 1000. > #define NUMERIC_MAX_PRECISION 1000 I was thinking just the other day that there's no reason for that limit to be so low. Jan, couldn't we bump it up to 8 or 16K or so? (Not that I'd care to do heavy arithmetic on such numbers, or that I believe there's any practical use for them ... but why set the limit lower than we must?) regards, tom lane
Are there other cases where the pgsql docs say unlimited where they might not be? I remember when the FAQ stated unlimited columns per table (it's been corrected now, so that's good). I'm not asking for every limit to be documented, but while documentation is being written, if one does not yet know (or remember) the actual (or even rough/estimated) limit, it's better to leave it for later than to falsely say "unlimited". Better to have no signal than noise in this case. Regards, Link. At 11:14 PM 02-03-2002 +0900, Tatsuo Ishii wrote: >In datatype.sgml: > > The type numeric can store numbers of practically > unlimited size and precision,... > >I think this is simply wrong since the current implementation of >numeric and decimal data types limit the precision up to 1000. > >#define NUMERIC_MAX_PRECISION 1000 > >Comments?
Tom Lane writes: > > #define NUMERIC_MAX_PRECISION 1000 > > I was thinking just the other day that there's no reason for that > limit to be so low. Jan, couldn't we bump it up to 8 or 16K or so? Why have an arbitrary limit at all? Set it to INT_MAX, or whatever the index variables have for a type. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut <peter_e@gmx.net> writes: > Tom Lane writes: > #define NUMERIC_MAX_PRECISION 1000 >> >> I was thinking just the other day that there's no reason for that >> limit to be so low. Jan, couldn't we bump it up to 8 or 16K or so? > Why have an arbitrary limit at all? Set it to INT_MAX, The hard limit is certainly no more than 64K, since we store these numbers in half of an atttypmod. In practice I suspect the limit may be less; Jan would be more likely to remember... regards, tom lane
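Tom's 64K figure follows from how numeric packs precision and scale into the 32-bit atttypmod: roughly ((precision << 16) | scale) plus the 4-byte VARHDRSZ offset, which leaves only 16 bits for the precision. A sketch of decoding it on a current server (the demo table is made up):

    CREATE TABLE typmod_demo (x numeric(10, 2));

    SELECT atttypmod,
           ((atttypmod - 4) >> 16) & 65535 AS precision,
           (atttypmod - 4) & 65535         AS scale
    FROM pg_attribute
    WHERE attrelid = 'typmod_demo'::regclass AND attname = 'x';
    --  atttypmod | precision | scale
    -- -----------+-----------+-------
    --     655366 |        10 |     2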
Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > Tom Lane writes: > > #define NUMERIC_MAX_PRECISION 1000 > >> > >> I was thinking just the other day that there's no reason for that > >> limit to be so low. Jan, couldn't we bump it up to 8 or 16K or so? > > > Why have an arbitrary limit at all? Set it to INT_MAX, > > The hard limit is certainly no more than 64K, since we store these > numbers in half of an atttypmod. In practice I suspect the limit may > be less; Jan would be more likely to remember... It is arbitrary of course. I don't recall completely, have to dig into the code, but there might be some side effect when mucking with it. The NUMERIC code increases the actual internal precision when doing multiply and divide, what happens a gazillion times when doing higher functions like trigonometry. I think there was some connection between the max precision and how high this internal precision can grow, so increasing the precision might affect the computational performance of such higher functions significantly. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Jan Wieck wrote: > > The hard limit is certainly no more than 64K, since we store these > > numbers in half of an atttypmod. In practice I suspect the limit may > > be less; Jan would be more likely to remember... > > It is arbitrary of course. I don't recall completely, have to > dig into the code, but there might be some side effect when > mucking with it. > > The NUMERIC code increases the actual internal precision when > doing multiply and divide, what happens a gazillion times > when doing higher functions like trigonometry. I think there > was some connection between the max precision and how high > this internal precision can grow, so increasing the precision > might affect the computational performance of such higher > functions significantly. Oh, interesting, maybe we should just leave it alone. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > Jan Wieck wrote: > > > The hard limit is certainly no more than 64K, since we store these > > > numbers in half of an atttypmod. In practice I suspect the limit may > > > be less; Jan would be more likely to remember... > > > > It is arbitrary of course. I don't recall completely, have to > > dig into the code, but there might be some side effect when > > mucking with it. > > > > The NUMERIC code increases the actual internal precision when > > doing multiply and divide, what happens a gazillion times > > when doing higher functions like trigonometry. I think there > > was some connection between the max precision and how high > > this internal precision can grow, so increasing the precision > > might affect the computational performance of such higher > > functions significantly. > > Oh, interesting, maybe we should just leave it alone. As said, I have to look at the code. I'm pretty sure that it currently will not use hundreds of digits internally if you use only a few digits in your schema. So changing it isn't that dangerous. But who's going to write and run a regression test, ensuring that the new high limit can really be supported? I didn't even run the numeric_big test lately, which tests with 500 digits precision at least ... and therefore takes some time (yawn). Increasing the number of digits used you first have to have some other tool to generate the test data (I originally used bc(1) with some scripts). Based on that we still claim that our system deals correctly with up to 1,000 digits precision. I don't like the idea of bumping up that number to some higher nonsense, claiming we support 32K digits precision on exact numeric, and no one ever tested if natural log really returns its result in that precision instead of a 30,000 digit precise approximation. I missed some of the discussion, because I considered the 1,000 digits already being complete nonsense and dropped the thread. So could someone please enlighten me what the real reason for increasing our precision is? AFAIR it had something to do with the docs. If it's just because the docs and the code aren't in sync, I'd vote for changing the docs. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
> Jan Wieck wrote: > > > The hard limit is certainly no more than 64K, since we store these > > > numbers in half of an atttypmod. In practice I suspect the limit may > > > be less; Jan would be more likely to remember... > > > > It is arbitrary of course. I don't recall completely, have to > > dig into the code, but there might be some side effect when > > mucking with it. > > > > The NUMERIC code increases the actual internal precision when > > doing multiply and divide, what happens a gazillion times > > when doing higher functions like trigonometry. I think there > > was some connection between the max precision and how high > > this internal precision can grow, so increasing the precision > > might affect the computational performance of such higher > > functions significantly. > > Oh, interesting, maybe we should just leave it alone. So are we going to just fix the docs? -- Tatsuo Ishii
Jan Wieck wrote: > Bruce Momjian wrote: > > Jan Wieck wrote: > > > > The hard limit is certainly no more than 64K, since we store these > > > > numbers in half of an atttypmod. In practice I suspect the limit may > > > > be less; Jan would be more likely to remember... > > > > > > It is arbitrary of course. I don't recall completely, have to > > > dig into the code, but there might be some side effect when > > > mucking with it. > > > > > > The NUMERIC code increases the actual internal precision when > > > doing multiply and divide, what happens a gazillion times > > > when doing higher functions like trigonometry. I think there > > > was some connection between the max precision and how high > > > this internal precision can grow, so increasing the precision > > > might affect the computational performance of such higher > > > functions significantly. > > > > Oh, interesting, maybe we should just leave it alone. > > As said, I have to look at the code. I'm pretty sure that it > currently will not use hundreds of digits internally if you > use only a few digits in your schema. So changing it isn't > that dangerous. > > But who's going to write and run a regression test, ensuring > that the new high limit can really be supported? I didn't > even run the numeric_big test lately, which tests with 500 > digits precision at least ... and therefore takes some time > (yawn). Increasing the number of digits used you first have > to have some other tool to generate the test data (I > originally used bc(1) with some scripts). Based on that we > still claim that our system deals correctly with up to 1,000 > digits precision. > > I don't like the idea of bumping up that number to some > higher nonsense, claiming we support 32K digits precision on > exact numeric, and no one ever tested if natural log really > returns its result in that precision instead of a 30,000 > digit precise approximation. > > I missed some of the discussion, because I considered the > 1,000 digits already being complete nonsense and dropped the > thread. So could someone please enlighten me what the real > reason for increasing our precision is? AFAIR it had > something to do with the docs. If it's just because the docs > and the code aren't in sync, I'd vote for changing the docs. I have done a little more research on this. If you create a numeric with no precision: CREATE TABLE test (x numeric); You can insert numerics that are greater in length than 1000 digits: INSERT INTO test values ('1111(continues 1010 times)'); You can even do computations on it: SELECT x+1 FROM test; 1000 is pretty arbitrary. If we can handle 1000, I can't see how larger values somehow could fail. Also, the numeric regression test takes much longer than the other tests. I don't see why a test of that length is required, compared to the other tests. Probably time to pair it back a little. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
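Bruce's experiment can be reproduced without typing a thousand digits by hand. A sketch (it assumes the repeat() function, which may not be available on a 7.2-era server, so treat it as illustrative):

    CREATE TABLE test (x numeric);                       -- no precision declared

    -- build a 1010-digit integer literal instead of typing it out
    INSERT INTO test VALUES (repeat('1', 1010)::numeric);

    -- the value round-trips and arithmetic on it still works
    SELECT length(x::text) AS digits, x + 1 > x AS arithmetic_ok FROM test;
    --  digits | arithmetic_ok
    -- --------+---------------
    --    1010 | t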
Bruce Momjian wrote: > Jan Wieck wrote: > > > > I missed some of the discussion, because I considered the > > 1,000 digits already being complete nonsense and dropped the > > thread. So could someone please enlighten me what the real > > reason for increasing our precision is? AFAIR it had > > something to do with the docs. If it's just because the docs > > and the code aren't in sync, I'd vote for changing the docs. > > I have done a little more research on this. If you create a numeric > with no precision: > > CREATE TABLE test (x numeric); > > You can insert numerics that are greater in length than 1000 digits: > > INSERT INTO test values ('1111(continues 1010 times)'); > > You can even do computations on it: > > SELECT x+1 FROM test; > > 1000 is pretty arbitrary. If we can handle 1000, I can't see how larger > values somehow could fail. And I can't see what more than 1,000 digits would be good for. Bruce, your research is neat, but IMHO wasted time. Why do we need to change it now? Is the more important issue (doing the internal storage representation in base 10,000) done yet? If not, we can open up for unlimited precision at that time. Please, adjust the docs for now, drop the issue and let's do something useful. > Also, the numeric regression test takes much longer than the other > tests. I don't see why a test of that length is required, compared to > the other tests. Probably time to pair it back a little. What exactly do you mean with "pair it back"? Shrinking the precision of the test or reducing its coverage of functionality? For the former, it only uses 10 of the possible 1,000 digits after the decimal point. Run the numeric_big test (which uses 800) at least once and you'll see what kind of difference precision makes. And on functionality, it is absolutely insufficient for numerical functionality that has possible carry, rounding etc. issues, to check a function just for one single known value, and if it computes that result correctly, consider it OK for everything. I thought the actual test is sloppy already ... but it's still too much for you ... hmmmm. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
... > Also, the numeric regression test takes much longer than the other > tests. I don't see why a test of that length is required, compared to > the other tests. Probably time to pair it back a little. The numeric types are inherently slow. You might look at what effect you can achieve by restructuring that regression test to more closely resemble the other tests. In particular, it defines several source tables, each of which contains similar initial values. And it defines a results table, into which intermediate results are placed, which are then immediately queried for display and comparison to obtain a test result. If handling the values is slow, we could certainly remove these intermediate steps and still get most of the test coverage. On another related topic: I've been wanting to ask: we have in a few cases moved aggregate calculations from small, fast data types to using numeric as the accumulator. It would be nice imho to allow, say, an int8 accumulator for an int4 data type, rather than requiring numeric. But not all platforms (I assume) have an int8 data type. So we would need to be able to fall back to numeric for those platforms which need to use it. What would it take to make some of the catalogs configurable or sensitive to configuration results? - Thomas
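A sketch of the restructuring Thomas suggests, using hypothetical table and column names rather than the real ones in the regression script: instead of inserting each intermediate result into a results table and then selecting it back, compute and compare in a single statement.

    -- assumed layout: num_data(id, val1, val2) holds the inputs,
    -- num_exp_add(id, expected) holds the precomputed reference results
    SELECT d.id, d.val1 + d.val2 AS computed, e.expected
    FROM num_data d
    JOIN num_exp_add e USING (id)
    WHERE d.val1 + d.val2 <> e.expected;
    -- an empty result set means every addition matched its reference value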
Thomas Lockhart <lockhart@fourpalms.org> writes: > I've been wanting to ask: we have in a few cases moved aggregate > calculations from small, fast data types to using numeric as the > accumulator. Which ones are you concerned about? As of 7.2, the only ones that use numeric accumulators for non-numeric input types are

 aggname  | basetype | aggtransfn | transtype
----------+----------+------------+-----------
 avg      | int8     | int8_accum | _numeric
 sum      | int8     | int8_sum   | numeric
 stddev   | int2     | int2_accum | _numeric
 stddev   | int4     | int4_accum | _numeric
 stddev   | int8     | int8_accum | _numeric
 variance | int2     | int2_accum | _numeric
 variance | int4     | int4_accum | _numeric
 variance | int8     | int8_accum | _numeric

All of these seem to have good precision/range arguments for using numeric accumulators, or to be enough off the beaten track that it's not worth much angst to optimize them. regards, tom lane
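The table above can be regenerated from the system catalogs. The catalog layout has changed since 7.2, so the following is only a sketch of the equivalent query on a modern server, not necessarily the one Tom ran:

    SELECT p.proname                           AS aggname,
           format_type(p.proargtypes[0], NULL) AS basetype,
           a.aggtransfn,
           format_type(a.aggtranstype, NULL)   AS transtype
    FROM pg_aggregate a
    JOIN pg_proc p ON p.oid = a.aggfnoid
    WHERE p.proname IN ('avg', 'sum', 'stddev', 'variance')
    ORDER BY aggname, basetype;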
> Which ones are you concerned about? As of 7.2, the only ones that use > numeric accumulators for non-numeric input types are ... OK, I did imply that I've been wanting to ask this for some time. I should have asked during the 7.1 era, when this was true for more cases. :) > All of these seem to have good precision/range arguments for using > numeric accumulators, or to be enough off the beaten track that it's > not worth much angst to optimize them. Well, they *are* on the beaten track for someone, just not you! ;) I'd think that things like stddev might be OK with 52 bits of accumulation, so could be done with doubles. Were they implemented that way at one time? Do we have a need to provide precision greater than that, or to guard against the (unlikely) case of having so many values that a double-based accumulator overflows its ability to see the next value? I'll point out that for the case of accumulating so many integers that they can't work with a double, the alternative implementation of using numeric may approach infinite computation time. But in any case, I can ask the same question, only reversed: We now have some aggregate functions which use, say, int4 to accumulate int4 values, if the target platform does *not* support int8. What would it take to make the catalogs configurable or able to respond to configuration results so that, for example, platforms without int8 support could instead use numeric or double values as a substitute? - Thomas
Thomas Lockhart <lockhart@fourpalms.org> writes: >> All of these seem to have good precision/range arguments for using >> numeric accumulators, or to be enough off the beaten track that it's >> not worth much angst to optimize them. > Well, they *are* on the beaten track for someone, just not you! ;) > I'd think that things like stddev might be OK with 52 bits of > accumulation, so could be done with doubles. ISTM that people who are willing to have it done in a double can simply write stddev(x::float8). Of course you will rejoin that if they want it done in a numeric, they can write stddev(x::numeric) ... but since we are talking about exact inputs, I would prefer that the default behavior be to carry out the summation without loss of precision. The stddev calculation *is* subject to problems if you don't do the summation as accurately as you can. > Do we have a need to provide precision greater than > that, or to guard against the (unlikely) case of having so many values > that a double-based accumulator overflows its ability to see the next > value? You don't see the cancellation problems inherent in N*sum(x^2) - sum(x)^2? You're likely to be subtracting bignums even with not all that many input values; they just have to be large input values. > But in any case, I can ask the same question, only reversed: > We now have some aggregate functions which use, say, int4 to accumulate > int4 values, if the target platform does *not* support int8. What would > it take to make the catalogs configurable or able to respond to > configuration results so that, for example, platforms without int8 > support could instead use numeric or double values as a substitute? Haven't thought hard about it. I will say that I don't like the idea of changing the declared output type of the aggregates across platforms. Changing the internal implementation (ie, transtype) would be acceptable --- but I doubt it's worth the trouble. In most other arguments that touch on this point, I seem to be one of the few holdouts for insisting that we worry about int8-less platforms anymore at all ;-). For those few old platforms, the 7.2 behavior of avg(int) and sum(int) is no worse than it was for everyone in all pre-7.1 versions; I am not excited about expending significant effort to make it better. regards, tom lane
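A sketch of the cancellation Tom is describing, with made-up data: a large common offset makes the two terms of the textbook formula nearly equal, so in float8 their difference is lost to rounding, while summing in numeric keeps it exact.

    CREATE TEMP TABLE offset_demo (x float8);
    INSERT INTO offset_demo VALUES (1e9 + 4), (1e9 + 5), (1e9 + 6);

    -- the true value of N*sum(x^2) - sum(x)^2 for this data is exactly 6,
    -- but both terms are about 9e18, far beyond float8's 15-16 digits
    SELECT count(*) * sum(x * x) - sum(x) * sum(x)        AS float8_result,
           count(*) * sum(x::numeric * x::numeric)
               - sum(x::numeric) * sum(x::numeric)        AS numeric_result
    FROM offset_demo;
    -- the float8 column comes back as rounding noise (possibly zero),
    -- while the numeric column is exactly 6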
Jan Wieck wrote: > Bruce Momjian wrote: > > Jan Wieck wrote: > > > > I missed some of the discussion, because I considered the > > > 1,000 digits already being complete nonsense and dropped the > > > thread. So could someone please enlighten me what the real > > > reason for increasing our precision is? AFAIR it had > > > something to do with the docs. If it's just because the docs > > > and the code aren't in sync, I'd vote for changing the docs. > > > > I have done a little more research on this. If you create a numeric > > with no precision: > > > > CREATE TABLE test (x numeric); > > > > You can insert numerics that are greater in length than 1000 digits: > > > > INSERT INTO test values ('1111(continues 1010 times)'); > > > > You can even do computations on it: > > > > SELECT x+1 FROM test; > > > > 1000 is pretty arbitrary. If we can handle 1000, I can't see how larger > > values somehow could fail. > > And I can't see what more than 1,000 digits would be good > for. Bruce, your research is neat, but IMHO wasted time. > > Why do we need to change it now? Is the more important issue > (doing the internal storage representation in base 10,000) > done yet? If not, we can open up for unlimited precision at > that time. I certainly would like the 10,000 change done, but few of us are capable of doing it. :-( > Please, adjust the docs for now, drop the issue and let's do > something useful. That's how I got started. The problem is that the limit isn't 1,000. Looking at NUMERIC_MAX_PRECISION, I see it used in gram.y to prevent creation of NUMERIC columns that exceed the maximum length, and I see it used in numeric.c to prevent exponents that exceed the maximum length, but I don't see anything that would actually enforce the limit on INSERT and in other code paths. Remember how people complained when I said "unlimited" in the FAQ for some items that actually had a limit. Well, in this case, we have a limit that is only enforced in some places. I would like to see this cleared up one way or the other so the docs would be correct. Jan, any chance of doing the 10,000 change in your spare time? ;-) > > Also, the numeric regression test takes much longer than the other > > tests. I don't see why a test of that length is required, compared to > > the other tests. Probably time to pair it back a little. > > What exactly do you mean with "pair it back"? Shrinking the > precision of the test or reducing its coverage of > functionality? > > For the former, it only uses 10 of the possible 1,000 digits > after the decimal point. Run the numeric_big test (which > uses 800) at least once and you'll see what kind of > difference precision makes. > > And on functionality, it is absolutely insufficient for > numerical functionality that has possible carry, rounding > etc. issues, to check a function just for one single known > value, and if it computes that result correctly, consider it > OK for everything. > > I thought the actual test is sloppy already ... but it's > still too much for you ... hmmmm. Well, our regression tests are not intended to test every possible NUMERIC combination, just a reasonable subset. As it is now, I often think the regression tests have hung because numeric takes so much longer than any of the other tests. We have had this code in there for a while now, and it is not OS-specific stuff, so I think we should just pair it back so we know it is working. We already have bignumeric for a larger test.
-- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
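For contrast with the unconstrained column shown earlier: where a precision has been declared, the stored value is checked against it on INSERT (via numeric's typmod handling). A short sketch, with a made-up table and an error text taken from a later release:

    CREATE TABLE paycheck (amount numeric(5, 2));
    INSERT INTO paycheck VALUES (12345.67);
    -- ERROR:  numeric field overflow
    -- (precision 5, scale 2 only allows absolute values below 10^3)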
Bruce Momjian wrote: > Well, our regression tests are not intended to test every possible > NUMERIC combination, just a reasonable subset. As it is now, I often > think the regression tests have hung because numeric takes so much > longer than any of the other tests. We have had this code in there for > a while now, and it is not OS-specific stuff, so I think we should just > pair it back so we know it is working. We already have bignumeric for a > larger test. Bruce, have you even taken one single look at the test? It does 100 of each add, sub, mul and div, these are the fast operations that don't really take much time. Then it does 10 of each sqrt(), ln(), log10(), pow10() and 10 combined power(ln()). These are the time consuming operations, working iterative alas Newton, Taylor and McLaurin. All that is done with 10 digits after the decimal point only! So again, WHAT exactly do you mean with "pair it back"? Sorry, I don't get it. Do you want to remove the entire test? Reduce it to an INSERT, one SELECT (so that we know the input- and output functions work) and the four basic operators used once? Well, that's a hell of a test, makes me really feel comfortable. Like the mechanic kicking against the tire then saying "I ain't see noth'n wrong with the brakes, ya sure can make a trip in the mountains". Yeah, at least once! Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
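A quick way to see the split Jan describes between the cheap operators and the iterative functions, with made-up sample data (generate_series and psql's \timing are assumed here and postdate some of the servers discussed in this thread):

    CREATE TEMP TABLE num_sample AS
        SELECT (random() * 1000)::numeric(30, 10) AS x
        FROM generate_series(1, 10000);

    \timing on
    -- the four basic operators: cheap even over many rows
    SELECT count(x + x), count(x - x), count(x * x), count(x / (x + 1)) FROM num_sample;
    -- the iterative functions: each call runs a Newton/Taylor-style loop internally
    SELECT count(sqrt(x)), count(ln(x + 1)), count(exp(x / 1000)) FROM num_sample;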
Jan Wieck wrote: > Bruce Momjian wrote: > > Well, our regression tests are not intended to test every possible > > NUMERIC combination, just a resonable subset. As it is now, I often > > think the regression tests have hung because numeric takes so much > > longer than any of the other tests. We have had this code in there for > > a while now, and it is not OS-specific stuff, so I think we should just > > pair it back so we know it is working. We already have bignumeric for a > > larger test. > > Bruce, > > have you even taken one single look at the test? It does 100 > of each add, sub, mul and div, these are the fast operations > that don't really take much time. > > Then it does 10 of each sqrt(), ln(), log10(), pow10() and 10 > combined power(ln()). These are the time consuming > operations, working iterative alas Newton, Taylor and > McLaurin. All that is done with 10 digits after the decimal > point only! > > So again, WHAT exactly do you mean with "pair it back"? > Sorry, I don't get it. Do you want to remove the entire test? > Reduce it to an INSERT, one SELECT (so that we know the > input- and output functions work) and the four basic > operators used once? Well, that's a hell of a test, makes me > really feel comfortable. Like the mechanic kicking against > the tire then saying "I ain't see noth'n wrong with the > brakes, ya sure can make a trip in the mountains". Yeah, at > least once! Jan, regression is not a test of the level a developer would use to make sure his code works. It is merely to make sure the install works on a limited number of cases. Having seen zero reports of any numeric failures since we installed it, and seeing it takes >10x times longer than the other tests, I think it should be paired back. Do we really need 10 tests of each complex function? I think one would do the trick. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > Jan Wieck wrote: > > Bruce Momjian wrote: > > > Well, our regression tests are not intended to test every possible > > > NUMERIC combination, just a reasonable subset. As it is now, I often > > > think the regression tests have hung because numeric takes so much > > > longer than any of the other tests. We have had this code in there for > > > a while now, and it is not OS-specific stuff, so I think we should just > > > pair it back so we know it is working. We already have bignumeric for a > > > larger test. > > > > Bruce, > > > > have you even taken one single look at the test? It does 100 > > of each add, sub, mul and div, these are the fast operations > > that don't really take much time. > > > > Then it does 10 of each sqrt(), ln(), log10(), pow10() and 10 > > combined power(ln()). These are the time consuming > > operations, working iterative alas Newton, Taylor and > > McLaurin. All that is done with 10 digits after the decimal > > point only! > > > > So again, WHAT exactly do you mean with "pair it back"? > > Sorry, I don't get it. Do you want to remove the entire test? > > Reduce it to an INSERT, one SELECT (so that we know the > > input- and output functions work) and the four basic > > operators used once? Well, that's a hell of a test, makes me > > really feel comfortable. Like the mechanic kicking against > > the tire then saying "I ain't see noth'n wrong with the > > brakes, ya sure can make a trip in the mountains". Yeah, at > > least once! > > Jan, regression is not a test of the level a developer would use to make > sure his code works. It is merely to make sure the install works on a > limited number of cases. Having seen zero reports of any numeric > failures since we installed it, and seeing it takes >10x times longer > than the other tests, I think it should be paired back. Do we really > need 10 tests of each complex function? I think one would do the trick. You forgot who wrote that code originally. I feel a lot better WITH the tests in place :-) And if it's merely to make sure the install worked, man who is doing source installations these days and runs the regression tests anyway? Most people throw in an RPM or the like, only a few serious users install from sources, and only a fistful of them then runs regression. Isn't it mostly developers and distro-maintainers who use that directory? I think your entire point isn't just weak, IMNSVHO you don't really have a point. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Jan Wieck wrote: > You forgot who wrote that code originally. I feel a lot > better WITH the tests in place :-) > > And if it's merely to make sure the install worked, man who > is doing source installations these days and runs the > regression tests anyway? Most people throw in an RPM or the > like, only a few serious users install from sources, and only > a fistful of them then runs regression. > > Isn't it mostly developers and distro-maintainers who use > that directory? I think your entire point isn't just weak, > IMNSVHO you don't really have a point. It is my understanding that RPM does run that test. My main issue is why does numeric have to be so much larger than the other tests? I have not heard that explained. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
... > Jan, regression is not a test of the level a developer would use to make > sure his code works. It is merely to make sure the install works on a > limited number of cases. Having seen zero reports of any numeric > failures since we installed it, and seeing it takes >10x times longer > than the other tests, I think it should be paired back. Do we really > need 10 tests of each complex function? I think one would do the trick. Whoops. We rely on the regression tests to make sure that previous behaviors continue to be valid behaviors. Another use is to verify that a particular installation can reproduce this same test. But regression testing is a fundamental and essential development tool, precisely because it covers cases outside the range you might be thinking of testing as you do development. As a group, we might tend to underestimate the value of this, which could be evidenced by the fact that our regression test suite has not grown substantially over the years. It could have many more tests within each module, and bug reports *could* be fed back into regression updates to make sure that failures do not reappear. All imho of course ;) - Thomas
... > It is my understanding that RPM does run that test. My main issue is > why does numeric have to be so much larger than the other tests? I have > not heard that explained. afaict it is not larger. It *does* take more time, but the number of tests is relatively small, or at least comparable to the number of tests which appear, or should appear, in other tests of data types covering a large problem space (e.g. date/time). It does illustrate that BCD-like encodings are expensive, and that machine-supported math is usually a win. If it is a big deal, jump in and widen the internal math operations! - Thomas
Bruce Momjian wrote: > Jan Wieck wrote: > > You forgot who wrote that code originally. I feel a lot > > better WITH the tests in place :-) > > > > And if it's merely to make sure the install worked, man who > > is doing source installations these days and runs the > > regression tests anyway? Most people throw in an RPM or the > > like, only a few serious users install from sources, and only > > a fistful of them then runs regression. > > > > Isn't it mostly developers and distro-maintainers who use > > that directory? I think your entire point isn't just weak, > > IMNSVHO you don't really have a point. > > It is my understanding that RPM does run that test. My main issue is > why does numeric have to be so much larger than the other tests? I have > not heard that explained. Well, I heard Thomas commenting that it's implemented horribly slowly (or so; I don't recall his exact wording). But he's right. I think the same test done with float8 would run in less than a tenth of that time. This is only an explanation of "why it takes so long"; it is no argument pro or con the test itself. I think I made my point clear enough, that I consider calling these functions just once is plain sloppy. But that's just my opinion. What do others think? Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Jan Wieck <janwieck@yahoo.com> writes: > I think I made my point clear enough, that I consider calling > these functions just once is plain sloppy. But that's just > my opinion. What do others think? I don't have a problem with the current length of the numeric test. The original form of it (now shoved over to bigtests) did seem excessively slow to me ... but I can live with this one. I do agree that someone ought to reimplement numeric using base10k arithmetic ... but it's not bugging me so much that I'm likely to get around to it anytime soon myself ... Bruce, why is there no TODO item for that project? regards, tom lane
Thomas Lockhart wrote: > ... > > It is my understanding that RPM does run that test. My main issue is > > why does numeric have to be so much larger than the other tests? I have > > not heard that explained. > > afaict it is not larger. It *does* take more time, but the number of > tests is relatively small, or at least comparable to the number of > tests which appear, or should appear, in other tests of data types > covering a large problem space (e.g. date/time). > > It does illustrate that BCD-like encodings are expensive, and that > machine-supported math is usually a win. If it is a big deal, jump in > and widen the internal math operations! OK, as long as everyone else is fine with the tests, we can leave it alone. The concept that the number of tests is realistic, and that they are just slower than other data types, makes sense. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Tom Lane wrote: > Jan Wieck <janwieck@yahoo.com> writes: > > I think I made my point clear enough, that I consider calling > > these functions just once is plain sloppy. But that's just > > my opinion. What do others think? > > I don't have a problem with the current length of the numeric test. > The original form of it (now shoved over to bigtests) did seem > excessively slow to me ... but I can live with this one. > > I do agree that someone ought to reimplement numeric using base10k > arithmetic ... but it's not bugging me so much that I'm likely > to get around to it anytime soon myself ... > > Bruce, why is there no TODO item for that project? Not sure. I was aware of it for a while. Added: * Change NUMERIC data type to use base 10,000 internally -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Tatsuo Ishii wrote: > > Jan Wieck wrote: > > > > The hard limit is certainly no more than 64K, since we store these > > > > numbers in half of an atttypmod. In practice I suspect the limit may > > > > be less; Jan would be more likely to remember... > > > > > > It is arbitrary of course. I don't recall completely, have to > > > dig into the code, but there might be some side effect when > > > mucking with it. > > > > > > The NUMERIC code increases the actual internal precision when > > > doing multiply and divide, what happens a gazillion times > > > when doing higher functions like trigonometry. I think there > > > was some connection between the max precision and how high > > > this internal precision can grow, so increasing the precision > > > might affect the computational performance of such higher > > > functions significantly. > > > > Oh, interesting, maybe we should just leave it alone. > > So are we going to just fix the docs? OK, I have updated the docs. Patch attached. I have also added this to the TODO list: * Change NUMERIC to enforce the maximum precision, and increase it -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026

Index: datatype.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v
retrieving revision 1.87
diff -c -r1.87 datatype.sgml
*** datatype.sgml	3 Apr 2002 05:39:27 -0000	1.87
--- datatype.sgml	13 Apr 2002 01:26:54 -0000
***************
*** 506,518 ****
  <title>Arbitrary Precision Numbers</title>

  <para>
! The type <type>numeric</type> can store numbers of practically
! unlimited size and precision, while being able to store all
! numbers and carry out all calculations exactly.  It is especially
! recommended for storing monetary amounts and other quantities
! where exactness is required.  However, the <type>numeric</type>
! type is very slow compared to the floating-point types described
! in the next section.
  </para>

  <para>
--- 506,517 ----
  <title>Arbitrary Precision Numbers</title>

  <para>
! The type <type>numeric</type> can store numbers with up to 1,000
! digits of precision and perform calculations exactly.  It is
! especially recommended for storing monetary amounts and other
! quantities where exactness is required.  However, the
! <type>numeric</type> type is very slow compared to the
! floating-point types described in the next section.
  </para>

  <para>
Jan Wieck wrote: > > Oh, interesting, maybe we should just leave it alone. > > As said, I have to look at the code. I'm pretty sure that it > currently will not use hundreds of digits internally if you > use only a few digits in your schema. So changing it isn't > that dangerous. > > But who's going to write and run a regression test, ensuring > that the new high limit can really be supported? I didn't > even run the numeric_big test lately, which tests with 500 > digits precision at least ... and therefore takes some time > (yawn). Increasing the number of digits used you first have > to have some other tool to generate the test data (I > originally used bc(1) with some scripts). Based on that we > still claim that our system deals correctly with up to 1,000 > digits precision. > > I don't like the idea of bumping up that number to some > higher nonsense, claiming we support 32K digits precision on > exact numeric, and no one ever tested if natural log really > returns its result in that precision instead of a 30,000 > digit precise approximation. > > I missed some of the discussion, because I considered the > 1,000 digits already being complete nonsense and dropped the > thread. So could someone please enlighten me what the real > reason for increasing our precision is? AFAIR it had > something to do with the docs. If it's just because the docs > and the code aren't in sync, I'd vote for changing the docs. Jan, if the numeric code works on 100 or 500 digits, could it break with 10,000 digits? Is there a reason to believe more digits could cause problems not present in shorter tests? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> Jan, regression is not a test of the level a developer would use to make > sure his code works. It is merely to make sure the install works on a > limited number of cases. News to me! If anything, I don't think a lot of the current regression tests are comprehensive enough! For the SET/DROP NOT NULL patch I submitted, I included a regression test that tests every one of the preconditions in my code - that way if anything gets changed or broken, we'll find out very quickly. I personally don't have a problem with the time taken to regression test - and I think that trimming the numeric test _might_ be a false economy. Who knows what's going to turn around and bite us one day? > Having seen zero reports of any numeric > failures since we installed it, and seeing it takes >10x times longer > than the other tests, I think it should be paired back. Do we really > need 10 tests of each complex function? I think one would do the trick. A good point tho, I didn't submit a regression test that tries to ALTER 3 different non-existent tables to check for failures - one test was enough... Chris
Christopher Kings-Lynne wrote: > > Having seen zero reports of any numeric > > failures since we installed it, and seeing it takes >10x times longer > > than the other tests, I think it should be paired back. Do we really > > need 10 tests of each complex function? I think one would do the trick. > > A good point tho, I didn't submit a regression test that tries to ALTER 3 > different non-existent tables to check for failures - one test was enough... That was my point. Is there much value in testing each function ten times? Anyway, seems only I care so I will drop it. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > Christopher Kings-Lynne wrote: > > > Having seen zero reports of any numeric > > > failures since we installed it, and seeing it takes >10x times longer > > > than the other tests, I think it should be paired back. Do we really > > > need 10 tests of each complex function? I think one would do the trick. > > > > A good point tho, I didn't submit a regression test that tries to ALTER 3 > > different non-existent tables to check for failures - one test was enough... > > That was my point. Is there much value in testing each function ten > times? Anyway, seems only I care so I will drop it. Yes, there is value in it. There is conditional code in it that depends on the values. I wrote that before (I said there are possible carry, rounding etc. issues), and it looked to me that you simply ignored these facts. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #