Thread: numeric data type on 6.5

numeric data type on 6.5

From
Tatsuo Ishii
Date:
I thought numeric data type on 6.5 allows a very large precision. Am I 
missing something?
--
Tatsuo Ishii

test=> create table t1(n numeric(100,0));
CREATE
test=> \d t1;
Table    = t1
+----------------------------------+----------------------------------+-------+
|              Field               |              Type                | Length|
+----------------------------------+----------------------------------+-------+
| n                                | numeric                          |   var |
+----------------------------------+----------------------------------+-------+
test=> insert into t1 values(100000000000000000000000000000);
NOTICE:  Integer input '100000000000000000000000000000' is out of range; promoted to float
INSERT 149033 1
test=> select * from t1;
n
-
1
(1 row)


Re: [HACKERS] numeric data type on 6.5

From
"D'Arcy" "J.M." Cain
Date:
Thus spake Tatsuo Ishii
> I thought numeric data type on 6.5 allows a very large precision. Am I 
> missing something?
[...]
> test=> insert into t1 values(100000000000000000000000000000);
> NOTICE:  Integer input '100000000000000000000000000000' is out of range; promoted to float

Try this.
insert into t1 values('100000000000000000000000000000'::numeric);

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] numeric data type on 6.5

From
Tatsuo Ishii
Date:
>> I thought numeric data type on 6.5 allows a very large precision. Am I 
>> missing something?
>[...]
>> test=> insert into t1 values(100000000000000000000000000000);
>> NOTICE:  Integer input '100000000000000000000000000000' is out of range; promoted to float
>
>Try this.
>insert into t1 values('100000000000000000000000000000'::numeric);

Thanks. It definitely works!
--
Tatsuo Ishii


Re: [HACKERS] numeric data type on 6.5

From
jwieck@debis.com (Jan Wieck)
Date:
>
> >> I thought numeric data type on 6.5 allows a very large precision. Am I
> >> missing something?
> >[...]
> >> test=> insert into t1 values(100000000000000000000000000000);
> >> NOTICE:  Integer input '100000000000000000000000000000' is out of range; promoted to float
> >
> >Try this.
> >insert into t1 values('100000000000000000000000000000'::numeric);
>
> Thanks. It definitely works!

    insert into t1 values('100000000000000000000000000000');


    That one too.

    The  problem is that the yacc parser already tries to convert
    it into an integer or float if you omit the quotes. I'll  try
    to  implement  a  NUMERIC  fallback for this case for 6.6 and
    then have all the auto conversion functionality  so  NUMERIC,
    INTEGER and FLOAT can be used mixed.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] numeric data type on 6.5

From
Thomas Lockhart
Date:
>     The  problem is that the yacc parser already tries to convert
>     it into an integer or float if you omit the quotes. I'll  try
>     to  implement  a  NUMERIC  fallback for this case for 6.6 and
>     then have all the auto conversion functionality  so  NUMERIC,
>     INTEGER and FLOAT can be used mixed.

I'm looking at this right now. I had coded in a fallback to FLOAT8 for
the integer types because at the time that was the only other useful
numeric type. However, I'm going to try changing the code to leave a
failed INTx token as a string of unspecified type, which would be
typed and converted later using the automatic coersion mechanism.

istm that this would be a no-brainer for v6.5, since it is just
replacing one heuristic with a more general and more correct one. And
I had implemented both, so we know who to blame :)

Will let y'all know how it goes...
                      - Tom

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] numeric data type on 6.5

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> I'm looking at this right now. I had coded in a fallback to FLOAT8 for
> the integer types because at the time that was the only other useful
> numeric type. However, I'm going to try changing the code to leave a
> failed INTx token as a string of unspecified type, which would be
> typed and converted later using the automatic coersion mechanism.

That would be good as far as it goes, but what about cases with a
decimal point in 'em?  Converting to float and then to numeric will
lose precision.

I'm inclined to think you should prevent the parser from converting
*any* numeric constant out of string form until it knows the target data
type.

(IIRC, INT8 has problems similar to NUMERIC's...)
        regards, tom lane


Re: [HACKERS] numeric data type on 6.5

From
Thomas Lockhart
Date:
> >     The  problem is that the yacc parser already tries to convert
> >     it into an integer or float if you omit the quotes.
> ... I'm going to try changing the code to leave a
> failed INTx token as a string of unspecified type, which would be
> typed and converted later using the automatic coersion mechanism.

OK, this seems to work:

postgres=> create table t1 (n numeric(20,0));
CREATE
postgres=> insert into t1 values ('10000000000000000000');
INSERT 18552 1
postgres=> insert into t1 values (20000000000000000000);
INSERT 18553 1
postgres=> select * from t1;                  n
--------------------
10000000000000000000
20000000000000000000

postgres=> select n * 5000000000000000000000000000000 from t1;
---------------------------------------------------50000000000000000000000000000000000000000000000000
100000000000000000000000000000000000000000000000000

But, there are some cases which aren't transparent:

postgres=> select 10000000000000000000000000*2;
ERROR:  pg_atoi: error reading "10000000000000000000000000": Numerical
result out of range
postgres=> select 10000000000000000000000000*2::numeric;
--------------------------
20000000000000000000000000

And, if a long numeric string is entered, it actually stays a string
all the way through (never being converted to anything internal):

postgres=> select 400000000000000000000000000000000000000000000000000;
---------------------------------------------------
400000000000000000000000000000000000000000000000000

Comments?

btw, I've got some float8->numeric conversion troubles on my
i686/Linux box:

postgres=> insert into t1 values ('30000000000000000000'::float8);
INSERT 18541 1
postgres=> select * from t1;                n
------------------                3

Any ideas on this last one?

I'm running from this morning's development tree...
                           - Tom

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] numeric data type on 6.5

From
Thomas Lockhart
Date:
> btw, I've got some float8->numeric conversion troubles on my
> i686/Linux box:
> postgres=> insert into t1 values ('30000000000000000000'::float8);
> INSERT 18541 1
> postgres=> select * from t1;
>                  n
> ------------------
>                  3

OK, so the problem is that large floats are printed using exponential
notation, and the float8->numeric conversion routine uses the
float8out() routine to convert to a string in preparation for
ingestion as a numeric type. I've modified my copy of float8_numeric()
to instead print directly into a (large!) buffer using the "%f"
specifier, to ensure that the string is always compatible with the
numeric reader:

postgres=> create table t1 (f float8, n numeric(20,2), d
decimal(20,2));
CREATE
postgres=> insert into t1 values (300.1);
INSERT 18641 1
postgres=> insert into t1 values (300000000000000000);
INSERT 18642 1
postgres=> update t1 set n = f, d = f;
UPDATE 2
postgres=> select * from t1;
f    |                    n|                    d
-----+---------------------+---------------------
300.1|               300.10|               300.10
3e+17|300000000000000000.00|300000000000000000.00
(2 rows)

The float8_numeric() code already had checked for NULL and NaN, so I
think this does not lose functionality. What do you think Jan? Should
I make the change? Or is there another way??
                     - Tom

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] numeric data type on 6.5

From
jwieck@debis.com (Jan Wieck)
Date:
Thomas Lockhart wrote:

> The float8_numeric() code already had checked for NULL and NaN, so I
> think this does not lose functionality. What do you think Jan? Should
> I make the change? Or is there another way??

    Think it's O.K. - commit the changes.

    The  other way would be to enhance the NUMERIC input function
    to read exponential notation. But  I  wouldn't  do  this  now
    because  I've planned to someday implement NUMERIC again from
    scratch.

    The current implementation has a packed  storage  format  and
    the  arithmetic  operations  are  based on a character format
    (each digit is stored in one byte).  After thinking about  it
    I  discovered,  that  storing  the  value internally in short
    int's (16 bit) and base 10000 would have some advantages.

    1.  No need to pack/unpack storage format for computations.

    2.  One arithmetic operation in  the  innermost  loops  (only
        add/subtract  are really implemented) mucks with 4 digits
        at a time.

    The disadvantages are small. Base 10000 to base 10  (decimal)
    conversion  is easily to parse/print. Only rounding functions
    will be a little tricky.  I think  the  speedup  gained  from
    adding/subtracting  4 digits per loop iteration will be worth
    the efford.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] numeric data type on 6.5

From
Thomas Lockhart
Date:
> > I'm looking at this right now. I had coded in a fallback to FLOAT8 for
> > the integer types because at the time that was the only other useful
> > numeric type. However, I'm going to try changing the code to leave a
> > failed INTx token as a string of unspecified type, which would be
> > typed and converted later using the automatic coersion mechanism.
> That would be good as far as it goes, but what about cases with a
> decimal point in 'em?  Converting to float and then to numeric will
> lose precision.
> I'm inclined to think you should prevent the parser from converting
> *any* numeric constant out of string form until it knows the target data
> type.
> (IIRC, INT8 has problems similar to NUMERIC's...)

Right. Here is a patch which tries to do something right for most
cases. For the "integer" token (numbers w/o a decimal point) it keeps
the token as a string if the conversion to int4 fails. I split the
"real" token into "decimal" (w/o exponent) and "real"; at the moment
"decimal" is forced to become a float8 if there are fewer than 18
characters in the string, but there may be a more robust strategy to
be had.

When a numeric token is kept as a string, the parser requires some
typing context to handle the string later, otherwise it will complain.
But that is probably better than silently swallowing numeric data and
possibly mishandling it.

Seems to do OK with numeric tokens of unspecified type which will
become int8 and numeric in the parser. There may be some edge-effect
cases (e.g. decimal data with 17 characters) which aren't quite right.

Comments?

                         - Tom

--
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California*** ../src/backend/parser/scan.l.orig    Tue Mar 30 15:08:02 1999
--- ../src/backend/parser/scan.l    Tue Apr 27 16:29:21 1999
***************
*** 152,161 ****
  xmstop            -

  integer            [\-]?{digit}+
  /*
- real            [\-]?{digit}+\.{digit}+([Ee][-+]?{digit}+)?
- */
  real            [\-]?(((({digit}*\.{digit}+)|({digit}+\.{digit}*))([Ee][-+]?{digit}+)?)|({digit}+[Ee][-+]?{digit}+))

  param            \${integer}

--- 152,162 ----
  xmstop            -

  integer            [\-]?{digit}+
+ decimal            [\-]?(({digit}*\.{digit}+)|({digit}+\.{digit}*))
+ real            [\-]?((({digit}*\.{digit}+)|({digit}+\.{digit}*)|({digit}+))([Ee][-+]?{digit}+))
  /*
  real            [\-]?(((({digit}*\.{digit}+)|({digit}+\.{digit}*))([Ee][-+]?{digit}+)?)|({digit}+[Ee][-+]?{digit}+))
+ */

  param            \${integer}

***************
*** 334,348 ****
--- 335,369 ----
                      if (*endptr != '\0' || errno == ERANGE)
                      {
                          errno = 0;
+ #if 0
                          yylval.dval = strtod(((char *)yytext),&endptr);
                          if (*endptr != '\0' || errno == ERANGE)
                              elog(ERROR,"Bad integer input '%s'",yytext);
                          CheckFloat8Val(yylval.dval);
                          elog(NOTICE,"Integer input '%s' is out of range; promoted to float", yytext);
                          return FCONST;
+ #endif
+                         yylval.str = pstrdup((char*)yytext);
+                         return SCONST;
                      }
                      return ICONST;
                  }
+ {decimal}/{space}*-{number} {
+                     char* endptr;
+
+                     BEGIN(xm);
+                     if (strlen((char *)yytext) <= 17)
+                     {
+                         errno = 0;
+                         yylval.dval = strtod(((char *)yytext),&endptr);
+                         if (*endptr != '\0' || errno == ERANGE)
+                             elog(ERROR,"Bad float8 input '%s'",yytext);
+                         CheckFloat8Val(yylval.dval);
+                         return FCONST;
+                     }
+                     yylval.str = pstrdup((char*)yytext);
+                     return SCONST;
+                 }
  {real}/{space}*-{number} {
                      char* endptr;

***************
*** 362,375 ****
--- 383,415 ----
                      if (*endptr != '\0' || errno == ERANGE)
                      {
                          errno = 0;
+ #if 0
                          yylval.dval = strtod(((char *)yytext),&endptr);
                          if (*endptr != '\0' || errno == ERANGE)
                              elog(ERROR,"Bad integer input '%s'",yytext);
                          CheckFloat8Val(yylval.dval);
                          elog(NOTICE,"Integer input '%s' is out of range; promoted to float", yytext);
                          return FCONST;
+ #endif
+                         yylval.str = pstrdup((char*)yytext);
+                         return SCONST;
                      }
                      return ICONST;
+                 }
+ {decimal}        {
+                     char* endptr;
+
+                     if (strlen((char *)yytext) <= 17)
+                     {
+                         errno = 0;
+                         yylval.dval = strtod((char *)yytext,&endptr);
+                         if (*endptr != '\0' || errno == ERANGE)
+                             elog(ERROR,"Bad float input '%s'",yytext);
+                         CheckFloat8Val(yylval.dval);
+                         return FCONST;
+                     }
+                     yylval.str = pstrdup((char*)yytext);
+                     return SCONST;
                  }
  {real}            {
                      char* endptr;

Re: [HACKERS] numeric data type on 6.5

From
Bruce Momjian
Date:
> > > I'm looking at this right now. I had coded in a fallback to FLOAT8 for
> > > the integer types because at the time that was the only other useful
> > > numeric type. However, I'm going to try changing the code to leave a
> > > failed INTx token as a string of unspecified type, which would be
> > > typed and converted later using the automatic coersion mechanism.
> > That would be good as far as it goes, but what about cases with a
> > decimal point in 'em?  Converting to float and then to numeric will
> > lose precision.
> > I'm inclined to think you should prevent the parser from converting
> > *any* numeric constant out of string form until it knows the target data
> > type.
> > (IIRC, INT8 has problems similar to NUMERIC's...)
> 
> Right. Here is a patch which tries to do something right for most
> cases. For the "integer" token (numbers w/o a decimal point) it keeps
> the token as a string if the conversion to int4 fails. I split the
> "real" token into "decimal" (w/o exponent) and "real"; at the moment
> "decimal" is forced to become a float8 if there are fewer than 18
> characters in the string, but there may be a more robust strategy to
> be had.

This seems like a perfect approach.



--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] numeric data type on 6.5

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Seems to do OK with numeric tokens of unspecified type which will
> become int8 and numeric in the parser. There may be some edge-effect
> cases (e.g. decimal data with 17 characters) which aren't quite right.
> Comments?

I'd suggest backing off one more place on the length of string you will
try to convert to a float8.  Since the test is strlen() <= 17, you
actually can have at most 16 digits (there must be a decimal point in
there too).  But IEEE float is only good to 16-and-change digits; I'm
not sure I'd want to assume that the 16th digit will always be
reproduced exactly.  15 digits would be safer.

It could still break if the C library's float<=>string conversion
routines are sloppy :-(.  I suppose you're interested in preserving
the info that "this constant looks numeric-ish" to assist in type
resolution heuristics?  Otherwise the value could be left in string
form till later.

Is there any value in marking the constant as a numeric token, yet
leaving its specific value as a string until after type resolution
is done?
        regards, tom lane


Re: [HACKERS] numeric data type on 6.5

From
Thomas Lockhart
Date:
> > Seems to do OK with numeric tokens of unspecified type which will
> > become int8 and numeric in the parser. There may be some edge-effect
> > cases (e.g. decimal data with 17 characters) which aren't quite right.
> > Comments?
> I'd suggest backing off one more place on the length of string you will
> try to convert to a float8.  Since the test is strlen() <= 17, you
> actually can have at most 16 digits (there must be a decimal point in
> there too).  But IEEE float is only good to 16-and-change digits; I'm
> not sure I'd want to assume that the 16th digit will always be
> reproduced exactly.  15 digits would be safer.

Yeah. I'd chosen 17 to get sign+decimal+15digits...

> It could still break if the C library's float<=>string conversion
> routines are sloppy :-(.  I suppose you're interested in preserving
> the info that "this constant looks numeric-ish" to assist in type
> resolution heuristics?  Otherwise the value could be left in string
> form till later.
> Is there any value in marking the constant as a numeric token, yet
> leaving its specific value as a string until after type resolution
> is done?

Possibly. I didn't think too hard about it, but had assumed that doing
much more than I did would propagate back into the parser, which I
didn't want to tackle this close to release.
                     - Tom

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California