Thread: Numeric with '-'
Hi, The following phenomenon was reported to pgsql-jp(ML in Japan). rest=# select -1234567890.1234567; ERROR: Unable to convert left operator '-' from type 'unknown' -1234567890.1234567 is treated as - '1234567890.1234567' as the following comment in scan.l says. /* we no longer allow unary minus in numbers.* instead we pass it separately to parser. there it gets* coerced via doNegate()-- Leon aug 20 1999*/ However doNegate() does nothing for SCONST('1234567890.1234567'). I don't understand where or how to combine '-' and numeric SCONST. Regards. Hiroshi Inoue Inoue@tpf.co.jp
A strange thing I noticed with this is that "select -234567890.1234567;" works and "select -1234567890.123456;" also works while "select -1234567890.1234567;" does not. That extra character just seems to push things over the edge. It almost seems like there is some sort of length restriction somewhere in the parser. On Mon, Feb 21, 2000 at 04:06:07PM +0900, Hiroshi Inoue wrote: > Hi, > > The following phenomenon was reported to pgsql-jp(ML in Japan). > > rest=# select -1234567890.1234567; > ERROR: Unable to convert left operator '-' from type 'unknown' > > -1234567890.1234567 is treated as - '1234567890.1234567' > as the following comment in scan.l says. > > /* we no longer allow unary minus in numbers. > * instead we pass it separately to parser. there it gets > * coerced via doNegate() -- Leon aug 20 1999 > */ > > However doNegate() does nothing for SCONST('1234567890.1234567'). > I don't understand where or how to combine '-' and numeric SCONST. > > Regards. > > Hiroshi Inoue > Inoue@tpf.co.jp > > ************ -- The world's most ambitious and comprehensive PC game database project. http://www.mobygames.com
> -----Original Message----- > From: Brian Hirt [mailto:bhirt@mobygames.com] > > A strange thing I noticed with this is that > > "select -234567890.1234567;" works and > "select -1234567890.123456;" also works while > "select -1234567890.1234567;" does not. That > extra character just seems to push things over > the edge. > > It almost seems like there is some sort of length > restriction somewhere in the parser. > Currently numeric constants are FLOAT8 constants if the the precision <= 17 otherwise string constants. Regards. Hiroshi Inoue Inoue@tpf.co.jp
Brian Hirt <bhirt@mobygames.com> writes: > "select -1234567890.123456;" also works while > "select -1234567890.1234567;" does not. That > extra character just seems to push things over > the edge. > It almost seems like there is some sort of length > restriction somewhere in the parser. Indeed there is, and you'll find it at src/backend/parser/scan.l line 355 (in current sources). The lexer backs off from "float constant" to "unspecified string constant" in order to avoid losing precision from conversion to float. Which is fine, except that without any cue that the constant is numeric, the parser is unable to figure out what to do with the '-' operator. I've been ranting about this in a recent pghackers thread ;-). The lexer shouldn't have to commit to a conversion to float8 in order to report that a token looks like a numeric literal. The resulting error message ERROR: Unable to convert left operator '-' from type 'unknown' isn't exactly up to a high standard of clarity either; what it really means is "unable to choose a unique left operator '-' for type 'unknown'", and it ought to suggest adding an explicit cast. I'll see what I can do about that. But the right way to fix the fundamental problem is still under debate. In the meantime you can provide the parser a clue with an explicit cast: play=> select -1234567890.1234567::numeric; ?column? ----------------- -1234567890.12346 (1 row) This still seems a little broken though, since it looks like the constant's precision is getting truncated to 15 digits; presumably there's a coercion to float happening in there somewhere, but I don't understand where at the moment... A few minutes later: yes I do: there's no unary minus operator defined for type numeric, so the parser does the best it can by applying float8um instead. Jan? regards, tom lane
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > The following phenomenon was reported to pgsql-jp(ML in Japan). > rest=# select -1234567890.1234567; > ERROR: Unable to convert left operator '-' from type 'unknown' I've committed fixes that make the parser treat numeric literals the same no matter how many digits they have. With current sources, regression=# select -1234567890.1234567; ?column? --------------------1234567890.12346 (1 row) which is probably still not what you want, because the default type for a non-integer literal is float8 in the absence of any context to clue the system otherwise, so you lose precision. You can do regression=# select -1234567890.12345678900::numeric; ?column? --------------------------1234567890.12345678900 (1 row) but in reality that's only working because of the way that doNegate works on literals; since there is no unary minus operator for NUMERIC, a minus on a non-constant value is going to be coerced to float8: regression=# select -val from num_data; ?column? ------------------ 0 0 34338492.215397 -4.31 -7799461.4119 -16397.038491 -93901.57763026 83028485 -7488124926804.0450474 (10 rows) whereas this works right: regression=# select 0-val from num_data; ?column? --------------------- 0.0000000000 0.000000000034338492.2153970470 -4.3100000000-7799461.4119000000 -16397.0384910000 -93901.577630260083028485.0000000000 -74881.000000000024926804.0450474200 (10 rows) Somebody ought to write a NUMERIC unary minus... regards, tom lane
On 2000-02-21, Tom Lane mentioned: > I've been ranting about this in a recent pghackers thread ;-). > The lexer shouldn't have to commit to a conversion to float8 > in order to report that a token looks like a numeric literal. Has the ranting resulted in any idea yet? ISTM that keeping a non-integer number as a string all the way to the executor shouldn't hurt too much. After all, according to SQL 123.45 *is* a NUMERIC literal! By making it a float we're making our users liable to breaking all kinds of fiscal regulations in some places. (Ask Jan.) > The resulting error message > ERROR: Unable to convert left operator '-' from type 'unknown' > isn't exactly up to a high standard of clarity either; Speaking of 'unknown', this is my favourite brain-damaged query of all times: peter=> select 'a' like 'a'; ERROR: Unable to identify an operator '~~' for types 'unknown' and 'unknown' You will have to retype this query usingan explicit cast Is there a good reason that a character literal is unknown? I'm sure the reasons lie somewhere in the extensible type system, but if I wanted it to be something else explicitly then I would have written DATE 'yesterday'. -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
At 12:57 AM 2/22/00 +0100, Peter Eisentraut wrote: >Has the ranting resulted in any idea yet? ISTM that keeping a non-integer >number as a string all the way to the executor shouldn't hurt too much. >After all, according to SQL 123.45 *is* a NUMERIC literal! By making it a >float we're making our users liable to breaking all kinds of fiscal >regulations in some places. (Ask Jan.) Certainly there was a time in the past, at least, where cross-compilers frequently did something along these lines, if they were designed to support a variety of target architectures. Not so common now in the compiler world since typically host and target both support IEEE standard floating point operations, but 'twas so back in the days before the standard existed and before hardware implementations proliferated. It wouldn't impact the performance of query parsing and analysis noticably. You have to take care when (for instance) folding operations on constants - I suspect that somewhere in the 50K lines of the SQL92 draft or the 83K lines of the SQL3 draft precise rules for such things are laid down. Though probably in an incomprehensible fashion! >Speaking of 'unknown', this is my favourite brain-damaged query of all >times: > >peter=> select 'a' like 'a'; >ERROR: Unable to identify an operator '~~' for types 'unknown' and 'unknown' > You will have to retype this query using an explicit cast That *is* very cool! :) Postgres is an amazing beast at times! - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > > The following phenomenon was reported to pgsql-jp(ML in Japan). > > > rest=# select -1234567890.1234567; > > ERROR: Unable to convert left operator '-' from type 'unknown' > > I've committed fixes that make the parser treat numeric literals > the same no matter how many digits they have. With current sources, > > regression=# select -1234567890.1234567; > ?column? > ------------------- > -1234567890.12346 > (1 row) > > which is probably still not what you want, Hmm,this may be worse than before. INSERT/UPDATE statements would lose precision without telling any error/warnings. > because the default > type for a non-integer literal is float8 in the absence of any > context to clue the system otherwise, so you lose precision. > You can do > Shouldn't decimal constants be distinguished from real constants ? For example, decimal --> NCONST -> T_Numreic Value -> Const node of type NUMERICOID .... Comments ? Regards. Hiroshi Inoue Inoue@tpf.co.jp
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > Hmm,this may be worse than before. > INSERT/UPDATE statements would lose precision without > telling any error/warnings. They didn't give any such warning before, either. I doubt I've made anything worse. > Shouldn't decimal constants be distinguished from real constants ? Why? I don't see any particularly good reason for distinguishing 1234567890.1234567890 from 1.2345678901234567890e9. (numeric_in does accept both these days, BTW.) regards, tom lane
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > > Hmm,this may be worse than before. > > INSERT/UPDATE statements would lose precision without > > telling any error/warnings. > > They didn't give any such warning before, either. I doubt I've > made anything worse. > Before your change INSERT into t (numdata) values (-1234567890.1234567); caused an error ERROR: Unable to convert left operator '-' from type 'unknown'. but currently inserts a constant -1234567890.12346. and INSERT into t (numdata) values (1234567890.1234567); inserted a numeric constant 1234567890.1234567 precisely but currently inserts a constant 1234567890.12346. > > Shouldn't decimal constants be distinguished from real constants ? > > Why? I don't see any particularly good reason for distinguishing > 1234567890.1234567890 from 1.2345678901234567890e9. (numeric_in > does accept both these days, BTW.) > According to a book about SQL92 which I have,SQL92 seems to recommend it. Hiroshi Inoue Inoue@tpf.co.jp
Peter Eisentraut <peter_e@gmx.net> writes: > On 2000-02-21, Tom Lane mentioned: >> I've been ranting about this in a recent pghackers thread ;-). >> The lexer shouldn't have to commit to a conversion to float8 >> in order to report that a token looks like a numeric literal. > Has the ranting resulted in any idea yet? ISTM that keeping a non-integer > number as a string all the way to the executor shouldn't hurt too much. Well, actually it's sufficient to keep it as a string until the type analyzer has figured out what data type it's supposed to be; then you can feed it to that type's typinput conversion routine. After that it's not the parser's problem anymore ;-). I committed changes to do exactly that this morning. Thomas had been saying that integer literals should be kept as strings too, but I don't believe that and didn't do it. > peter=> select 'a' like 'a'; > ERROR: Unable to identify an operator '~~' for types 'unknown' and 'unknown' > You will have to retype this query using an explicit cast > Is there a good reason that a character literal is unknown? I'm sure the > reasons lie somewhere in the extensible type system, but if I wanted it to > be something else explicitly then I would have written DATE 'yesterday'. Remember that constants of random types like "line segment" have to start out as character literals (unless you want to try to pass them through the lexer and parser undamaged without quotes). So untyped character literal has to be a pretty generic thing. It might be a good idea for the type analyzer to try again with the assumption that the literal is supposed to be type text, if it fails to find an interpretation without that assumption --- but I think this is a ticklish change that could have unwanted consequences. It'd need some close examination. regards, tom lane
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: >> They didn't give any such warning before, either. I doubt I've >> made anything worse. > Before your change > INSERT into t (numdata) values (-1234567890.1234567); > caused an error > ERROR: Unable to convert left operator '-' from type 'unknown'. > but currently inserts a constant -1234567890.12346. Yipes, you are right. I think that that sort of construct should result in the value not getting converted at all until the parser knows that it must be converted to the destination column's type. Let me see if I can find out what's going wrong. If this doesn't seem to be fixable, I may have to back off the patch... regards, tom lane
On 2000-02-21, Tom Lane mentioned: > > Is there a good reason that a character literal is unknown? I'm sure the > > reasons lie somewhere in the extensible type system, but if I wanted it to > > be something else explicitly then I would have written DATE 'yesterday'. > > Remember that constants of random types like "line segment" have to > start out as character literals A constant of type line segment looks like this: LSEG 'whatever' This is an obvious extension of the standard. (Also note that this is *not* a cast.) The semantics of SQL throughout are that if I write something of the form quote-characters-quote, it's a character literal. No questions asked. Now if I pass a character literal to a datetimeish function, it's on obvious cast. If I pass it to a geometry function, it's an obvious cast. If I pass it to a generic function, it's a character string. It seems that for the benefit of a small crowd -- those actually using geometric types and being too lazy to type their literals in the above manner -- we are creating all sorts of problems for two much larger crowds: those trying to use their databases in an normal manner with strings and numbers, and those trying develop for this database that never know what type a literal is, when it should be obvious. I am definitely for a close examination of this one. -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
Peter Eisentraut <peter_e@gmx.net> writes: >> Remember that constants of random types like "line segment" have to >> start out as character literals > A constant of type line segment looks like this: > LSEG 'whatever' > This is an obvious extension of the standard. (Also note that this is > *not* a cast.) Yes it is. On what grounds would you assert that it isn't? Certainly not on the basis of what comes out of gram.y; all three of these produce exactly the same parsetree:LSEG 'whatever''whatever'::LSEGCAST('whatever' AS LSEG) > It seems that for the benefit of a small crowd -- those actually using > geometric types and being too lazy to type their literals in the above > manner -- we are creating all sorts of problems for two much larger > crowds Au contraire. The real issue here is how to decide which numeric type to use for an undecorated but numeric-looking literal token. I don't think that's a non-mainstream problem, and I definitely don't think that telling the odd-datatype crowd to take a hike will help fix it. regards, tom lane
On Wed, 23 Feb 2000, Tom Lane wrote: > Au contraire. The real issue here is how to decide which numeric type > to use for an undecorated but numeric-looking literal token. I don't You lost me. How does that relate to the character types? You are not suggesting that '123.456' should be considered a number? It seems pretty clear to me that anything of the form [0-9]+ is an integer, something with an 'e' in it is a float, and something with only digits and decimal points is numeric. If passing around an 'numeric' object is too expensive, keep it as a string for a while longer. As you did. > think that's a non-mainstream problem, and I definitely don't think > that telling the odd-datatype crowd to take a hike will help fix it. It remains to be shown how big that "hike", if at all existent, would be ... -- Peter Eisentraut Sernanders vaeg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
Peter Eisentraut <e99re41@DoCS.UU.SE> writes: > ... It seems pretty > clear to me that anything of the form [0-9]+ is an integer, something with > an 'e' in it is a float, and something with only digits and decimal points > is numeric. So 123456789012345678901234567890 is an integer? Not on the machines I use. Nor do I want to restrict 1234567890.1234567890e20 or 1e500 to be considered always and only floats; the first will drop precision and the second will overflow, whereas they are both perfectly useful as numeric. What I'd originally hoped was that we could postpone determining the type of a numeric literal until we saw where it was going to be used, as in Hiroshi's INSERT into t (numdata) values (-1234567890.1234567); example. Unfortunately that doesn't work in some other fairly obvious cases, like SELECT 1.2 + 3.4; you just plain don't have any other cues except the sizes and precisions of the constants to resolve the type here. So the original code was right, I think, to the extent that it looked at the precision and size of the constant to select a default type for the constant. But it wasn't right to lose the numeric-ness of the constant altogether when it doesn't fit in a double. What I'm testing now is code that generates either INT4, FLOAT8, or NUMERIC depending on precision and size --- but never UNKNOWN, which is what you'd get before with more than 17 digits. regards, tom lane
> -----Original Message----- > From: owner-pgsql-hackers@postgreSQL.org > [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Tom Lane > > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > >> They didn't give any such warning before, either. I doubt I've > >> made anything worse. > > > Before your change > > INSERT into t (numdata) values (-1234567890.1234567); > > caused an error > > ERROR: Unable to convert left operator '-' from type 'unknown'. > > but currently inserts a constant -1234567890.12346. > > Yipes, you are right. I think that that sort of construct should > result in the value not getting converted at all until the parser > knows that it must be converted to the destination column's type. > Let me see if I can find out what's going wrong. If this doesn't > seem to be fixable, I may have to back off the patch... > This seems to be fixed. Thanks a lot. However there still remains the following case. select * from num_data where val = 1.1; ERROR: Unable to identify an operator '=' for types 'numeric' and 'float8' You will have to retype this query usingan explicit cast SQL standard seems to say 1.1 is a numeric constant and it's not good to treat a numeric value as an aproximate value. For example,what do you think about the following. select 11111111111111 * 1.1; ?column? ------------------12222222222222.1 (1 row) select 111111111111111 * 1.1; ?column? -----------------122222222222222 (1 row) select 100000000 + .000001; ?column? ------------------100000000.000001 (1 row) select 100000000 + .0000001;?column? -----------100000000 (1 row) select 100000000.0000001; ?column? -------------------100000000.0000001 (1 row) Regards. Hiroshi Inoue Inoue@tpf.co.jp
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > However there still remains the following case. > select * from num_data where val = 1.1; > ERROR: Unable to identify an operator '=' for types 'numeric' and 'float8' > You will have to retype this query using an explicit cast Yeah. I'm not sure that that can be fixed without a major redesign of the type-conversion hierarchy, which is not something I care to try during beta ;-). In fact, it's arguable that the system is doing the right thing by forcing the user to specify whether he wants a NUMERIC or FLOAT8 comparison to be used. There are other examples where we *must* refuse to decide. For example: regression=# create table ff (f1 char(8), f2 varchar(20)); CREATE regression=# select * from ff where f1 = f2; ERROR: Unable to identify an operator '=' for types 'bpchar' and 'varchar' You will have to retype this query usingan explicit cast This is absolutely the right thing, because bpchar and varchar do not have the same comparison semantics (trailing blanks are significant in one case and not in the other), so the user has to tell us which he wants. > SQL standard seems to say 1.1 is a numeric constant and > it's not good to treat a numeric value as an aproximate value. > For example,what do you think about the following. That argument is untenable. NUMERIC has limitations just as bad as FLOAT's; they're merely different. For example: regression=# select 1.0/300000.0; ?column? ----------------------3.33333333333333e-06 (1 row) regression=# select 1.0::numeric / 300000.0::numeric; ?column? --------------0.0000033333 (1 row) Notice the completely unacceptable loss of precision ;-) in the second case. When you look at simple cases like "var = constant" it seems easy to say that the system should just do the right thing, but in more complex cases it's not always easy to know what the right thing is. I think what you are proposing is to change the system's default assumption about decimal constants from float8 to numeric. I think that's a very risky change that is likely to break existing applications (and if we throw in automatic conversions, it'll break 'em silently). I'm not eager to do that. regards, tom lane