Thread: Automatic type conversion
I've committed changes to allow more automatic type conversion. Lots of files were touched, mostly in backend/parser/. The new code tries to do the right thing for conversions, and does handle cases which were problematic before: -- there isn't a floating point factorial operator... tgl=> select (4.3 !); ?column? -------- 24 (1 row) -- there isn't an integer exponentiation operator... tgl=> select 2 ^ 3; ?column? -------- 8 (1 row) -- concatenation on unspecified types didn't used to work... tgl=> select '123' || '456'; ?column? -------- 123456 (1 row) -- didn't used to correctly truncate strings into tables... tgl=> create table cc (c char(4)); CREATE tgl=> insert into cc select '123' || '456'; INSERT 268073 1 tgl=> select * from cc; c ---- 1234 (1 row) So, it should fix longstanding issues. However, the main goal should be that it doesn't do the WRONG thing at any time. So, test away and post any problems or issues that come up; we have lots of time to fix things before v6.4. One change in behavior is that I defined (for builtin types) the concept of a "preferred type" in each category/class of types (e.g. float8 is the preferred type for numerics, datetime is the preferred type for date/times, etc.). And, unspecified types are preferentially resolved to use this preferred type. So, the following behavior has changed: -- this is now done as a float8 calculation, used to be float4... tgl=> select '123.456'::float4 * '1.99999999999'; ?column? ---------------- 246.912002562242 (1 row) Before, unknown types, such as the second string above, were resolved to be the same type as the other type, if available. So the calculation would have been truncated at ~7 decimal places. The good thing about this is that the behavior of the above is now the same as if the second string was specified without the quotes: tgl=> select '123.456'::float4 * 1.99999999999; ?column? ---------------- 246.912002562242 (1 row) where before it was evaluated differently in the two cases. Anyway, try things out, and I'll be writing this up for the docs. Will post the topics on hackers along the way... I haven't yet changed the regression tests to reflect the new behavior, just in case it needs to be different. Also, all regression tests pass with the only differences as mentioned above. btw, the code still has lots of cleanup needed, moving subroutines around and taking out defunct code. But will do that later. Have fun. - Tom
> I've committed changes to allow more automatic type conversion. btw, this requires a dump/reload... - Tom
On Sun, May 10, 1998 at 12:14:11AM +0000, Thomas G. Lockhart wrote: > -- there isn't a floating point factorial operator... > tgl=> select (4.3 !); > ?column? > -------- > 24 > (1 row) Am I the only one that thinks the above is wrong? 4.3 factorial is mathematically undefined and does NOT equal 24. I don't think the automatic type conversion should automatically truncate values without at least a warning. Preferably I'd like to be forced to do the conversion myself for cases like the above. -- Dave Chapeskie <dchapes@ddm.on.ca>, DDM Consulting
Dave Chapeskie wrote: > > On Sun, May 10, 1998 at 12:14:11AM +0000, Thomas G. Lockhart wrote: > > -- there isn't a floating point factorial operator... > > tgl=> select (4.3 !); > > ?column? > > -------- > > 24 > > (1 row) > > Am I the only one that thinks the above is wrong? 4.3 factorial is > mathematically undefined and does NOT equal 24. Just put the gamma function in there and assume the argument is always a float. A decent gamma function algorithm should make a special case for integers. --Gene
> > -- there isn't a floating point factorial operator... > > tgl=> select (4.3 !); > > ?column? > > -------- > > 24 > > (1 row) > > Am I the only one that thinks the above is wrong? 4.3 factorial is > mathematically undefined and does NOT equal 24. > > I don't think the automatic type conversion should automatically > truncate values without at least a warning. Preferably I'd like to be > forced to do the conversion myself for cases like the above. Yes, I included this one to provoke discussion :) Postgres has type extensibility, so the algorithms for matching up types and functions need to be very general. In this case, there is only one function defined for factorial, and it takes an integer argument. But of course Postgres now says "ah! I know how to make an int from a float!" and goes ahead and does it. If there were more than one function defined for factorial, and if none of the arguments matched a float, then Postgres would conclude that there are too many functions to choose from and throw an error. One way to address this is to never allow Postgres to "demote" a type; i.e. Postgres would be allowed to promote arguments to a "higher" type (e.g. int->float) but never allowed to demote arguments (e.g. float->int). But this would severely restrict type matching. I wanted to try the more flexible case first to see whether it really does the "wrong thing"; in the case of factorial, the only recourse for someone wanting to calculate a factorial from a float is to convert to an int first anyway. Or, again for this factorial case, we can implement a floating point factorial with either the gamma function (whatever that is :) or with an explicit routine which checks for non-integral values. Could also print a notice when arguments are being converted, but that might get annoying for most cases which are probably trivial ones. - Tom
Tom: > One way to address this is to never allow Postgres to "demote" a type; > i.e. Postgres would be allowed to promote arguments to a "higher" type > (e.g. int->float) but never allowed to demote arguments (e.g. > float->int). But this would severely restrict type matching. I wanted to > try the more flexible case first to see whether it really does the > "wrong thing"; in the case of factorial, the only recourse for someone > wanting to calculate a factorial from a float is to convert to an int > first anyway. I think that never demoting is the best way to procede here. If the type resolution search is too "thorough" it can be very confusing (see c++ for example). As it is the interaction of SQL and the type system can create surprises. Promoting both "up" and "down" is likely to make it very hard to figure out what any given query will do. -dg David Gould dg@illustra.com 510.628.3783 or 510.305.9468 Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612 "Of course, someone who knows more about this will correct me if I'm wrong, and someone who knows less will correct me if I'm right." --David Palmer (palmer@tybalt.caltech.edu)