Thread: BUG #5225: create table: cast necessary for constant??
The following bug has been logged online: Bug reference: 5225 Logged by: Kurt wagner Email address: kurt.wagnerextern@leoni.com PostgreSQL version: 8.41 Operating system: HP-UX Description: create table: cast necessary for constant?? Details: During migration from Informix to Postgres I came across following issue: create temp table temp1 as SELECT firmnr, werknr, 'I' as invper, invnum from .... ; the next select on table temp1 returned the error ERROR: failed to find conversion function from unknown to character [SQL State=XX000] I could find out that the column invper in the temp. table temp1 was defined as data type "unknown". To me it is a bug because the column is filled by a constant. So regardless if you estimate it as char, varchar, or text it is still processable whereas "unknown" is the worst case and returns an error as described above. other DBs (e.g. db2, or Informix) can handle such cases correctly. is it possible to fix it, please?
"Kurt wagner" <kurt.wagnerextern@leoni.com> writes: > During migration from Informix to Postgres I came across following issue: > create temp table temp1 as > SELECT firmnr, > werknr, > 'I' as invper, > invnum > from .... ; You really ought to cast the 'I' to some specific type. The above code is illegal per SQL standard. You can get away with it in certain contexts in Postgres, but when you are creating a table or view and don't know exactly what might be done with the column, it's much better to ensure it's of the intended type. We aren't going to make it default to text or whatever because it's not always clear that that's what's intended --- consider '1.2' as invper, '2009-11-23' as invper, Most likely text was not what the writer had in mind in such cases. regards, tom lane
On 3/12/2009 12:35 AM, Tom Lane wrote: > "Kurt wagner"<kurt.wagnerextern@leoni.com> writes: >> During migration from Informix to Postgres I came across following issue: >> create temp table temp1 as >> SELECT firmnr, >> werknr, >> 'I' as invper, >> invnum >> from .... ; > > You really ought to cast the 'I' to some specific type. It's usually neatest to do this by just explicitly identifying the intended type in the first place, eg: SELECT firmnr, werknr, TEXT 'I' as invper, invnum from .... ; ... which, IIRC, is the standard way to do it. I don't have a copy to check against to be sure. Personally, I like the fact that Pg errs on the side of caution here rather than guessing what you want. -- Craig Ringer
Craig Ringer <craig@postnewspapers.com.au> wrote: > On 3/12/2009 12:35 AM, Tom Lane wrote: >> You really ought to cast the 'I' to some specific type. > > It's usually neatest to do this by just explicitly identifying > the intended type in the first place, eg: > > > SELECT firmnr, > werknr, > TEXT 'I' as invper, > invnum > from .... ; > > ... which, IIRC, is the standard way to do it. I don't have a copy > to check against to be sure. > > Personally, I like the fact that Pg errs on the side of caution > here rather than guessing what you want. We should probably have some wiki page or something to which we can refer people when they raise this, which is bound to happen from time to time, since the PostgreSQL behavior is a deviation from the standard. Now, I've been persuaded that there are good reasons for the deviation, and that workarounds for code previously written to standard are relatively straightforward, but many people here lose sight of the fact that it *is* a deviation when replying to someone who's just run into it. Quoting from section 5.3 of "WG3:HBA-003 H2-2003-305 August, 2003 (ISO-ANSI Working Draft) Foundation (SQL/Foundation)": | 13) The declared type of a <character string literal> is | fixed-length character string. The length of a <character | string literal> is the number of <character representation>s | that it contains. Each <quote symbol> contained in <character | string literal> represents a single <quote> in both the value | and the length of the <character string literal>. The two | <quote>s contained in a <quote symbol> shall not be separated | by any <separator>. | | NOTE 72 * <character string literal>s are allowed to be | zero-length strings (i.e., to contain no characters) even | though it is not permitted to declare a <data type> that is | CHARACTER with <length> 0 (zero). Treating an otherwise unadorned set of characters between two apostrophes as anything except a character string literal of type CHARACTER with a length calculated per the above violates the standard. Rather than pretending otherwise, we should be prepared to explain the reasons for the deviation, describe what the PostgreSQL behavior *is*, and justify the deviation. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Quoting from section 5.3 of "WG3:HBA-003 H2-2003-305 August, 2003 > (ISO-ANSI Working Draft) Foundation (SQL/Foundation)": > | 13) The declared type of a <character string literal> is > | fixed-length character string. > Treating an otherwise unadorned set of characters between two > apostrophes as anything except a character string literal of type > CHARACTER with a length calculated per the above violates the > standard. Rather than pretending otherwise, we should be prepared > to explain the reasons for the deviation, describe what the > PostgreSQL behavior *is*, and justify the deviation. Sorry about that --- I had confused this case with that of a bare NULL literal, which Postgres treats the same as an unadorned string literal for type determination purposes. You're right that the spec treats them differently. This is feasible for the spec's purposes because it has such a paucity of data types. Also, I believe that the spec expects you to explicitly mark literals that aren't to be treated as plain strings, ie, in something like TIMESTAMP '2009-12-02 18:28:58' you're not really supposed to omit the word TIMESTAMP. Postgres has a whole lot of datatypes, including user-added ones, and most of them share the unadorned string literal as the base case for constants. Giving preference to CHARACTER would make that machinery a lot less pleasant to use. regards, tom lane
Thanks Tom and Kevin=20 for your detailed explanation. Even if I know now there is no chance of changing it I'd like you to consider following fact: when writing a character constant elsewhere=20 then at first it is interpreted as character constant - right? then it is casted to the desired type e.g. SELECT... FROM ... WHERE now() > '2009-12-03' at first the input is accepted as character literal (@Tom: in no way the literal is automatically interpreted as a user=20 defined data type) then the literal is casted to timestamp then it is compared. This behaviour you can see when typing a wrong timestamp: ERROR: invalid input syntax for type timestamp with time zone:=20 "2009-12#03" [SQL State=3D22007]=20 -> this is a message created by the data type casting setting the column type to unknown will deactivate all automatic type=20 casts available. For me it was not logical (but errornous) to not interpret the literal=20 as=20 please refer to following statement: CREATE TEMP TABLE blabla AS SELECT now() AS timecol, 0 AS intcol, 0.0 AS=20 deccol, 'I' AS CHARCOL FROM mytable; Postgres decided to define deccol as numeric even not knowing the exact=20 type. But setting it to numeric it allows further processing. Only=20 charcol is unknown which causes problems in further processing. you see what I mean? thank you and kind regards Kurt =20 -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us]=20 Sent: Thursday, December 03, 2009 12:40 AM To: Kevin Grittner Cc: Craig Ringer; Wagner, Kurt; pgsql-bugs@postgresql.org Subject: Re: [BUGS] BUG #5225: create table: cast necessary for=20 constant?? "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Quoting from section 5.3 of "WG3:HBA-003 H2-2003-305 August, 2003 > (ISO-ANSI Working Draft) Foundation (SQL/Foundation)": > | 13) The declared type of a <character string literal> is > | fixed-length character string. > Treating an otherwise unadorned set of characters between two > apostrophes as anything except a character string literal of type > CHARACTER with a length calculated per the above violates the > standard. Rather than pretending otherwise, we should be prepared > to explain the reasons for the deviation, describe what the > PostgreSQL behavior *is*, and justify the deviation. Sorry about that --- I had confused this case with that of a bare NULL literal, which Postgres treats the same as an unadorned string literal for type determination purposes. You're right that the spec treats them differently. This is feasible for the spec's purposes because it has such a paucity of data types. Also, I believe that the spec expects you to explicitly mark literals that aren't to be treated as plain strings, ie, in something like TIMESTAMP '2009-12-02 18:28:58' you're not really supposed to omit the word TIMESTAMP. Postgres has a whole lot of datatypes, including user-added ones, and most of them share the unadorned string literal as the base case for constants. Giving preference to CHARACTER would make that machinery a lot less pleasant to use. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Sorry about that --- I had confused this case with that of a bare > NULL literal, which Postgres treats the same as an unadorned > string literal for type determination purposes. You're right that > the spec treats them differently. This is feasible for the spec's > purposes because it has such a paucity of data types. Yeah, the arguments about how the PostgreSQL behavior makes it easier to work with user defined types are compelling. The differences in behavior don't show up often -- I suspect that they will typically be encountered by those converting from other products to PostgreSQL. I'm just suggesting that someone put together a page in the wiki or documentation to describe the differences in behavior and the normal workarounds. That might preempt some of the problems, and would be a quick way to help someone who runs into the issue for the first time. > Also, I believe that the spec expects you to explicitly mark > literals that aren't to be treated as plain strings, ie, in > something like > TIMESTAMP '2009-12-02 18:28:58' > you're not really supposed to omit the word TIMESTAMP. Absolutely true. Although many products will tolerate omission for date/time literals, that's non-standard behavior. The reason they do that is pretty much the same reason that PostgreSQL does, but PostgreSQL takes it farther. > Postgres has a whole lot of datatypes, including user-added ones, > and most of them share the unadorned string literal as the base > case for constants. Giving preference to CHARACTER would make > that machinery a lot less pleasant to use. Well put. -Kevin
"Wagner, Kurt" <kurt.wagnerextern@leoni.com> wrote: > when writing a character constant elsewhere > then at first it is interpreted as character constant - right? > then it is casted to the desired type No. It was confusing for me, too; but the PostgreSQL behavior is to treat what the standard calls a <character string literal> as being of type UNKNOWN -- just like the behavior described in the spec for NULL. When it is used in some context where the type must be resolved, it then tries to pick an appropriate type. (I believe there is some slight preference for type TEXT when there are multiple possibilities.) This is helpful for those wanting to use literals of non-standard types. (Many people use PostgreSQL specifically because of the ability to define custom types and operators.) There is an understandable tendency of those who work deep in the guts of the PostgreSQL software, making all this custom type code work, that those coming into PostgreSQL from other environments come with the assumption that these literals will be treated as character strings. From their perspective there is nothing more natural than to view such a literal as lacking any type information, with the obvious implication that you should explicitly give it a type. Once you shake out any problems from code you are migrating, you'll find it's not hard to write new code in a way which will work in either environment. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > There is an understandable tendency of those who work deep in the > guts of the PostgreSQL software, making all this custom type code > work, I mangled that sentence worse than usual. The tendency is to see the PostgreSQL behavior as natural and to forget the expectations of those coming in. I shouldn't post until the caffeine is fully in effect. -Kevin