Thread: column "b" is of type X but expression is of type text
A third party application we use generates SQL queries. Here is query it generated that broke today and for which I have a hard time arguing that the postgres behavior is correct (minimally the error message is confusing):
=# create temporary table foo (b double precision );
CREATE TABLE
Time: 40.368 ms
=# insert into foo select min(NULL);
ERROR: column "b" is of type double precision but expression is of type text
LINE 1: insert into foo select min(NULL);
^
HINT: You will need to rewrite or cast the expression.
So why does min(NULL) have type text? According to the docs it has the type of the input. The value is itself NULL which is a valid member of all types in SQL isn't it?=# create temporary table foo (b double precision );
CREATE TABLE
Time: 40.368 ms
=# insert into foo select min(NULL);
ERROR: column "b" is of type double precision but expression is of type text
LINE 1: insert into foo select min(NULL);
^
HINT: You will need to rewrite or cast the expression.
Thanks,
Bene
Benedikt Grundmann wrote > A third party application we use generates SQL queries. Here is query it > generated that broke today and for which I have a hard time arguing that > the postgres behavior is correct (minimally the error message is > confusing): > > =# create temporary table foo (b double precision ); > CREATE TABLE > Time: 40.368 ms > =# insert into foo select min(NULL); > ERROR: column "b" is of type double precision but expression is of type > text > LINE 1: insert into foo select min(NULL); > ^ > HINT: You will need to rewrite or cast the expression. > > So why does min(NULL) have type text? According to the docs it has the > type of the input. The value is itself NULL which is a valid member of > all > types in SQL isn't it? > > So what is going on? > > Thanks, > > Bene Ideally PostgreSQL would be smart enough to recognize that "min(NULL)" is of an unknown type and thus would use the definition of "foo" to coerce NULL to the desired type. I cannot explain why it does not do this but from the example it cannot. Using a literal NULL without an explicit type-cast is not recommended as the system cannot always accurately figure out what type you mean for it to use. Being a valid value for all types does not mean it magically switches to fit whatever usage is required. Columns are typed, not values per-se, and so NULL can belong in any column but once it is part of that column it takes on that column's type. The query you show is pretty pointless since the intent of "min" is to take a column over which to aggregate; not a literal which will only ever return itself. In short the SELECT query is trying its best to execute and so in the presence of an unadorned NULL - and being unable to infer the type from context - it simply uses the default type which is "text". The SELECT executes just fine, and outputs a "min" column of type "text" which when supplied to the table "foo" causes the type mis-match for column "b" on "foo". The PostgreSQL behavior is "simple" because it does not infer the type of NULL from the column in foo but it is not required to do so its failure is not wrong. The error message, given what does occur, makes perfect sense and is easy enough to trace (i.e., what column is feeding foo.b from the SELECT statement; then, why is that column being seen as "text"). PostgreSQL is in the opinion of some too verbose in its requirement to be explicit regarding types but it does make for less buggy code overall. This particular use-case may be solvable but I'd argue that your example is not likely to convince anyone that it is a serious enough problem worth the effort it would take to do so. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/column-b-is-of-type-X-but-expression-is-of-type-text-tp5763586p5763587.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
On Fri, Jul 12, 2013 at 8:47 AM, Benedikt Grundmann <bgrundmann@janestreet.com> wrote: > A third party application we use generates SQL queries. Here is query it > generated that broke today and for which I have a hard time arguing that the > postgres behavior is correct (minimally the error message is confusing): > > =# create temporary table foo (b double precision ); > CREATE TABLE > Time: 40.368 ms > =# insert into foo select min(NULL); > ERROR: column "b" is of type double precision but expression is of type > text > LINE 1: insert into foo select min(NULL); > ^ > HINT: You will need to rewrite or cast the expression. > > So why does min(NULL) have type text? According to the docs it has the type > of the input. The value is itself NULL which is a valid member of all types > in SQL isn't it? > > So what is going on? This is not a question for -hackers. Postgres is strictly typed -- there is no variant type. So even though some functions can configured to support multiple input types via overloading, 'any' arguments, etc. when stuff actually gets done type coercion has to take place and text is chosen as a type of last resort. merlin
Thanks David,
I like the fact that postgres is explicit in it's types. All I'm arguing is that error message is misleading. And that I had a hard time understanding why happened what happened. The part I was missing is that despite supporting an any type the necessary type inference is very very local and quickly resorts to the default type.
thanks everyone,I like the fact that postgres is explicit in it's types. All I'm arguing is that error message is misleading. And that I had a hard time understanding why happened what happened. The part I was missing is that despite supporting an any type the necessary type inference is very very local and quickly resorts to the default type.
Bene
On Fri, Jul 12, 2013 at 3:17 PM, David Johnston <polobo@yahoo.com> wrote:
Benedikt Grundmann wroteIdeally PostgreSQL would be smart enough to recognize that "min(NULL)" is of> A third party application we use generates SQL queries. Here is query it
> generated that broke today and for which I have a hard time arguing that
> the postgres behavior is correct (minimally the error message is
> confusing):
>
> =# create temporary table foo (b double precision );
> CREATE TABLE
> Time: 40.368 ms
> =# insert into foo select min(NULL);
> ERROR: column "b" is of type double precision but expression is of type
> text
> LINE 1: insert into foo select min(NULL);
> ^
> HINT: You will need to rewrite or cast the expression.
>
> So why does min(NULL) have type text? According to the docs it has the
> type of the input. The value is itself NULL which is a valid member of
> all
> types in SQL isn't it?
>
> So what is going on?
>
> Thanks,
>
> Bene
an unknown type and thus would use the definition of "foo" to coerce NULL to
the desired type. I cannot explain why it does not do this but from the
example it cannot.
Using a literal NULL without an explicit type-cast is not recommended as the
system cannot always accurately figure out what type you mean for it to use.
Being a valid value for all types does not mean it magically switches to fit
whatever usage is required. Columns are typed, not values per-se, and so
NULL can belong in any column but once it is part of that column it takes on
that column's type.
The query you show is pretty pointless since the intent of "min" is to take
a column over which to aggregate; not a literal which will only ever return
itself.
In short the SELECT query is trying its best to execute and so in the
presence of an unadorned NULL - and being unable to infer the type from
context - it simply uses the default type which is "text". The SELECT
executes just fine, and outputs a "min" column of type "text" which when
supplied to the table "foo" causes the type mis-match for column "b" on
"foo".
The PostgreSQL behavior is "simple" because it does not infer the type of
NULL from the column in foo but it is not required to do so its failure is
not wrong. The error message, given what does occur, makes perfect sense
and is easy enough to trace (i.e., what column is feeding foo.b from the
SELECT statement; then, why is that column being seen as "text").
PostgreSQL is in the opinion of some too verbose in its requirement to be
explicit regarding types but it does make for less buggy code overall. This
particular use-case may be solvable but I'd argue that your example is not
likely to convince anyone that it is a serious enough problem worth the
effort it would take to do so.
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/column-b-is-of-type-X-but-expression-is-of-type-text-tp5763586p5763587.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 07/12/2013 07:28 AM, Benedikt Grundmann wrote: > Thanks David, > > I like the fact that postgres is explicit in it's types. All I'm arguing > is that error message is misleading. And that I had a hard time > understanding why happened what happened. The part I was missing is that > despite supporting an any type the necessary type inference is very very > local and quickly resorts to the default type. No argument that it would be nice to have a more apropos error message.However, that's harder to achieve than you realize. Here's a simplified version what happens: 1. you hand PostgreSQL an unadorned NULL. It realizes it doesn't have a type, and makes it temporarily the default type (text) in hopes that the next stage will provide a type. 2. you call min(). Min() works for many datatypes. Min() says: "can I work for text?" The answer is "yes", so at this point the NULL which was "default text" becomes *really* text. 3. you try to assign the result of MIN() to a column of type "double". This is when the error is encountered. The planner/executor doesn't know that the reason min() is emitting text is because you handed it an unadorned NULL; it just knows that it was expecting a double, and it got text. At this point, it can't tell the difference between min(NULL) and min('Josh'::TEXT). To get a better error message, the query engine would need to reach back to step (1) when it encounters the error at step (3). The alternative would be to disallow unadorned NULLs entirely, which would break thousands of applications. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus wrote > On 07/12/2013 07:28 AM, Benedikt Grundmann wrote: >> Thanks David, >> >> I like the fact that postgres is explicit in it's types. All I'm arguing >> is that error message is misleading. And that I had a hard time >> understanding why happened what happened. The part I was missing is that >> despite supporting an any type the necessary type inference is very very >> local and quickly resorts to the default type. > > 2. you call min(). Min() works for many datatypes. Min() says: "can I > work for text?" The answer is "yes", so at this point the NULL which > was "default text" becomes *really* text. > > . > . > . > > The alternative would be to disallow unadorned NULLs entirely, which > would break thousands of applications. In the absence of the function call the system is able to delay resolving the type until later in the query: SELECT * FROM (VALUES ('2013-02-01'::date), ('2013-01-01'), (NULL) ) vals (col1); --works SELECT * FROM (VALUES ('2013-02-01'::date), ('2013-01-01'), (min(NULL)) ) vals (col1); --fails I have no idea how this mechanism works but ISTM that the planner could, for "anyelement", look at where the result of the function call is used and add a cast to the function input value to match the desired result type if the input type is "undefined". I'm curious what you would consider to be a "more apropos error message" in this situation; regardless of how difficult it would be to implement. I am also curious if you can think of a better example of where this behavior is problematic. The query for this thread is not something that I would deem to be good SQL. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/column-b-is-of-type-X-but-expression-is-of-type-text-tp5763586p5763615.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
David, > I have no idea how this mechanism works but ISTM that the planner could, for > "anyelement", look at where the result of the function call is used and add > a cast to the function input value to match the desired result type if the > input type is "undefined". Well, that's not how "anyelement" works, actually. And the input type for min() is not "anyelement". > I'm curious what you would consider to be a "more apropos error message" in > this situation; regardless of how difficult it would be to implement. "ERROR: unable to determine appropriate type for 'NULL'" But again, don't hold your breath, per above. > I am also curious if you can think of a better example of where this > behavior is problematic. The query for this thread is not something that I > would deem to be good SQL. Yeah, but it gets generated a lot. And per your other example, sometimes it *does* work, so developers/ORM authors start to rely on it.And then it breaks. Mostly the problematic cases are involving function parameters, where adding a new version of a function can suddently cause a call with an unadorned NULL to break, when it used to work. For example, suppose I have only one function "dingbat" dingbat( timestamptz, text, text, float ) I can easily call it with: SELECT dingbat( '2013-01-01', 'Josh', 'pgsql-hackers', NULL ) But if someone else adds a second function, possibly due to a typo with the version control system: dingbat(timestamptz, text, text, text) ... then the above SELECT call will automatically choose the second function, because NULL defaults to TEXT if unadorned. Among other things, that could make a fun exploit if people have been careless with their SECURITY DEFINER functions. A worse example is the CIText type. A couple versions ago, I attempted to force default case-insensitive comparisons for: 'val'::CITEXT = 'val'::TEXT ... which is what the user would intuitively believe would happen, instead of the case-sensitive comparison, which is what *does* happen. After a long weekend of messy bug-hunting and breaking built-in postgresql functions, I gave up. The root cause of this is that we treat "default TEXT" the same as "real TEXT" as a type. Changing that logic, though, would require a massive refactoring and debugging of PostgreSQL. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > The root cause of this is that we treat "default TEXT" the same as "real > TEXT" as a type. No, we do not do that at all. A NULL is initially of type unknown, and that is definitely not the same as text. The type resolution rules treat the two cases differently. The real cause of what David is complaining about is that we resolve expression datatypes bottom up. Once we've determined that we're going to consider foo(NULL) as an invocation of foo(text), that's what it is, and the context won't cause us to go back and change that. > Changing that logic, though, would require a massive > refactoring and debugging of PostgreSQL. This is true enough; and you forgot to mention all the existing applications that would also need changes if we changed the expression resolution rules. We could possibly make marginal changes without too much pain, but making function resolution context-dependent would hardly be a marginal change. regards, tom lane