Thread: Failure to coerce unknown type to specific type

Failure to coerce unknown type to specific type

From
Jeff Davis
Date:
Original report and patch by Karl Schnaitter.

create table a(u) as select '1';
create table b(i int);
insert into b select u from a;
ERROR:  failed to find conversion function from unknown to integer

SELECT a=b FROM (SELECT ''::text, '  ') x(a,b);
ERROR:  failed to find conversion function from unknown to text

The strange thing about these cases are that can_coerce_type returns
true, but coerce_type fails.

This can be fixed by a small change (attached) to find_coercion_pathway to add:

    else if (sourceTypeId == UNKNOWNOID)
        result = COERCION_PATH_COERCEVIAIO;

I don't see any big problem with doing this, because we've already
done the type inference; it's just a question of whether we succeed or
fail when we try to apply it. I don't see any reason to fail a cast
from unknown to something else.

However, this still doesn't fix a more complex case like:

create table tt(x text primary key);
create table ut(x unknown, foreign key(x) references tt);
insert into tt values('');
insert into ut values('');
update ut set x = x;

Regards,
     Jeff Davis

Attachment

Re: Failure to coerce unknown type to specific type

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> Original report and patch by Karl Schnaitter.
> create table a(u) as select '1';

We should, in fact, fail that to begin with.  Unknown-type columns are
a spectactularly horrid idea.

> This can be fixed by a small change (attached) to find_coercion_pathway to add:

>     else if (sourceTypeId == UNKNOWNOID)
>         result = COERCION_PATH_COERCEVIAIO;

This is not a good idea, I think.  I definitely don't accept any reasoning
that starts from the premise that UNKNOWN is a first-class type.

            regards, tom lane

Re: Failure to coerce unknown type to specific type

From
Jeff Davis
Date:
On Wed, 2015-04-08 at 21:31 -0400, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > Original report and patch by Karl Schnaitter.
> > create table a(u) as select '1';
>
> We should, in fact, fail that to begin with.  Unknown-type columns are
> a spectactularly horrid idea.

That example was just for illustration. My other example didn't require
creating a table at all:

  SELECT a=b FROM (SELECT ''::text, '  ') x(a,b);

it's fine with me if we want that to fail, but I don't think we're
failing in the right place, or with the right error message.

I'm not clear on what rules we're applying such that the above query
should fail, but:

  SELECT ''::text='  ';

should succeed. Unknown literals are OK, but unknown column references
are not? If that's the rule, can we catch that earlier, and throw an
error like 'column reference "b" has unknown type'?

Regards,
    Jeff Davis

Re: Failure to coerce unknown type to specific type

From
Robert Haas
Date:
On Wed, Apr 8, 2015 at 9:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
>> Original report and patch by Karl Schnaitter.
>> create table a(u) as select '1';
>
> We should, in fact, fail that to begin with.  Unknown-type columns are
> a spectactularly horrid idea.

I agree.  So why don't we throw an error when someone tries to create one?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Failure to coerce unknown type to specific type

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Apr 8, 2015 at 9:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Jeff Davis <pgsql@j-davis.com> writes:
>>> Original report and patch by Karl Schnaitter.
>>> create table a(u) as select '1';

>> We should, in fact, fail that to begin with.  Unknown-type columns are
>> a spectactularly horrid idea.

> I agree.  So why don't we throw an error when someone tries to create one?

We should IMO, but there's been push-back about backwards compatibility
when this has been proposed in the past.  But I'd rather break backwards
compatibility to the extent of saying "you can't do that" than to try to
make unknown a full-fledged type, which is what this patch wants to do.

I have some recollection that we'd also put it off pending resolution of
debates about how to handle unknown-type literals in UNIONs and similar
contexts.  But poking at such examples right now, the behavior seems
generally reasonable: it seems like we resolve "unknown" as text when
forced to make a decision, but otherwise put it off as long as possible.
So that consideration may be obsolete.

An alternative design that is also worthy of consideration is to force
the created view or table column to be type text rather than unknown.
But that would be more complex to implement, and it's not obviously
superior to saying "hey bud, you need to be more specific as to what
column type you intend here".

            regards, tom lane

Re: Failure to coerce unknown type to specific type

From
Robert Haas
Date:
On Wed, Apr 29, 2015 at 6:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> We should, in fact, fail that to begin with.  Unknown-type columns are
>>> a spectactularly horrid idea.
>
>> I agree.  So why don't we throw an error when someone tries to create one?
>
> We should IMO, but there's been push-back about backwards compatibility
> when this has been proposed in the past.

I do think that there are probably people who have got cruft lying
around in their database where they've accidentally created views or
tables with unknown-type columns.  They are unlikely to be actually
used, but they may exist.  If we want to ease the
backward-compatibility pain, maybe we could map unknown -> text, so
that the upgrade still works but changes the column type under the
hood.  Or we can just break it.  But we should do something.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Failure to coerce unknown type to specific type

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I have some recollection that we'd also put it off pending
> resolution of debates about how to handle unknown-type literals
> in UNIONs and similar contexts.  But poking at such examples
> right now, the behavior seems generally reasonable: it seems like
> we resolve "unknown" as text when forced to make a decision, but
> otherwise put it off as long as possible. So that consideration
> may be obsolete.

I recall two constructs that we had in production that caused some
pain moving to PostgreSQL.

Here's one:

test=# create table x (d date);
CREATE TABLE
test=# insert into x values (null);
INSERT 0 1
test=# insert into x values (coalesce(null, null));
ERROR:  column "d" is of type date but expression is of type text
LINE 1: insert into x values (coalesce(null, null));
^
HINT:  You will need to rewrite or cast the expression.

I know these worked in Sybase ASE, SAP DB, MySQL MaxdDB, IBM OS/2
EE's port of DB2, and early versions of MS SQL Server.  I have
confirmed (using SQL Fiddle) that it works in Oracle 11g R2, MySQL
5.5 and 5.6, and SQLite (SQL.js).  Interestingly, MS SQL Server
2014 now throws this error:

At least one of the arguments to COALESCE must be an expression that is not the NULL constant.

Here the other:

test=# select null as ts union all select null union all select now();
ERROR:  UNION types text and timestamp with time zone cannot be matched
LINE 1: ...ect null as ts union all select null union all select now();
^
test=# create table n (id int not null);
CREATE TABLE
test=# insert into n values (1);
INSERT 0 1
test=# select null as ts from n
test-# union all
test-# select null from n
test-# union all
test-# select 1 from n;
ERROR:  UNION types text and integer cannot be matched
LINE 5: select 1 from n;
^

This runs in *all* of the above environments.

I don't know of any other database product which chokes on the above.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Failure to coerce unknown type to specific type

From
Jeff Davis
Date:
On Wed, 2015-04-29 at 15:37 -0700, Tom Lane wrote:
> We should IMO, but there's been push-back about backwards compatibility
> when this has been proposed in the past.  But I'd rather break backwards
> compatibility to the extent of saying "you can't do that" than to try to
> make unknown a full-fledged type, which is what this patch wants to do.

My intention was to make can_coerce_type and coerce_type consistent --
right now, coerce_type can fail after can_coerce_type returns true.

(I also wanted to improve the composability of subqueries, but I got
enough resistance that I'm setting that argument aside.)

It really has nothing to do with creating real tables with real columns
of type unknown.

   => select u=t from (select 'x' as u, 'y'::text as t) s;
   ERROR:  failed to find conversion function from unknown to text

That's an elog (not an ereport, just plain elog) for what is a
high-level query compilation error of a fairly sane-looking query, which
seems wrong.

The code comment above the error says that the caller blew it, but as
far as I can tell there's nothing the caller can do about it. That seems
wrong, too.

Regards,
    Jeff Davis

Re: Failure to coerce unknown type to specific type

From
Robert Haas
Date:
On Fri, May 1, 2015 at 1:50 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> On Wed, 2015-04-29 at 15:37 -0700, Tom Lane wrote:
>> We should IMO, but there's been push-back about backwards compatibility
>> when this has been proposed in the past.  But I'd rather break backwards
>> compatibility to the extent of saying "you can't do that" than to try to
>> make unknown a full-fledged type, which is what this patch wants to do.
>
> My intention was to make can_coerce_type and coerce_type consistent --
> right now, coerce_type can fail after can_coerce_type returns true.
>
> (I also wanted to improve the composability of subqueries, but I got
> enough resistance that I'm setting that argument aside.)
>
> It really has nothing to do with creating real tables with real columns
> of type unknown.
>
>    => select u=t from (select 'x' as u, 'y'::text as t) s;
>    ERROR:  failed to find conversion function from unknown to text
>
> That's an elog (not an ereport, just plain elog) for what is a
> high-level query compilation error of a fairly sane-looking query, which
> seems wrong.
>
> The code comment above the error says that the caller blew it, but as
> far as I can tell there's nothing the caller can do about it. That seems
> wrong, too.

I agree.  I'm not sure what the right fix is, but that query should
probably work, and if it must fail, it certainly shouldn't elog().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Failure to coerce unknown type to specific type

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, May 1, 2015 at 1:50 AM, Jeff Davis <pgsql@j-davis.com> wrote:
>> => select u=t from (select 'x' as u, 'y'::text as t) s;
>> ERROR:  failed to find conversion function from unknown to text
>>
>> That's an elog (not an ereport, just plain elog) for what is a
>> high-level query compilation error of a fairly sane-looking query, which
>> seems wrong.

> I agree.  I'm not sure what the right fix is, but that query should
> probably work, and if it must fail, it certainly shouldn't elog().

What really ought to happen here, IMO, is that the output columns of the
sub-select ought to get resolved to non-unknown types while we are doing
parse analysis of the sub-select.

I believe the core reason why we haven't done this ages ago is that
currently, the "right thing" will happen if you do this:

       insert into t (a, b) select x, 'foo' from s;

namely that after the sub-select is parsed, the INSERT will reach down and
coerce the unknown literal to the datatype of b.  If we force the output
of the sub-select to text earlier, that will stop working (or at least, it
will only work in cases where there's an assignment cast from text to b's
type).  That's a horrid kluge, but if memory serves it's required to
handle some case the SQL spec expects to work.

It's conceivable that we could preserve the desirable aspects of
INSERT/SELECT while eliminating "unknown" outputs from sub-selects,
if INSERT were to pass down some context saying "here are the target
column types I want", and then the resolution rule in SELECT target list
processing would be "if an output column is UNKNOWN, coerce it to the
target type provided by context if any, otherwise coerce to text".
Not sure how much overhead this would add ... probably not very much,
since INSERT will have to identify the target column types sooner or
later.

            regards, tom lane

Re: Failure to coerce unknown type to specific type

From
Jeff Davis
Date:
On Fri, 2015-05-01 at 10:08 -0700, Tom Lane wrote:
> What really ought to happen here, IMO, is that the output columns of the
> sub-select ought to get resolved to non-unknown types while we are doing
> parse analysis of the sub-select.

So, what would happen for something like:

  select u+i from (select '1' as u, '2'::int as i) s;

?

I can see two possibilities:

1. Resolve "u" from the subselect as text, and later fail to find a
match for +(text,int).
2. Resolve +(unknown, int) to +(int, int) first, then inform the
subselect that it's looking for an int (in the same way that you propose
the insert pass down some context).

I don't think the second one really makes sense though. For example:

  select u+i, u||'suffix'::text from (select '1' as u, '2'::int as i) s;

In that case, "+" would be resolved to +(int, int) and || would be
resolved to ||(text, text). But "u" from the subselect can't be both an
int and text.

Then again, we probably want to fail a query like that anyway. So maybe
it does make sense as long as we can figure out a single type for "u",
and we fail otherwise.

Regards,
    Jeff Davis

Re: Failure to coerce unknown type to specific type

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> On Fri, 2015-05-01 at 10:08 -0700, Tom Lane wrote:
>> What really ought to happen here, IMO, is that the output columns of the
>> sub-select ought to get resolved to non-unknown types while we are doing
>> parse analysis of the sub-select.

> So, what would happen for something like:
>   select u+i from (select '1' as u, '2'::int as i) s;

I don't think there's any useful alternative to failing on this type of
case.  You can't realistically postpone resolution of the subquery output
types long enough for outer-level expression resolution to provide
context.  Even if you could, the behavior wouldn't be very well defined,
because (as you note) there might be more than one such expression leading
to contradictory results.

I think it's reasonable to try to let context inform resolution of the
subquery output types where there is exactly one immediate source of
context, such as the INSERT/SELECT case or UNION/INTERSECT/EXCEPT cases.
(Upthread, Kevin whines about our handling of UNION, but that's possibly
fixable.)

            regards, tom lane

Re: Failure to coerce unknown type to specific type

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> I recall two constructs that we had in production that caused some
> pain moving to PostgreSQL.

> Here's one:

> test=# insert into x values (coalesce(null, null));
> ERROR:  column "d" is of type date but expression is of type text

I don't have a lot of sympathy for that one.  coalesce(null, null)
isn't legal at all per SQL spec, for essentially the reason SQL Server
gives:

> At least one of the arguments to COALESCE must be an expression that is not the NULL constant.

Otherwise the result type of coalesce() isn't well-defined, and there is
nothing at all in the spec that would suggest looking to surrounding
context to decide that.  Our choice to resolve it as text rather than
failing is admittedly a bit arbitrary, but I don't find it unreasonable.

> Here the other:

> test=# select null as ts union all select null union all select now();
> ERROR:  UNION types text and timestamp with time zone cannot be matched

Yeah, this one is a bit annoying, especially considering we do get it
right in related cases:

regression=# select null as ts union all (select null union all select now());
              ts
-------------------------------


 2015-05-03 13:05:30.639594-04
(3 rows)

It's possible this could be fixed with some rejiggering of parse analysis
so that matching of output-column types is performed across a whole
set-operation tree at once rather than on binary pairs of leaf queries.

On the other hand, a case could be made that such behavior would also be
in violation of the standard, which is perfectly clear that you process
set operations as binary pairs not holistically.  There would certainly
be some compatibility risk involved in changing the resolution behavior
like that, especially for cases where the type choice affects the set
operation's behavior significantly.

            regards, tom lane

Re: Failure to coerce unknown type to specific type

From
Jeff Davis
Date:
On Sun, 2015-05-03 at 13:00 -0400, Tom Lane wrote:
> I don't think there's any useful alternative to failing on this type of
> case.  You can't realistically postpone resolution of the subquery output
> types long enough for outer-level expression resolution to provide
> context.  Even if you could, the behavior wouldn't be very well defined,
> because (as you note) there might be more than one such expression leading
> to contradictory results.
>
> I think it's reasonable to try to let context inform resolution of the
> subquery output types where there is exactly one immediate source of
> context, such as the INSERT/SELECT case or UNION/INTERSECT/EXCEPT cases.
> (Upthread, Kevin whines about our handling of UNION, but that's possibly
> fixable.)

OK, your proposal sounds good to me then.

Regards,
    Jeff Davis

Re: Failure to coerce unknown type to specific type

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
>> On Fri, 2015-05-01 at 10:08 -0700, Tom Lane wrote:
>>> What really ought to happen here, IMO, is that the output
>>> columns of the sub-select ought to get resolved to non-unknown
>>> types while we are doing parse analysis of the sub-select.
>>
>> So, what would happen for something like:
>>  select u+i from (select '1' as u, '2'::int as i) s;
>
> I don't think there's any useful alternative to failing on this
> type of case.  You can't realistically postpone resolution of the
> subquery output types long enough for outer-level expression
> resolution to provide context.

According to SQL Fiddle: MySQL, SQL Lite, and MS SQL Server all
come up with 3 as the answer.  Oracle 11g R2 (with a one-row table
to avoid the "FROM keyword not found where expected" error) gives
this rather cryptic message:

  ORA-00933: SQL command not properly ended

If I pull out the subquery and run it by itself Oracle gives this:

  Invalid SQL type: sqlKind = UNINITIALIZED

> Even if you could, the behavior wouldn't be very well defined,
> because (as you note) there might be more than one such
> expression leading to contradictory results.

Do you have a simple example of what you mean?  I'm kinda curious
whether the products that manage to handle the above give a sane
error when it becomes ambiguous, or whether they fail in confusing
ways.  It's not like I feel that we need to support a statement
just because other products do, but when I respond to complaints
from users trying to migrate to PostgreSQL, it is useful to have
examples to demonstrate the down side of what other products are
doing.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Failure to coerce unknown type to specific type

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Jeff Davis <pgsql@j-davis.com> writes:
>>> So, what would happen for something like:
>>> select u+i from (select '1' as u, '2'::int as i) s;

>> I don't think there's any useful alternative to failing on this
>> type of case.  You can't realistically postpone resolution of the
>> subquery output types long enough for outer-level expression
>> resolution to provide context.

> According to SQL Fiddle: MySQL, SQL Lite, and MS SQL Server all
> come up with 3 as the answer.

Really?  The :: syntax is a Postgres-ism, so you surely didn't test
this query on those.

>> Even if you could, the behavior wouldn't be very well defined,
>> because (as you note) there might be more than one such
>> expression leading to contradictory results.

> Do you have a simple example of what you mean?

Jeff already pointed out the issue, but consider

select u+i from (select '1' as u, '2'::int as i) s where u<'foo'::text;

At the very least such a query would behave differently depending on
whether we process the outer query's WHERE clause before or after its
SELECT output list.

            regards, tom lane

Re: Failure to coerce unknown type to specific type

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Kevin Grittner <kgrittn@ymail.com> writes:
>> I recall two constructs that we had in production that caused some
>> pain moving to PostgreSQL.
>
>> Here's one:
>
>> test=# insert into x values (coalesce(null, null));
>> ERROR:  column "d" is of type date but expression is of type text
>
> I don't have a lot of sympathy for that one.  coalesce(null, null)
> isn't legal at all per SQL spec

I don't get that from my reading of the SQL spec.  A COALESCE
clause is (and always has been) considered a short form of the CASE
clause (not to be mistaken for a function, for example).  The spec
section 6.11 1) c) very explicitly requires
  COALESCE(NULL, NULL)
be the exact equivalent of
  CASE WHEN NULL IS NOT NULL THEN NULL ELSE NULL END

Yet in PostgreSQL the long form of the CASE clause returns the same
thing as a bare NULL, while the short form (COALESCE) gives an
error.  Please indicate what in the spec makes you think that
COALESCE(NULL, NULL) should ever be treated differently from a bare
NULL, because I've looked at the spec and I'm not seeing anything
to support what you said.

> Otherwise the result type of coalesce() isn't well-defined, and there is
> nothing at all in the spec that would suggest looking to surrounding
> context to decide that.

The definition of COALESCE says that when there are different types
the result type should be determined according to section 9.3
(Result of data type combinations).  Because the organization of
our code doesn't lend itself well to conforming to the standard in
that regard, I realize that we are dealing in practical
compromises; but let's not pretend the spec is not clear about
this.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Failure to coerce unknown type to specific type

From
Kevin Grittner
Date:
Kevin Grittner <kgrittn@ymail.com> wrote:

> The spec section 6.11 1) c) very explicitly requires
>   COALESCE(NULL, NULL)
> be the exact equivalent of
>   CASE WHEN NULL IS NOT NULL THEN NULL ELSE NULL END
>
> Yet in PostgreSQL the long form of the CASE clause returns the
> same thing as a bare NULL, while the short form (COALESCE) gives
> an error.

Never mind; I got confused in my testing -- they both return NULL
of type text.  I'll dig on that one some more.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Failure to coerce unknown type to specific type

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't have a lot of sympathy for that one.  coalesce(null, null)
>> isn't legal at all per SQL spec

> I don't get that from my reading of the SQL spec.  A COALESCE
> clause is (and always has been) considered a short form of the CASE
> clause (not to be mistaken for a function, for example).  The spec
> section 6.11 1) c) very explicitly requires
>   COALESCE(NULL, NULL)
> be the exact equivalent of
>   CASE WHEN NULL IS NOT NULL THEN NULL ELSE NULL END

Sure, and that isn't legal per spec either (see below).

> Yet in PostgreSQL the long form of the CASE clause returns the same
> thing as a bare NULL, while the short form (COALESCE) gives an
> error.

Hm?  I get the same thing for either variant:

regression=# create table x (d date);
CREATE TABLE
regression=# insert into x values (coalesce(null, null));
ERROR:  column "d" is of type date but expression is of type text
LINE 1: insert into x values (coalesce(null, null));
                              ^
HINT:  You will need to rewrite or cast the expression.
regression=# insert into x values ( CASE WHEN NULL IS NOT NULL THEN NULL ELSE NULL END );
ERROR:  column "d" is of type date but expression is of type text
LINE 1: insert into x values ( CASE WHEN NULL IS NOT NULL THEN NULL ...
                               ^
HINT:  You will need to rewrite or cast the expression.


The reason these things aren't legal per spec is that the spec says that
a bare NULL keyword is a <contextually typed value specification> a/k/a
<implicitly typed value specification>, and those are only valid in
situations where a type can be inferred from the *immediate* context.
For example,
    insert into x values (null);
is legal because the source of an INSERT can be a
<contextually typed table value constructor> which is a VALUES clause
that can contain a <contextually typed value specification>.  On the
other hand, result subexpressions of a CASE are just <value expression>s,
and you will not find any production that allows a bare NULL literal
to be a <value expression>.  So far as I can find in SQL:2008, the
only contexts where <contextually typed anything> is syntactically
legal are (1) INSERT, MERGE, and UPDATE source expressions, (2) CAST
source expressions, and (3) table-column DEFAULT expressions, all of
which have a well-defined target type available from the immediately
surrounding semantic context.

In short, the text of the spec only allows a bare NULL literal as
the immediate argument of one of those constructs.  Allowing it in
any other context is an extension.

The spec doesn't have a notion of unknown-type literal in the same way
we do, but we've modeled our handling of those as being like bare NULL
literals for type resolution purposes.

Our choice has been to resolve as text in situations where there is no
other info available from immediate context.  I agree with the spec
that it would be a bad idea to allow "action at a distance" in the
sense of allowing such info to propagate through multiple levels
of semantic context.

            regards, tom lane

Re: Failure to coerce unknown type to specific type

From
Kevin Grittner
Date:
Kevin Grittner <kgrittn@ymail.com> wrote:

> they both return NULL of type text.

Section 9.3, which the definition of COALESCE references as the
way to resolve type conflicts *starts* with this:

| Let IDTS be the set of data types specified in an application of
| this Subclause. Let DTS be the set of data types in IDTS
| excluding any data types that are undefined. If the cardinality
| of DTS is 0 (zero), then the result data type is undefined and no
| further Rules of this Subclause are evaluated.

That's pretty unambiguous that a COALESCE(NULL, NULL) clause should
yield a NULL with the data type undefined.  So, if we're going by
the SQL spec, that is what it would take to conform.  It's hardly a
crisis that we don't conform on this point, but it is a fact.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Failure to coerce unknown type to specific type

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Section 9.3, which the definition of COALESCE references as the
> way to resolve type conflicts *starts* with this:

> | Let IDTS be the set of data types specified in an application of
> | this Subclause. Let DTS be the set of data types in IDTS
> | excluding any data types that are undefined. If the cardinality
> | of DTS is 0 (zero), then the result data type is undefined and no
> | further Rules of this Subclause are evaluated.

> That's pretty unambiguous that a COALESCE(NULL, NULL) clause should
> yield a NULL with the data type undefined.

This is irrelevant, because such a construct fails the syntax rules
and thus we never get to the question of what type should be inferred,
at least not without going outside the spec.  See my other reply.

            regards, tom lane

Re: Failure to coerce unknown type to specific type

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> This is irrelevant, because such a construct fails the syntax rules
> and thus we never get to the question of what type should be inferred,
> at least not without going outside the spec.  See my other reply.

Yeah, our posts have been crossing a bit.  The point about <value
expression> not allowing a NULL literal is valid.  I yield on that
regarding COALESCE within the spec.  It is an extension to the spec
to allow a NULL literal within a COALESCE clause at all.  We would
surely break a lot of working code to forbid it, though.  If we
*are* going to allow it, it would be pretty confusing to have it
behave differently that what I previously outlined (regarding the
equivalent long form CASE clause).

The <result> from a long form of the CASE clause explicitly does
explicitly allow an untyped NULL literal, and forcing it to text is
wrong per section 9.3.

To save an extra post -- I did modify the statements in SQL Fiddle
to get to the point where the subquery returned a column without a
type and a column with an int type in the dialect supported.  I'm
not sure how that's relevant to the issue about how they resolve
that in the outer query, but I can post the form of the query used
for each product if you think it is germane.

To restate it, this hardly seems like the most important issue to
address; I just don't think the standard gives us much cover here.
What we do for the CASE clause is clearly wrong per spec, and if we
allow a bare NULL in a COALESCE clause it would be crazy not to
have the behavior match CASE.  But it is clearly good form to
always cast a NULL literal to some type, and that is a workaround
which should not be too painful for most people.  We shouldn't rush
to do anything big here, but we should recognize where we stand.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Failure to coerce unknown type to specific type

From
Tom Lane
Date:
I wrote:
> The reason these things aren't legal per spec is that the spec says that
> a bare NULL keyword is a <contextually typed value specification> a/k/a
> <implicitly typed value specification>, and those are only valid in
> situations where a type can be inferred from the *immediate* context.
> For example,
>     insert into x values (null);
> is legal because the source of an INSERT can be a
> <contextually typed table value constructor> which is a VALUES clause
> that can contain a <contextually typed value specification>.  On the
> other hand, result subexpressions of a CASE are just <value expression>s,
> and you will not find any production that allows a bare NULL literal
> to be a <value expression>.  So far as I can find in SQL:2008, the
> only contexts where <contextually typed anything> is syntactically
> legal are (1) INSERT, MERGE, and UPDATE source expressions, (2) CAST
> source expressions, and (3) table-column DEFAULT expressions, all of
> which have a well-defined target type available from the immediately
> surrounding semantic context.

On doing a more thorough search, I see that I missed one reference:
the result expression(s) of a CASE construct are defined as

    <result> ::= <result expression>
                 | NULL
    <result expression> ::= <value expression>

which might seem to put the lie to my thesis, except that in the
Syntax Rules we read

    At least one <result> in a <case specification> shall specify
    a <result expression>.

That means this is legal:

    CASE WHEN ... THEN 42 ELSE NULL END;

but this isn't:

    CASE WHEN ... THEN NULL ELSE NULL END;

So the core point stands.  The spec doesn't allow a bare NULL literal
anywhere that its type can't be determined from the most closely nested
semantic context.

It's a bit weird that they encode this as a syntax rule rather than a
semantic rule, but that's what they did.

            regards, tom lane

Re: Failure to coerce unknown type to specific type

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This is irrelevant, because such a construct fails the syntax rules
>> and thus we never get to the question of what type should be inferred,
>> at least not without going outside the spec.  See my other reply.

> ...  It is an extension to the spec
> to allow a NULL literal within a COALESCE clause at all.  We would
> surely break a lot of working code to forbid it, though.

Actually, per my last reply, what's illegal per spec is for *all* the arms
to be NULL literals (so SQL Server is indeed enforcing the spec exactly).
As long as there's an arm with an identifiable type, the CASE's result
type can be determined.

> If we
> *are* going to allow it, it would be pretty confusing to have it
> behave differently that what I previously outlined (regarding the
> equivalent long form CASE clause).

AFAICT, we do treat them the same; can you provide an example where
we don't?

> To save an extra post -- I did modify the statements in SQL Fiddle
> to get to the point where the subquery returned a column without a
> type and a column with an int type in the dialect supported.  I'm
> not sure how that's relevant to the issue about how they resolve
> that in the outer query, but I can post the form of the query used
> for each product if you think it is germane.

It may not be.  I suspect what is really going on is that they're
resolving the sub-SELECT output column to TEXT (or local equivalent
idiom) and then being laxer than we are about coercing that type to
other types.  It would be interesting to try variants of the

    select u+i from (select '1' as u, '2'::int as i) s where u<'foo'::text;

example to see what they do if the column has to be converted to two
mutually inconsistent types, assuming you can find candidate types
in each system.  Another idea would be to try things like

    select u+i from (select 'bar' as u, '2'::int as i) s where u<'foo'::text;

and see exactly what error gets thrown.

> To restate it, this hardly seems like the most important issue to
> address; I just don't think the standard gives us much cover here.

I stand by my opinion that the cases that are controversial here
are all illegal per spec.  We may well want to allow them on usability
grounds, but what the spec does *not* provide any cover for is claiming
that the spec requires some particular non-error interpretation.

            regards, tom lane

Re: Failure to coerce unknown type to specific type

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Kevin Grittner <kgrittn@ymail.com> writes:

>> If we
>> *are* going to allow it, it would be pretty confusing to have it
>> behave differently that what I previously outlined (regarding the
>> equivalent long form CASE clause).
>
> AFAICT, we do treat them the same; can you provide an example where
> we don't?

No, I was just arguing that if we change the CASE clause to return
untyped NULL, then the COALESCE clause should follow suit.

> It would be interesting to try variants of the
>
>     select u+i from (select '1' as u, '2'::int as i) s where u<'foo'::text;
>
> example to see what they do if the column has to be converted to two
> mutually inconsistent types, assuming you can find candidate types
> in each system.

I finagled a syntax that was accepted by all servers on SQL Fiddle
and got a row with a 3 without the WHERE clause from all products
except PostgreSQL:

-- Build Schema
create table onerow (n int not null);
insert into onerow values (1);
-- Run SQL
select u+i from (select cast('1' as char) as u, 2 as i from onerow) s;

Then I added that WHERE clause.

select u+i from (select cast('1' as char) as u, 2 as i from onerow) s
where u<cast('foo' as char);


Much to my amazement, all of them *still* return a row with the
value 3, without error.  I'm still picking my jaw up from the floor.

I'm OK with being in the minority on that!

> Another idea would be to try things like
>
>     select u+i from (select 'bar' as u, '2'::int as i) s where u<'foo'::text;
>
> and see exactly what error gets thrown.

I changed '1' to 'bar' in the above code.

MySQL and SQL Lite return a row with a 2.

Oracle throws an error: ORA-01722: invalid number

MS SQL Server throws an error: Conversion failed when converting the varchar value 'bar ' to data type int.


Yes, my literal was three characters and the error message added a space.


>> To restate it, this hardly seems like the most important issue to
>> address; I just don't think the standard gives us much cover here.
>
> I stand by my opinion that the cases that are controversial here
> are all illegal per spec.

With that last bit you pointed out, I now agree.

> We may well want to allow them on usability
> grounds, but what the spec does *not* provide any cover for is claiming
> that the spec requires some particular non-error interpretation.

ok

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company