Thread: Re: [BUGS] Failure to coerce unknown type to specific type

Re: [BUGS] Failure to coerce unknown type to specific type

From
Jeff Davis
Date:
Moving thread to -hackers.

On Wed, Apr 8, 2015 at 11:18 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> That example was just for illustration. My other example didn't require
> creating a table at all:
>
>   SELECT a=b FROM (SELECT ''::text, '  ') x(a,b);
>
> it's fine with me if we want that to fail, but I don't think we're
> failing in the right place, or with the right error message.
>
> I'm not clear on what rules we're applying such that the above query
> should fail, but:
>
>   SELECT ''::text='  ';
>
> should succeed. Unknown literals are OK, but unknown column references
> are not? If that's the rule, can we catch that earlier, and throw an
> error like 'column reference "b" has unknown type'?

Is the behavior of unknown literals vs. unknown column references
documented anywhere? I tried looking here:
http://www.postgresql.org/docs/devel/static/typeconv.html, but it
doesn't seem to make the distinction between how unknown literals vs.
unknown column references are handled.

My understanding until now has been that unknown types are a
placeholder while still inferring the types. But that leaves a few
questions:

1. Why do we care whether the unknown is a literal or a column reference?
2. Unknown column references seem basically useless, because we will
almost always encounter the "failure to find conversion" error, so why
do we allow them?
3. If unknowns are just a placeholder, why do we return them to the
client? What do we expect the client to do with it?

Regards,   Jeff Davis



Re: [BUGS] Failure to coerce unknown type to specific type

From
"David G. Johnston"
Date:
My apologies if much of this is already assumed knowledge by most -hackers...I'm trying to learn from observation instead of, largely, reading code in a foreign language.

On Wed, Apr 22, 2015 at 6:40 PM, Jeff Davis <pgsql@j-davis.com> wrote:
Moving thread to -hackers.

On Wed, Apr 8, 2015 at 11:18 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> That example was just for illustration. My other example didn't require
> creating a table at all:
>
>   
​​
​​
SELECT a=b FROM (SELECT ''::text, '  ') x(a,b);
>
> it's fine with me if we want that to fail, but I don't think we're
> failing in the right place, or with the right error message.
>
> I'm not clear on what rules we're applying such that the above query
> should fail, but:
>
>   
​​
SELECT ''::text='  ';
 
>
> should succeed. Unknown literals are OK, but unknown column references
> are not? If that's the rule, can we catch that earlier, and throw an
> error like 'column reference "b" has unknown type'?

But the fact that column "b" has the data type "unknown" is only a warning - not an error.

This seems to be a case of the common problem (or, at least recently mentioned) where type conversion only deals with data and not context.


Additional hinting regarding the column containing the offending data would be welcomed by the community - but I suspect it is a non-trivial endeavor.


Is the behavior of unknown literals vs. unknown column references
documented anywhere? I tried looking here:
http://www.postgresql.org/docs/devel/static/typeconv.html, but it
doesn't seem to make the distinction between how unknown literals vs.
unknown column references are handled.

My understanding until now has been that unknown types are a
placeholder while still inferring the types. But that leaves a few
questions:

1. Why do we care whether the unknown is a literal or a column reference?

Apparently the difference is in when non-implicit casts can be used for coercion - or, rather, when input functions can be used instead of casting functions.

in ​SELECT '  '::text = 'a' the explicit cast between the implicit unknown and text is used while going through the subquery forces the planner to locate an implicit cast between the explicit unknown and text.  

​The following fails no matter what you try because no casts exist from unknown to integer:

​​SELECT a::int=b FROM (SELECT '1', 1) x(a,b);

but this too works - which is why the implicit cast concept above fails (I'm leaving it since the thought process may help in understanding):

SELECT 1 = '1';

From which I infer that an unknown literal is allowed to be fed directly into a type's input function to facilitate a direct coercion.
Writing this makes me wish for more precise terminology...is there something already established here?  "untyped" versus "unknown" makes sense to me.  untyped literals only exist within the confines of a single node and can be passed through a type's input function to make them take on a type.  If the untyped reference passes through the node without having been assigned an explicit type it is assigned the unknown type.

2. Unknown column references seem basically useless, because we will
almost always encounter the "failure to find conversion" error, so why
do we allow them?

At this point...backward compatibility?

I do get a warning in psql (9.3.6) from your original -bugs example

create table a(u) as select '1';

WARNING: "column "u" has type "unknown"​
DETAIL:  Proceeding with relation creation anyway.

Related question: was there ever a time when the above failed instead of just supplying a warning?

My git-fu is not super strong but the above warning was last edited by Tom Lane back in 2003 (d8528630) though it was just a refactor - the warning was already present.  I suppose after 12 years the "why" doesn't matter so much...

create table b(i int);
insert into b select u from a;
ERROR:  failed to find conversion function from unknown to integer

Text appears to have a cast defined:

SELECT u::text FROM a;
 
3. If unknowns are just a placeholder, why do we return them to the
client? What do we expect the client to do with it?

​We do?​  I suspect that it is effectively treated as if it were text by client libraries.

​My gut reaction is if you feel strongly enough to add some additional documentation or warnings/hints/details related to this topic they probably would get put in; but disallowing "unknown" as first-class type is likely to fail to pass a cost-benefit evaluation.

Distinguishing between "untyped" literals and "unknown type" literals seems promising concept to aid in understanding the difference in the face of not being able (or wanting) to actually change the behavior.

David J.

Re: [BUGS] Failure to coerce unknown type to specific type

From
Jeff Davis
Date:
On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:

> But the fact that column "b" has the data type "unknown" is only a
> warning - not an error.
> 
I get an error:

postgres=# SELECT '  '::text = 'a';?column? 
----------f
(1 row)

postgres=# SELECT a=b FROM (SELECT ''::text, '  ') x(a,b);
ERROR:  failed to find conversion function from unknown to text

So that means the column reference "b" is treated differently than the
literal. Here I don't mean a reference to an actual column of a real
table, just an identifier ("b") that parses as a columnref.

Creating the table gives you a warning (not an error), but I think that
was a poor example for me to choose, and not important to my point.
> 
> This seems to be a case of the common problem (or, at least recently
> mentioned) where type conversion only deals with data and not context.
> 
> 
> http://www.postgresql.org/message-id/CADx9qBmVPQvSH3
> +2cH4cwwPmphW1mE18e=WUmLFUC-QZ-t7Q6Q@mail.gmail.com
> 
> 
I think that is a different problem. That's a runtime type conversion
error (execution time), and I'm talking about something happening at
parse analysis time.

> 
> but this too works - which is why the implicit cast concept above
> fails (I'm leaving it since the thought process may help in
> understanding):
> 
> 
> SELECT 1 = '1';
> 
> 
> From which I infer that an unknown literal is allowed to be fed
> directly into a type's input function to facilitate a direct coercion.

Yes, I believe that's what's happening. When we use an unknown literal,
it's acting more like a value constructor and will pass it to the type
input function. When it's a columnref, even if unknown, it tries to cast
it and fails.

But that is very confusing. In the example at the top of this email, it
seems like the second query should be equivalent to the first, or even
that postgres should be able to rewrite the second into the first. But
the second query fails where the first succeeds.


> At this point...backward compatibility?

Backwards compatibility of what queries? I guess the ones that return
unknowns to the client or create tables with unknown columns?

> create table a(u) as select '1';
> 
> 
> WARNING: "column "u" has type "unknown"​
> DETAIL:  Proceeding with relation creation anyway.
> 
> 
> Related question: was there ever a time when the above failed instead
> of just supplying a warning?

Not that I recall.



> ​My gut reaction is if you feel strongly enough to add some additional
> documentation or warnings/hints/details related to this topic they
> probably would get put in; but disallowing "unknown" as first-class
> type is likely to fail to pass a cost-benefit evaluation.

I'm not proposing that we eliminate unknown. I just think columnrefs and
literals should behave consistently. If we really don't want unknown
columnrefs, it seems like we could at least throw a better error.

If we were starting from scratch, I'd also not return unknown to the
client, but we have to worry about the backwards compatibility.

> Distinguishing between "untyped" literals and "unknown type" literals
> seems promising concept to aid in understanding the difference in the
> face of not being able (or wanting) to actually change the behavior.

Not sure I understand that proposal, can you elaborate?

Regards,Jeff Davis






Re: [BUGS] Failure to coerce unknown type to specific type

From
Kyotaro HORIGUCHI
Date:
Hello, I think this is a bug.

The core of this problem is that coerce_type() fails for Var of
type UNKNOWNOID.

The comment for the function says that,

> * The caller should already have determined that the coercion is possible;
> * see can_coerce_type.

But can_coerce_type() should say it's possible to convert from
unknown to any type as it doesn't see the target node type. I
think this as an inconsistency between can_coerce_type and
coerce_type. So making this consistent would be right way.

Concerning only this issue, putting on-the-fly conversion for
unkown nonconstant as attached patch worked for me. I'm not so
confident on this, though..

regards,

At Wed, 22 Apr 2015 23:26:43 -0700, Jeff Davis <pgsql@j-davis.com> wrote in <1429770403.4604.22.camel@jeff-desktop>
> On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:
> 
> > But the fact that column "b" has the data type "unknown" is only a
> > warning - not an error.
> > 
> I get an error:
> 
> postgres=# SELECT '  '::text = 'a';
>  ?column? 
> ----------
>  f
> (1 row)
> 
> postgres=# SELECT a=b FROM (SELECT ''::text, '  ') x(a,b);
> ERROR:  failed to find conversion function from unknown to text
> 
> So that means the column reference "b" is treated differently than the
> literal. Here I don't mean a reference to an actual column of a real
> table, just an identifier ("b") that parses as a columnref.
> 
> Creating the table gives you a warning (not an error), but I think that
> was a poor example for me to choose, and not important to my point.
> > 
> > This seems to be a case of the common problem (or, at least recently
> > mentioned) where type conversion only deals with data and not context.
> > 
> > 
> > http://www.postgresql.org/message-id/CADx9qBmVPQvSH3
> > +2cH4cwwPmphW1mE18e=WUmLFUC-QZ-t7Q6Q@mail.gmail.com
> > 
> > 
> I think that is a different problem. That's a runtime type conversion
> error (execution time), and I'm talking about something happening at
> parse analysis time.
> 
> > 
> > but this too works - which is why the implicit cast concept above
> > fails (I'm leaving it since the thought process may help in
> > understanding):
> > 
> > 
> > SELECT 1 = '1';
> > 
> > 
> > From which I infer that an unknown literal is allowed to be fed
> > directly into a type's input function to facilitate a direct coercion.
> 
> Yes, I believe that's what's happening. When we use an unknown literal,
> it's acting more like a value constructor and will pass it to the type
> input function. When it's a columnref, even if unknown, it tries to cast
> it and fails.
> 
> But that is very confusing. In the example at the top of this email, it
> seems like the second query should be equivalent to the first, or even
> that postgres should be able to rewrite the second into the first. But
> the second query fails where the first succeeds.
> 
> 
> > At this point...backward compatibility?
> 
> Backwards compatibility of what queries? I guess the ones that return
> unknowns to the client or create tables with unknown columns?
> 
> > create table a(u) as select '1';
> > 
> > 
> > WARNING: "column "u" has type "unknown"​
> > DETAIL:  Proceeding with relation creation anyway.
> > 
> > 
> > Related question: was there ever a time when the above failed instead
> > of just supplying a warning?
> 
> Not that I recall.
> 
> 
> 
> > ​My gut reaction is if you feel strongly enough to add some additional
> > documentation or warnings/hints/details related to this topic they
> > probably would get put in; but disallowing "unknown" as first-class
> > type is likely to fail to pass a cost-benefit evaluation.
> 
> I'm not proposing that we eliminate unknown. I just think columnrefs and
> literals should behave consistently. If we really don't want unknown
> columnrefs, it seems like we could at least throw a better error.
> 
> If we were starting from scratch, I'd also not return unknown to the
> client, but we have to worry about the backwards compatibility.
> 
> > Distinguishing between "untyped" literals and "unknown type" literals
> > seems promising concept to aid in understanding the difference in the
> > face of not being able (or wanting) to actually change the behavior.
> 
> Not sure I understand that proposal, can you elaborate?

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
index a4e494b..b64d40b 100644
--- a/src/backend/parser/parse_coerce.c
+++ b/src/backend/parser/parse_coerce.c
@@ -221,7 +221,7 @@ coerce_type(ParseState *pstate, Node *node,            return node;        }    }
-    if (inputTypeId == UNKNOWNOID && IsA(node, Const))
+    if (inputTypeId == UNKNOWNOID)    {        /*         * Input is a string constant with previously undetermined
type.Apply
 
@@ -275,6 +275,29 @@ coerce_type(ParseState *pstate, Node *node,        targetType = typeidType(baseTypeId);
+        /* Perform on the fly conversion for non-constants */
+        if(!IsA(node, Const))
+        {
+            Form_pg_type typform = (Form_pg_type) GETSTRUCT(targetType);
+            Node *result = 
+                (Node*) makeFuncExpr(typform->typinput,
+                         targetTypeId,
+                         list_make3(node,
+                                    makeConst(OIDOID, -1, InvalidOid,
+                                              sizeof(Oid),
+                                              ObjectIdGetDatum(InvalidOid),
+                                              false, true),
+                                    makeConst(INT4OID, -1, InvalidOid,
+                                              sizeof(uint32),
+                                              Int32GetDatum(inputTypeMod),
+                                              false, true)),
+                          InvalidOid, InvalidOid,
+                          COERCE_IMPLICIT_CAST);
+            ReleaseSysCache(targetType);
+
+            return result;
+        }
+        newcon->consttype = baseTypeId;        newcon->consttypmod = inputTypeMod;        newcon->constcollid =
typeTypeCollation(targetType);

Re: [BUGS] Failure to coerce unknown type to specific type

From
Kyotaro HORIGUCHI
Date:
Sorry, the patch had obvious bug..

-+                                              Int32GetDatum(inputTypeMod),
++                                              Int32GetDatum(targetTypeMod),

regards,


> Hello, I think this is a bug.
> 
> The core of this problem is that coerce_type() fails for Var of
> type UNKNOWNOID.
> 
> The comment for the function says that,
> 
> > * The caller should already have determined that the coercion is possible;
> > * see can_coerce_type.
> 
> But can_coerce_type() should say it's possible to convert from
> unknown to any type as it doesn't see the target node type. I
> think this as an inconsistency between can_coerce_type and
> coerce_type. So making this consistent would be right way.
> 
> Concerning only this issue, putting on-the-fly conversion for
> unkown nonconstant as attached patch worked for me. I'm not so
> confident on this, though..
> 
> regards,
> 
> At Wed, 22 Apr 2015 23:26:43 -0700, Jeff Davis <pgsql@j-davis.com> wrote in <1429770403.4604.22.camel@jeff-desktop>
> > On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:
> > 
> > > But the fact that column "b" has the data type "unknown" is only a
> > > warning - not an error.
> > > 
> > I get an error:
> > 
> > postgres=# SELECT '  '::text = 'a';
> >  ?column? 
> > ----------
> >  f
> > (1 row)
> > 
> > postgres=# SELECT a=b FROM (SELECT ''::text, '  ') x(a,b);
> > ERROR:  failed to find conversion function from unknown to text

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
index a4e494b..2e3a43c 100644
--- a/src/backend/parser/parse_coerce.c
+++ b/src/backend/parser/parse_coerce.c
@@ -221,7 +221,7 @@ coerce_type(ParseState *pstate, Node *node,            return node;        }    }
-    if (inputTypeId == UNKNOWNOID && IsA(node, Const))
+    if (inputTypeId == UNKNOWNOID)    {        /*         * Input is a string constant with previously undetermined
type.Apply
 
@@ -275,6 +275,29 @@ coerce_type(ParseState *pstate, Node *node,        targetType = typeidType(baseTypeId);
+        /* Perform on the fly conversion for non-constants */
+        if(!IsA(node, Const))
+        {
+            Form_pg_type typform = (Form_pg_type) GETSTRUCT(targetType);
+            Node *result = 
+                (Node*) makeFuncExpr(typform->typinput,
+                         targetTypeId,
+                         list_make3(node,
+                                    makeConst(OIDOID, -1, InvalidOid,
+                                              sizeof(Oid),
+                                              ObjectIdGetDatum(InvalidOid),
+                                              false, true),
+                                    makeConst(INT4OID, -1, InvalidOid,
+                                              sizeof(uint32),
+                                              Int32GetDatum(targetTypeMod),
+                                              false, true)),
+                          InvalidOid, InvalidOid,
+                          COERCE_IMPLICIT_CAST);
+            ReleaseSysCache(targetType);
+
+            return result;
+        }
+        newcon->consttype = baseTypeId;        newcon->consttypmod = inputTypeMod;        newcon->constcollid =
typeTypeCollation(targetType);

Re: [BUGS] Failure to coerce unknown type to specific type

From
Kyotaro HORIGUCHI
Date:
Now I found a comment at just where I patched,

>  * XXX if the typinput function is not immutable, we really ought to
>  * postpone evaluation of the function call until runtime. But there
>  * is no way to represent a typinput function call as an expression
>  * tree, because C-string values are not Datums. (XXX This *is*
>  * possible as of 7.3, do we want to do it?)

Is it OK to *now* we can do this?

regards,

> Hello, I think this is a bug.
> 
> The core of this problem is that coerce_type() fails for Var of
> type UNKNOWNOID.
> 
> The comment for the function says that,
> 
> > * The caller should already have determined that the coercion is possible;
> > * see can_coerce_type.
> 
> But can_coerce_type() should say it's possible to convert from
> unknown to any type as it doesn't see the target node type. I
> think this as an inconsistency between can_coerce_type and
> coerce_type. So making this consistent would be right way.
> 
> Concerning only this issue, putting on-the-fly conversion for
> unkown nonconstant as attached patch worked for me. I'm not so
> confident on this, though..
> 
> regards,
> 
> At Wed, 22 Apr 2015 23:26:43 -0700, Jeff Davis <pgsql@j-davis.com> wrote in <1429770403.4604.22.camel@jeff-desktop>
> > On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:
> > 
> > > But the fact that column "b" has the data type "unknown" is only a
> > > warning - not an error.
> > > 
> > I get an error:
> > 
> > postgres=# SELECT '  '::text = 'a';
> >  ?column? 
> > ----------
> >  f
> > (1 row)
> > 
> > postgres=# SELECT a=b FROM (SELECT ''::text, '  ') x(a,b);
> > ERROR:  failed to find conversion function from unknown to text

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [BUGS] Failure to coerce unknown type to specific type

From
Kyotaro HORIGUCHI
Date:
Very sorry for the trash..

===
Now I found a comment at just where I patched,

>  * XXX if the typinput function is not immutable, we really ought to
>  * postpone evaluation of the function call until runtime. But there
>  * is no way to represent a typinput function call as an expression
>  * tree, because C-string values are not Datums. (XXX This *is*
>  * possible as of 7.3, do we want to do it?)

- Is it OK to *now* we can do this?
+ Is it OK to regard that we can do this *now*?

regards,

> Hello, I think this is a bug.
> 
> The core of this problem is that coerce_type() fails for Var of
> type UNKNOWNOID.
> 
> The comment for the function says that,
> 
> > * The caller should already have determined that the coercion is possible;
> > * see can_coerce_type.
> 
> But can_coerce_type() should say it's possible to convert from
> unknown to any type as it doesn't see the target node type. I
> think this as an inconsistency between can_coerce_type and
> coerce_type. So making this consistent would be right way.
> 
> Concerning only this issue, putting on-the-fly conversion for
> unkown nonconstant as attached patch worked for me. I'm not so
> confident on this, though..
> 
> regards,
> 
> At Wed, 22 Apr 2015 23:26:43 -0700, Jeff Davis <pgsql@j-davis.com> wrote in <1429770403.4604.22.camel@jeff-desktop>
> > On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:
> > 
> > > But the fact that column "b" has the data type "unknown" is only a
> > > warning - not an error.
> > > 
> > I get an error:
> > 
> > postgres=# SELECT '  '::text = 'a';
> >  ?column? 
> > ----------
> >  f
> > (1 row)
> > 
> > postgres=# SELECT a=b FROM (SELECT ''::text, '  ') x(a,b);
> > ERROR:  failed to find conversion function from unknown to text




Re: [BUGS] Failure to coerce unknown type to specific type

From
"David G. Johnston"
Date:
On Wednesday, April 22, 2015, Jeff Davis <pgsql@j-davis.com> wrote:
On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:

> But the fact that column "b" has the data type "unknown" is only a
> warning - not an error.
>
I get an error:

postgres=# SELECT '  '::text = 'a';
 ?column?
----------
 f
(1 row)

postgres=# SELECT a=b FROM (SELECT ''::text, '  ') x(a,b);
ERROR:  failed to find conversion function from unknown to text

So that means the column reference "b" is treated differently than the
literal. Here I don't mean a reference to an actual column of a real
table, just an identifier ("b") that parses as a columnref.

Creating the table gives you a warning (not an error), but I think that
was a poor example for me to choose, and not important to my point.

I get the point but the warning stems from converting from untyped to unknown.  Then you get the error looking for an implicit cast from unknown.  The error you had stated referred to the first situation, the conversion of untyped to unknown.

>
> This seems to be a case of the common problem (or, at least recently
> mentioned) where type conversion only deals with data and not context.
>
>
> http://www.postgresql.org/message-id/CADx9qBmVPQvSH3
> +2cH4cwwPmphW1mE18e=WUmLFUC-QZ-t7Q6Q@mail.gmail.com
>
>
I think that is a different problem. That's a runtime type conversion
error (execution time), and I'm talking about something happening at
parse analysis time.

Again, referring here to why your proposed error seems unlikely in face of similar errors not currently providing sufficient context either.  I don't know enough to posit why this is the case.


>
> but this too works - which is why the implicit cast concept above
> fails (I'm leaving it since the thought process may help in
> understanding):
>
>
> SELECT 1 = '1';
>
>
> From which I infer that an unknown literal is allowed to be fed
> directly into a type's input function to facilitate a direct coercion.

Yes, I believe that's what's happening. When we use an unknown literal,
it's acting more like a value constructor and will pass it to the type
input function. When it's a columnref, even if unknown, it tries to cast
it and fails.

But that is very confusing. In the example at the top of this email, it
seems like the second query should be equivalent to the first, or even
that postgres should be able to rewrite the second into the first. But
the second query fails where the first succeeds.

Agreed (sorta, I can understand your PoV) - but it is consistently confusing...and quite obvious when you've changed from one to the other.

Is there something concrete to dig teeth into here?



> At this point...backward compatibility?

Backwards compatibility of what queries? I guess the ones that return
unknowns to the client or create tables with unknown columns?

Yes, disallowing unknown and requiring everything to be untyped or error.


> create table a(u) as select '1';
>
>
> WARNING: "column "u" has type "unknown"​
> DETAIL:  Proceeding with relation creation anyway.
>
>
> Related question: was there ever a time when the above failed instead
> of just supplying a warning?

Not that I recall.



> ​My gut reaction is if you feel strongly enough to add some additional
> documentation or warnings/hints/details related to this topic they
> probably would get put in; but disallowing "unknown" as first-class
> type is likely to fail to pass a cost-benefit evaluation.

I'm not proposing that we eliminate unknown. I just think columnrefs and
literals should behave consistently. If we really don't want unknown
columnrefs, it seems like we could at least throw a better error.

We do allow unknown column refs.  We don't allow you to do much with them though - given the lack of casts, implicit and otherwise.  The error that result from that situation are where the complaint lies.  Since we cannot disallow unknown column refs the question is can the resultant errors be improved.  I really don't see value in expending effort solely trying to improve this limited situation.  If the same effort also improves a wider swath of the code base then great.

The only other option is to allow unknowns to be implicitly cast to text and then fed into the input type just like an untyped literal would.  But those are not the same thing - no matter how similar your two mock queries make them seem - and extrapolation from those two alone doesn't seem justified. And his is crux of where your similarity falls apart.  If you can justify the above behavior then maybe...
 

If we were starting from scratch, I'd also not return unknown to the
client, but we have to worry about the backwards compatibility.

> Distinguishing between "untyped" literals and "unknown type" literals
> seems promising concept to aid in understanding the difference in the
> face of not being able (or wanting) to actually change the behavior.

Not sure I understand that proposal, can you elaborate?

Purely documentation explaining and naming the two different behaviors you are seeing.

Reading and writing all this I'm convinced you have gotten the idea in your mind an expectation of equivalency and consistency where there really is little or none from an overall design perspective.  And none insofar as would merit trying to force the two example queries you provide to behave identically.  There are a number of things about SQL that one either simply lives with or goes through mind contortions to understand the, possibly imperfect, reasoning behind.  This is one of those things: and while it's been fun to do those contortions in the end I am only a little bit better off than when I simply accepted the fact the unknown and untyped were similar but different (even if I hadn't considered giving them different names).

Literals and column references are different.  If you put a literal into a column you lose the ability to then treat it as a literal.

David J.

Re: [BUGS] Failure to coerce unknown type to specific type

From
"David G. Johnston"
Date:
On Thu, Apr 23, 2015 at 1:49 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Wednesday, April 22, 2015, Jeff Davis <pgsql@j-davis.com> wrote:
On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:

> ​My gut reaction is if you feel strongly enough to add some additional
> documentation or warnings/hints/details related to this topic they
> probably would get put in; but disallowing "unknown" as first-class
> type is likely to fail to pass a cost-benefit evaluation.

I'm not proposing that we eliminate unknown. I just think columnrefs and
literals should behave consistently. If we really don't want unknown
columnrefs, it seems like we could at least throw a better error.

We do allow unknown column refs.  We don't allow you to do much with them though - given the lack of casts, implicit and otherwise.  The error that result from that situation are where the complaint lies.  Since we cannot disallow unknown column refs the question is can the resultant errors be improved.  I really don't see value in expending effort solely trying to improve this limited situation.  If the same effort also improves a wider swath of the code base then great.


​Slightly tangential but a pair of recent threads 


where I pondered about polymorphic functions got me thinking about the following function definition:

create function poly(inarg anyelement, testarg unknown) returns anyelement
AS $$
BEGIN
   SELECT inarg;
END;
$$
LANGUAGE plpgsql
;

Which indeed works...

Jim's "variant" type largely mirrors the behavior desired in this thread.

The lack of casting fails to unknown makes it an unsuitable alternative for "variant" though maybe if indeed we allow type coercion to work it would become viable.  But the same concerns apply as well.

David J.

Re: [BUGS] Failure to coerce unknown type to specific type

From
"David G. Johnston"
Date:
On Thu, Apr 23, 2015 at 1:35 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Very sorry for the trash..

===
Now I found a comment at just where I patched,

>  * XXX if the typinput function is not immutable, we really ought to
>  * postpone evaluation of the function call until runtime. But there
>  * is no way to represent a typinput function call as an expression
>  * tree, because C-string values are not Datums. (XXX This *is*
>  * possible as of 7.3, do we want to do it?)

- Is it OK to *now* we can do this?
+ Is it OK to regard that we can do this *now*?


In this patch or a different one?  Does this comment have anything to do with the concern of this thread?

David J.
 

Re: [BUGS] Failure to coerce unknown type to specific type

From
"David G. Johnston"
Date:
On Thu, Apr 23, 2015 at 1:07 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello, I think this is a bug.

The core of this problem is that coerce_type() fails for Var of
type UNKNOWNOID.

The comment for the function says that,

> * The caller should already have determined that the coercion is possible;
> * see can_coerce_type.

But can_coerce_type() should say it's possible to convert from
unknown to any type as it doesn't see the target node type. I
think this as an inconsistency between can_coerce_type and
coerce_type. So making this consistent would be right way.


​You have two pieces of contradictory knowledge - how are you picking which one to "fix"?​

Concerning only this issue, putting on-the-fly conversion for
unkown nonconstant as attached patch worked for me. I'm not so
confident on this, though..

​Confident about what aspect - the safety of the patch itself or whether the conversion is even a good idea?​

David J.​

Re: [BUGS] Failure to coerce unknown type to specific type

From
Kyotaro HORIGUCHI
Date:
Hello,

> > Very sorry for the trash..
> >
> > ===
> > Now I found a comment at just where I patched,
> >
> > >  * XXX if the typinput function is not immutable, we really ought to
> > >  * postpone evaluation of the function call until runtime. But there
> > >  * is no way to represent a typinput function call as an expression
> > >  * tree, because C-string values are not Datums. (XXX This *is*
> > >  * possible as of 7.3, do we want to do it?)
> >
> > - Is it OK to *now* we can do this?
> > + Is it OK to regard that we can do this *now*?
> >
> >
> In this patch or a different one?  Does this comment have anything to do
> with the concern of this thread?


The comment cieted above is in the PostgreSQL source file. The
reason why implicit cast (coercion) don't work for subquery's
results of unkown type, but works for constants is in the comment
cited above.

For "select ''::text = ' '", the internal intermediate
representation of the expression looks something like this,
Equal(text_const(''), unknown_const(' '))

and the second parameter can be immediately converted into
text_const(' ') during expression transformation in parse phase.

On the other hand, the problematic query is represented as
follows,
Equal(text_const(a), unknown_const(b))   for select ''::text as a, ' ' as b

But as described in the comment, PostgreSQL knows what to do but
it couldn't be implemented at the time of.. 7.3? But it seems to
be doable now.


By the way, the following query is similar to the problematic one
but doesn't fail.

SELECT a = b FROM (SELECT ''::text, ' ' UNION ALL SELECT '':text,
' ') AS x(a, b);

This is because parsing of UNION immediately converts constants
of unknown type in the UNION's both arms to text so the top level
select won't be bothered by this problem. But the problematic
query doesn't have appropriate timing to do that until the
function I patched.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [BUGS] Failure to coerce unknown type to specific type

From
Kyotaro HORIGUCHI
Date:
Hello,

> > Hello, I think this is a bug.
> >
> > The core of this problem is that coerce_type() fails for Var of
> > type UNKNOWNOID.
> >
> > The comment for the function says that,
> >
> > > * The caller should already have determined that the coercion is
> > possible;
> > > * see can_coerce_type.
> >
> > But can_coerce_type() should say it's possible to convert from
> > unknown to any type as it doesn't see the target node type. I
> > think this as an inconsistency between can_coerce_type and
> > coerce_type. So making this consistent would be right way.
> >
> >
> You have two pieces of contradictory knowledge - how are you picking which
> one to "fix"?

What do you think are contradicting? The patch fixes the problem
that unkown nonconstats cannot get proper conversion and the
query fails.

| SELECT a=b FROM (SELECT ''::text, '  ') x(a,b);
| ERROR:  failed to find conversion function from unknown to text

> > Concerning only this issue, putting on-the-fly conversion for
> > unkown nonconstant as attached patch worked for me. I'm not so
> > confident on this, though..
> >
> 
> Confident about what aspect - the safety of the patch itself or whether
> the conversion is even a good idea?​

Mainly the former, and how far it covers the other similar but
unkown troubles, generality? in other words.

The fix would not so significant but no processing cost or
complexity added, and would reduce astonishment if it works
safely.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [BUGS] Failure to coerce unknown type to specific type

From
Jeff Davis
Date:
On Thu, Apr 23, 2015 at 1:49 AM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> Reading and writing all this I'm convinced you have gotten the idea in your
> mind an expectation of equivalency and consistency where there really is
> little or none from an overall design perspective.  And none insofar as
> would merit trying to force the two example queries you provide to behave
> identically.  There are a number of things about SQL that one either simply
> lives with or goes through mind contortions to understand the, possibly
> imperfect, reasoning behind.  This is one of those things: and while it's
> been fun to do those contortions in the end I am only a little bit better
> off than when I simply accepted the fact the unknown and untyped were
> similar but different (even if I hadn't considered giving them different
> names).

You make some good points, but I disagree for two reasons:

1. This is a postgres issue, not a general SQL issue, so we can't
simply say "SQL is weird" (like we can with a lot of the strange NULL
semantics).
2. Postgres has determined that the unknown column reference "b" in
the query "SELECT a=b FROM (SELECT ''::text, '  ') x(a,b)" can and
should be coerced to text. So postgres has already made that decision.
It's just that when it tries to coerce it, it fails. Even if you
believe that literals and column references are not equivalent, I
don't see any justification at all for this behavior -- postgres
should not decide to coerce it in the first place if it's going to
fail.

Regards,   Jeff Davis



Re: [BUGS] Failure to coerce unknown type to specific type

From
"David G. Johnston"
Date:
On Thursday, April 23, 2015, Jeff Davis <pgsql@j-davis.com> wrote:
On Thu, Apr 23, 2015 at 1:49 AM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> Reading and writing all this I'm convinced you have gotten the idea in your
> mind an expectation of equivalency and consistency where there really is
> little or none from an overall design perspective.  And none insofar as
> would merit trying to force the two example queries you provide to behave
> identically.  There are a number of things about SQL that one either simply
> lives with or goes through mind contortions to understand the, possibly
> imperfect, reasoning behind.  This is one of those things: and while it's
> been fun to do those contortions in the end I am only a little bit better
> off than when I simply accepted the fact the unknown and untyped were
> similar but different (even if I hadn't considered giving them different
> names).

You make some good points, but I disagree for two reasons:

1. This is a postgres issue, not a general SQL issue, so we can't
simply say "SQL is weird" (like we can with a lot of the strange NULL
semantics).

But it is ok to admit that we are weird when we are.  Though yes, we are being inefficient here even with the current behavior taken as desired.

2. Postgres has determined that the unknown column reference "b" in
the query "SELECT a=b FROM (SELECT ''::text, '  ') x(a,b)" can and
should be coerced to text. 

So the error should be "operator does not exist: text = unknown"...instead it tries and fails to cast the unknown to text.

Or allow for the coercion to proceed.

There would be fewer obvious errors, and simple cases that fail would begin to work sensibly, but it still feels like implicit casting from text which was recently largely removed from the system for cause.  Albeit to the disgruntlement of some and accusations of being draconian by others.
 
So postgres has already made that decision.
It's just that when it tries to coerce it, it fails. Even if you
believe that literals and column references are not equivalent, I
don't see any justification at all for this behavior -- postgres
should not decide to coerce it in the first place if it's going to
fail.


This is true - but obviously one solution is to not attempt it in the first place.

David J.

Re: [BUGS] Failure to coerce unknown type to specific type

From
Jim Nasby
Date:
On 4/23/15 5:07 AM, Kyotaro HORIGUCHI wrote:
> This is because parsing of UNION immediately converts constants
> of unknown type in the UNION's both arms to text so the top level
> select won't be bothered by this problem. But the problematic
> query doesn't have appropriate timing to do that until the
> function I patched.

FWIW, I think that's more accidental than anything.

I'm no expert in our casting and type handling code but I spent a lot of 
time stuck in it while working on the variant type, and it seems very 
scattered. There's stuff in the actual casting code, there's some stuff 
in other parts of parse/plan, there's stuff in individual types (array 
and record at least).

Some stuff is handled by casting; some stuff is handled by mangling the 
parse tree.

Something else I noticed is we're not consistent with handling typmod 
either. I don't remember the exact example I found, but there's cases 
involving casting of constants where we ignore it (I don't think it was 
as simple as SELECT 1::int::variant(...), but it was something like that).

I don't know how much of this is just historical and how much is 
intentional, but it'd be nice if we could consolidate it more.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [BUGS] Failure to coerce unknown type to specific type

From
Kyotaro HORIGUCHI
Date:
Hello,

At Thu, 23 Apr 2015 14:49:29 -0500, Jim Nasby <Jim.Nasby@BlueTreble.com> wrote in <55394CC9.5050703@BlueTreble.com>
> On 4/23/15 5:07 AM, Kyotaro HORIGUCHI wrote:
> > This is because parsing of UNION immediately converts constants
> > of unknown type in the UNION's both arms to text so the top level
> > select won't be bothered by this problem. But the problematic
> > query doesn't have appropriate timing to do that until the
> > function I patched.
> 
> FWIW, I think that's more accidental than anything.

I guess so. It looks not intentional about this behavior at
all.

> I'm no expert in our casting and type handling code but I spent a lot
> of time stuck in it while working on the variant type, and it seems
> very scattered. There's stuff in the actual casting code, there's some
> stuff in other parts of parse/plan, there's stuff in individual types
> (array and record at least).
> 
> Some stuff is handled by casting; some stuff is handled by mangling
> the parse tree.

That's what makes me unconfident. But if coercion is always made
by coerce_type and coercion is properly considered at all places
needs it, and this coercion steps is appropriate, we will see
nothing bad. I hope.

> Something else I noticed is we're not consistent with handling typmod
> either. I don't remember the exact example I found, but there's cases
> involving casting of constants where we ignore it (I don't think it
> was as simple as SELECT 1::int::variant(...), but it was something
> like that).

Mmm.. It's a serious bug if explicit casts are ignored. If some
cast procedures does wrong, it should be fixed.

> I don't know how much of this is just historical and how much is
> intentional, but it'd be nice if we could consolidate it more.

Yeah, but it seems tough to do it throughly.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center