Thread: [HACKERS] analyzeCTE is too strict about typmods?

[HACKERS] analyzeCTE is too strict about typmods?

From
Tom Lane
Date:
I'm not sure why bug #7926 didn't get any love when filed,
but the issue came up again today:
https://www.postgresql.org/message-id/264036359.6712710.1501784552013@mail.yahoo.com
and it does seem like this is pretty curious behavior.
A minimal reproducer is

regression=# create table base (f1 numeric(7,3));
CREATE TABLE
regression=# with recursive foo as (
select f1 from base
zunion all
select f1+1 from foo
) select * from foo;
ERROR:  recursive query "foo" column 1 has type numeric(7,3) in non-recursive term but type numeric overall
LINE 2: select f1 from base              ^
HINT:  Cast the output of the non-recursive term to the correct type.

Now the thing about that is that the HINT's advice doesn't work:

regression=# with recursive foo as (
select f1::numeric from base
union all
select f1+1 from foo
) select * from foo;
ERROR:  recursive query "foo" column 1 has type numeric(7,3) in non-recursive term but type numeric overall
LINE 2: select f1::numeric from base              ^
HINT:  Cast the output of the non-recursive term to the correct type.

The reason for this is that parse_coerce.c treats casting a value that's
already of the required type to typmod -1 as a complete no-op (see first
check in coerce_type_typmod).  So the result is still just a Var for "f1".

We could imagine fixing this by insisting that a RelabelType with typmod
-1 should be plastered atop the expression in such cases.  But I'm worried
about the potential side-effects of that, and anyway I'm not convinced
that parse_coerce.c is wrong to be doing it this way: typmod -1 generally
means "unspecified typmod", so the bare Var seems like it ought to be
considered to satisfy the typmod spec.  Besdies, if you just do this:

select f1 from base
union all
select f1+1 from base;

it works, producing a UNION result deemed to have typmod -1, and there's
no extra decoration added to the Var in the first leaf SELECT.

In short, therefore, it's looking to me like analyzeCTE() is wrong here.
It should allow the case where the recursive result has typmod -1 while
the non-recursive output column has some more-specific typmod, so long
as they match on type OID.  That would correspond to what we do in
regular non-recursive UNION situations.
        regards, tom lane



Re: [HACKERS] analyzeCTE is too strict about typmods?

From
Tom Lane
Date:
I wrote:
> In short, therefore, it's looking to me like analyzeCTE() is wrong here.
> It should allow the case where the recursive result has typmod -1 while
> the non-recursive output column has some more-specific typmod, so long
> as they match on type OID.  That would correspond to what we do in
> regular non-recursive UNION situations.

Oh, scratch that.  I was thinking that we could simply relax the error
check, but that really doesn't work at all.  The problem is that by now,
we have probably already generated Vars referencing the outputs of the
recursive CTE, and those will have the more-specific typmod, which is
wrong in this scenario.  Usually that wouldn't matter too much, but
there are cases where it would matter.

We could imagine dealing with this by re-parse-analyzing the recursive
term using typmod -1 for the CTE output column, but I don't much want
to go there.  It wouldn't be cheap, and I'm not quite sure it's guaranteed
to converge anyway.

What's seeming like an attractive compromise is to change the HINT
to recommend manually coercing the recursive term, instead of the
non-recursive one.  Adjusting the error cursor to point to that side
might be a bit painful, but it's probably doable.

Thoughts?
        regards, tom lane



Re: [HACKERS] analyzeCTE is too strict about typmods?

From
Kyotaro HORIGUCHI
Date:
At Thu, 03 Aug 2017 19:29:40 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in <28993.1501802980@sss.pgh.pa.us>
> I wrote:
> > In short, therefore, it's looking to me like analyzeCTE() is wrong here.
> > It should allow the case where the recursive result has typmod -1 while
> > the non-recursive output column has some more-specific typmod, so long
> > as they match on type OID.  That would correspond to what we do in
> > regular non-recursive UNION situations.
> 
> Oh, scratch that.  I was thinking that we could simply relax the error
> check, but that really doesn't work at all.  The problem is that by now,
> we have probably already generated Vars referencing the outputs of the
> recursive CTE, and those will have the more-specific typmod, which is
> wrong in this scenario.  Usually that wouldn't matter too much, but
> there are cases where it would matter.
> 
> We could imagine dealing with this by re-parse-analyzing the recursive
> term using typmod -1 for the CTE output column, but I don't much want
> to go there.  It wouldn't be cheap, and I'm not quite sure it's guaranteed
> to converge anyway.

Agreed.

> What's seeming like an attractive compromise is to change the HINT
> to recommend manually coercing the recursive term, instead of the
> non-recursive one.  Adjusting the error cursor to point to that side
> might be a bit painful, but it's probably doable.
> 
> Thoughts?

I agree to the direction, but if I'm not missing anything
transformSetOperationTree has the enough information and we won't
get the expected pain there. (The movement itself might be the
pain, though..)

| ERROR:  recursive query "foo" column 1 has type numeric(7,3) in non-recursive term but type numeric overall
| LINE 4: select f1+1  from foo
|                ^
| HINT:  Cast the output of the recursive term to the type of the non-recursive term.

# The hint gets a bit longer..

By the way a wrong collation still results in the previous hint
but it won't be a problem.

| ERROR:  recursive query "foo" column 1 has collation "it_IT" in non-recursive term but collation "ja_JP" overall
| LINE 2: select a from bar
|                ^
| HINT:  Use the COLLATE clause to set the collation of the non-recursive term.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
*** a/src/backend/parser/analyze.c
--- b/src/backend/parser/analyze.c
***************
*** 42,47 ****
--- 42,49 ---- #include "parser/parse_target.h" #include "parser/parsetree.h" #include "rewrite/rewriteManip.h"
+ #include "utils/builtins.h"
+ #include "utils/lsyscache.h" #include "utils/rel.h"  
***************
*** 1796,1801 **** transformSetOperationTree(ParseState *pstate, SelectStmt *stmt,
--- 1798,1804 ----                           bool isTopLevel, List **targetlist) {     bool        isLeaf;
+     int            varattno;      Assert(stmt && IsA(stmt, SelectStmt)); 
***************
*** 1980,1985 **** transformSetOperationTree(ParseState *pstate, SelectStmt *stmt,
--- 1983,1989 ----         op->colTypmods = NIL;         op->colCollations = NIL;         op->groupClauses = NIL;
+         varattno = 0;         forboth(ltl, ltargetlist, rtl, rtargetlist)         {             TargetEntry *ltle =
(TargetEntry*) lfirst(ltl);
 
***************
*** 2008,2013 **** transformSetOperationTree(ParseState *pstate, SelectStmt *stmt,
--- 2012,2019 ----             else                 rescoltypmod = -1; 
+             varattno++;
+              /*              * Verify the coercions are actually possible.  If not, we'd fail              * later
anyway,but we want to fail now while we have sufficient
 
***************
*** 2110,2115 **** transformSetOperationTree(ParseState *pstate, SelectStmt *stmt,
--- 2116,2161 ----             }              /*
+              * Verify that the previously determined output column types and
+              * collations match what the query really produced.  We have to
+              * check this because the recursive term could have overridden the
+              * non-recursive term, and we don't have any easy way to fix that.
+              */
+             if (isTopLevel &&
+                 pstate->p_parent_cte &&
+                 pstate->p_parent_cte->cterecursive)
+             {
+                 Oid    lcolcoll = exprCollation((Node *)ltle->expr);
+ 
+                 /*
+                  * This might somewhat confusing but we suggest to fix
+                  * recursive term since non-recursive term may have the same
+                  * type without typemod.
+                  */
+                 if (rescoltype != lcoltype || rescoltypmod != lcoltypmod)
+                     ereport(ERROR,
+                             (errcode(ERRCODE_DATATYPE_MISMATCH),
+                              errmsg("recursive query \"%s\" column %d has type %s in non-recursive term but type %s
overall",
+                                     pstate->p_parent_cte->ctename, varattno,
+                                     format_type_with_typemod(lcoltype,
+                                                              lcoltypmod),
+                                     format_type_with_typemod(rescoltype,
+                                                              rescoltypmod)),
+                              errhint("Cast the output of the recursive term to the type of the non-recursive
term."),
+                              parser_errposition(pstate,
+                                                 exprLocation((Node *)rtle->expr))));
+                 if (rescolcoll != lcolcoll)
+                     ereport(ERROR,
+                             (errcode(ERRCODE_COLLATION_MISMATCH),
+                              errmsg("recursive query \"%s\" column %d has collation \"%s\" in non-recursive term but
collation\"%s\" overall",
 
+                                     pstate->p_parent_cte->ctename, varattno,
+                                     get_collation_name(lcolcoll),
+                                     get_collation_name(rescolcoll)),
+                          errhint("Use the COLLATE clause to set the collation of the non-recursive term."),
+                              parser_errposition(pstate, exprLocation((Node *)ltle->expr))));
+             }
+ 
+             /*              * Construct a dummy tlist entry to return.  We use a SetToDefault              * node for
theexpression, since it carries exactly the fields              * needed, but any other expression node type would do
aswell.
 
*** a/src/backend/parser/parse_cte.c
--- b/src/backend/parser/parse_cte.c
***************
*** 276,339 **** analyzeCTE(ParseState *pstate, CommonTableExpr *cte)         /* Compute the output column names/types
ifnot done yet */         analyzeCTETargetList(pstate, cte, GetCTETargetList(cte));     }
 
!     else
!     {
!         /*
!          * Verify that the previously determined output column types and
!          * collations match what the query really produced.  We have to check
!          * this because the recursive term could have overridden the
!          * non-recursive term, and we don't have any easy way to fix that.
!          */
!         ListCell   *lctlist,
!                    *lctyp,
!                    *lctypmod,
!                    *lccoll;
!         int            varattno;
! 
!         lctyp = list_head(cte->ctecoltypes);
!         lctypmod = list_head(cte->ctecoltypmods);
!         lccoll = list_head(cte->ctecolcollations);
!         varattno = 0;
!         foreach(lctlist, GetCTETargetList(cte))
!         {
!             TargetEntry *te = (TargetEntry *) lfirst(lctlist);
!             Node       *texpr;
! 
!             if (te->resjunk)
!                 continue;
!             varattno++;
!             Assert(varattno == te->resno);
!             if (lctyp == NULL || lctypmod == NULL || lccoll == NULL)    /* shouldn't happen */
!                 elog(ERROR, "wrong number of output columns in WITH");
!             texpr = (Node *) te->expr;
!             if (exprType(texpr) != lfirst_oid(lctyp) ||
!                 exprTypmod(texpr) != lfirst_int(lctypmod))
!                 ereport(ERROR,
!                         (errcode(ERRCODE_DATATYPE_MISMATCH),
!                          errmsg("recursive query \"%s\" column %d has type %s in non-recursive term but type %s
overall",
!                                 cte->ctename, varattno,
!                                 format_type_with_typemod(lfirst_oid(lctyp),
!                                                          lfirst_int(lctypmod)),
!                                 format_type_with_typemod(exprType(texpr),
!                                                          exprTypmod(texpr))),
!                          errhint("Cast the output of the non-recursive term to the correct type."),
!                          parser_errposition(pstate, exprLocation(texpr))));
!             if (exprCollation(texpr) != lfirst_oid(lccoll))
!                 ereport(ERROR,
!                         (errcode(ERRCODE_COLLATION_MISMATCH),
!                          errmsg("recursive query \"%s\" column %d has collation \"%s\" in non-recursive term but
collation\"%s\" overall",
 
!                                 cte->ctename, varattno,
!                                 get_collation_name(lfirst_oid(lccoll)),
!                                 get_collation_name(exprCollation(texpr))),
!                          errhint("Use the COLLATE clause to set the collation of the non-recursive term."),
!                          parser_errposition(pstate, exprLocation(texpr))));
!             lctyp = lnext(lctyp);
!             lctypmod = lnext(lctypmod);
!             lccoll = lnext(lccoll);
!         }
!         if (lctyp != NULL || lctypmod != NULL || lccoll != NULL)    /* shouldn't happen */
!             elog(ERROR, "wrong number of output columns in WITH");
!     } }  /*
--- 276,286 ----         /* Compute the output column names/types if not done yet */
analyzeCTETargetList(pstate,cte, GetCTETargetList(cte));     }
 
!     /*
!      * For the recursive cte case, the previously determined output column
!      * types and collations are known to match what the query really
!      * produced. See transformSetOperationStmtTree.
!      */ }  /*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers