Thread: Re: BUG #15237: I got "ERROR: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression"
Re: BUG #15237: I got "ERROR: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression"
From
Andrew Gierth
Date:
[Pruning the CC list and moving this to -hackers] >>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> It's telling you what to do: use a ROW() expression, ie Tom> update fvt_obj_operate_update_table_033 set (c_int) = row(20) Tom> where c_int = 20; >> Yeah, but (a) this used to work, and has worked since at least as >> far back as 9.0, and (b) the spec requires it to work. Tom> As far as (a) goes, it was an intentional compatibility breakage, So here I have a problem: the fact that it was a compatibility breakage seems not to have been discussed *or even so much as mentioned* on -hackers at any time. The original patch, a self-described "lazy-sunday project", made absolutely no mention of any compatibility issue, simply discussed the new options it was allowing, and got no feedback or review. Had it mentioned or even hinted that existing queries might be broken, I would hope that there would have been more discussion at the time. It then apparently went unnoticed until after the release of pg 10, at which point it got retroactively documented (in the release notes and nowhere else), in response to a brief discussion of a user complaint that happened on -general and not -hackers (or even -bugs). Tom> cf commits 86182b189 and 906bfcad7, I encourage everyone to refer to these and decide whether this is sufficient grounds or sufficient discussion to introduce a breaking change, even a minor one. Tom> and as for (b), the SQL committee is just nuts here. We all know that the spec is a dog's breakfast, yes. But breaking backwards compatibility AND the spec, even if it's in order to support other parts of the spec, needs to be something that's actually discussed and the tradeoff assessed rather than being snuck through not only without discussion but also without even asking "is there any sane way we can avoid breaking this". Tom> The expectation in 906bfcad7 was that we'd move towards the Tom> *other* thing the spec demands, namely that the RHS can be any Tom> a_expr yielding a suitable row value. The spec doesn't have anything that corresponds to our a_expr, obviously. Tom> We can't do that if it's unclear what is or isn't a ROW() Tom> expression, because then the intended semantics would be Tom> undecidable. As far as I can tell, the spec distinguishes a lot of these cases only by means of the type of the value. For example: <contextually typed row value expression> can be a <row value special case> which is a <nonparenthesized value expression primary> which must have a row type. (Note that for example (select row(1,2)) is a nonparenthesized value expression primary even though it has parens around it.) Or <contextually typed row value constructor> can be a <common value expression> which can be one of several <$type value expression> productions all of which contain <nonparenthesized value expression primary> as one of the possible expansions. So for example (select 1) ends up being treated as row((select 1)). Tom> And I don't think we want a situation in which adding "extra" Tom> parens changes a legal command into an illegal one. To some extent I agree, but the spec does rely on this in places: Tom> For example, suppose f(...) returns a single-column tuple result. Tom> This should be legal, if x matches the type of the single column: Tom> update ... set (x) = f(...) Tom> but if we try to do what you seem to have in mind, this would not Tom> be: Tom> update ... set (x) = (f(...)) The spec indeed says that this is not valid, since it would wrap an additional ROW() around the result, and would try and assign the row value to x. Tom> That's sufficiently brain-dead that I don't think we want to go Tom> there. Maybe so, but not without discussion. -- Andrew (irc:RhodiumToad)
Re: BUG #15237: I got "ERROR: source for a multiple-column UPDATEitem must be a sub-SELECT or ROW() expression"
From
Robert Haas
Date:
On Wed, Jun 13, 2018 at 12:48 AM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: > The original patch, a self-described "lazy-sunday project", made > absolutely no mention of any compatibility issue, simply discussed the > new options it was allowing, and got no feedback or review. Had it > mentioned or even hinted that existing queries might be broken, I would > hope that there would have been more discussion at the time. I think that is a fair criticism. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: BUG #15237: I got "ERROR: source for a multiple-column UPDATEitem must be a sub-SELECT or ROW() expression"
From
Justin Pryzby
Date:
On Wed, Jun 13, 2018 at 05:48:50AM +0100, Andrew Gierth wrote: > It then apparently went unnoticed until after the release of pg 10, at > which point it got retroactively documented (in the release notes and > nowhere else), in response to a brief discussion of a user complaint > that happened on -general and not -hackers (or even -bugs). Actually I noticed and pointed out during beta; https://www.postgresql.org/message-id/flat/20170719174507.GA19616%40telsasoft.com#20170719174507.GA19616@telsasoft.com Justin
Re: BUG #15237: I got "ERROR: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression"
From
Andrew Gierth
Date:
>>>>> "Justin" == Justin Pryzby <pryzby@telsasoft.com> writes: >> It then apparently went unnoticed until after the release of pg 10, >> at which point it got retroactively documented (in the release notes >> and nowhere else), in response to a brief discussion of a user >> complaint that happened on -general and not -hackers (or even >> -bugs). Justin> Actually I noticed and pointed out during beta; Justin> https://www.postgresql.org/message-id/flat/20170719174507.GA19616%40telsasoft.com#20170719174507.GA19616@telsasoft.com Interesting, I missed that when searching. So that tells us: 1. Breaking SET (col) = (val) wasn't actually intended or noticed in the original patch; 2. The break was then defended based on an incorrect reading of the spec (as I pointed out already on -bugs, the syntax _is_ legal in the spec with only one column, for just the same reasons that VALUES (1) isn't required to be VALUES row(1); the spec adds the ROW( ) construct itself in one of the syntax rules, in a way that's easy to overlook) -- Andrew (irc:RhodiumToad)
Re: BUG #15237: I got "ERROR: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression"
From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Tom> As far as (a) goes, it was an intentional compatibility breakage, > So here I have a problem: the fact that it was a compatibility breakage > seems not to have been discussed *or even so much as mentioned* on > -hackers at any time. I will freely admit that at the time I did the original patch, I did not think that the change for the single-column case was of interest to anyone. Who'd write an extra four parentheses that they didn't need? But it *was* discussed later, including on -hackers: https://www.postgresql.org/message-id/flat/20170719174507.GA19616%40telsasoft.com#20170719174507.GA19616@telsasoft.com https://www.postgresql.org/message-id/flat/CAMjNa7cDLzPcs0xnRpkvqmJ6Vb6G3EH8CYGp9ZBjXdpFfTz6dg%40mail.gmail.com > Tom> For example, suppose f(...) returns a single-column tuple result. > Tom> This should be legal, if x matches the type of the single column: > Tom> update ... set (x) = f(...) > Tom> but if we try to do what you seem to have in mind, this would not > Tom> be: > Tom> update ... set (x) = (f(...)) > The spec indeed says that this is not valid, since it would wrap an > additional ROW() around the result, and would try and assign the row > value to x. I think that arguing that the spec requires that is fairly shaky, and implementing it would be even shakier; for example, we'd have to start worrying about whether ruleutils.c's habit of throwing in extra parens when in doubt could ever result in unexpected change to the meaning. I'd also point out that with the rule that the extra parens implicitly mean "ROW()", it's fairly unclear what should happen with update ... set (x) = ((select ...)) If somebody were to hold a gun to my head and say "you must make this work", I'd probably do something like the attached, which abuses the operator_precedence_warning infrastructure to detect whether there were extra parens. One would have to argue about exactly what it should do with parens around ROW() or (SELECT ...), but I chose to ignore them. (I think that in the SELECT case, the grammar would actually absorb the extra parens elsewhere, so that we couldn't do anything differently anyway without further kluges.) But I repeat that this is a bad idea and we'd regret it in the long run; treating extra parens as meaning ROW() in some cases is just a fundamentally unsound idea from both syntactic and semantic standpoints. Given the extremely small number of complaints since v10 came out, I think we should just stay with what we have. regards, tom lane PS: I noticed while messing with this that there's a separate problem: right now, turning on operator_precedence_warning causes unexpected failures with extra parens around the RHS. The code added below to make transformMultiAssignRef look through AEXPR_PAREN nodes before deciding anything is needed anyway to fix that. diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 90dfac2..64bd254 100644 *** a/src/backend/parser/gram.y --- b/src/backend/parser/gram.y *************** static Node *makeRecursiveViewSelect(cha *** 470,476 **** %type <node> def_arg columnElem where_clause where_or_current_clause a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound columnref in_expr having_clause func_table xmltable array_expr ! ExclusionWhereClause operator_def_arg %type <list> rowsfrom_item rowsfrom_list opt_col_def_list %type <boolean> opt_ordinality %type <list> ExclusionConstraintList ExclusionConstraintElem --- 470,476 ---- %type <node> def_arg columnElem where_clause where_or_current_clause a_expr b_expr c_expr AexprConst indirection_el opt_slice_bound columnref in_expr having_clause func_table xmltable array_expr ! ct_row_expr ExclusionWhereClause operator_def_arg %type <list> rowsfrom_item rowsfrom_list opt_col_def_list %type <boolean> opt_ordinality %type <list> ExclusionConstraintList ExclusionConstraintElem *************** set_clause: *** 11070,11076 **** $1->val = (Node *) $3; $$ = list_make1($1); } ! | '(' set_target_list ')' '=' a_expr { int ncolumns = list_length($2); int i = 1; --- 11070,11076 ---- $1->val = (Node *) $3; $$ = list_make1($1); } ! | '(' set_target_list ')' '=' ct_row_expr { int ncolumns = list_length($2); int i = 1; *************** c_expr: columnref { $$ = $1; } *** 13451,13457 **** n->indirection = check_indirection($4, yyscanner); $$ = (Node *)n; } ! else if (operator_precedence_warning) { /* * If precedence warnings are enabled, insert --- 13451,13458 ---- n->indirection = check_indirection($4, yyscanner); $$ = (Node *)n; } ! else if (operator_precedence_warning || ! pg_yyget_extra(yyscanner)->in_ctre) { /* * If precedence warnings are enabled, insert *************** c_expr: columnref { $$ = $1; } *** 13469,13474 **** --- 13470,13479 ---- * suppress warnings, it's probably safe enough; and * we'd just as soon not waste cycles on dummy parse * nodes if we don't have to. + * + * We also insert AEXPR_PAREN nodes if we're inside a + * ct_row_expr, so that parse analysis can distinguish + * "(x)" from "x" later. */ $$ = (Node *) makeA_Expr(AEXPR_PAREN, NIL, $2, NULL, exprLocation($2)); *************** opt_asymmetric: ASYMMETRIC *** 14599,14604 **** --- 14604,14623 ---- | /*EMPTY*/ ; + /* + * Deal with the spec's poorly-thought-through "contextually typed row value + * expression" syntax. For that, we need to detect whether the expression has + * outer parentheses. We can't define it as "a_expr | (a_expr)" without + * causing reduce/reduce conflicts, so instead force AEXPR_PAREN nodes to be + * inserted for parens throughout the subexpression, and we'll sort it out + * during parse analysis. + */ + ct_row_expr: + { pg_yyget_extra(yyscanner)->in_ctre++; } + a_expr + { pg_yyget_extra(yyscanner)->in_ctre--; $$ = $2; } + ; + /***************************************************************************** * *************** void *** 16328,16331 **** --- 16347,16351 ---- parser_init(base_yy_extra_type *yyext) { yyext->parsetree = NIL; /* in case grammar forgets to set it */ + yyext->in_ctre = 0; /* not handling ct_row_expr */ } diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c index 385e54a..0d09d7b 100644 *** a/src/backend/parser/parse_expr.c --- b/src/backend/parser/parse_expr.c *************** transformMultiAssignRef(ParseState *psta *** 1499,1515 **** /* We only need to transform the source if this is the first column */ if (maref->colno == 1) { /* * For now, we only allow EXPR SubLinks and RowExprs as the source of * an UPDATE multiassignment. This is sufficient to cover interesting * cases; at worst, someone would have to write (SELECT * FROM expr) * to expand a composite-returning expression of another form. */ ! if (IsA(maref->source, SubLink) && ! ((SubLink *) maref->source)->subLinkType == EXPR_SUBLINK) { /* Relabel it as a MULTIEXPR_SUBLINK */ ! sublink = (SubLink *) maref->source; sublink->subLinkType = MULTIEXPR_SUBLINK; /* And transform it */ sublink = (SubLink *) transformExprRecurse(pstate, --- 1499,1548 ---- /* We only need to transform the source if this is the first column */ if (maref->colno == 1) { + Node *source = maref->source; + bool have_parens = false; + + /* Drill through any AEXPR_PAREN nodes */ + while (source && IsA(source, A_Expr) && + ((A_Expr *) source)->kind == AEXPR_PAREN) + { + have_parens = true; + source = ((A_Expr *) source)->lexpr; + } + + /* + * If we have a single-column assignment, and the RHS had parens but + * isn't a SubLink or RowExpr, pretend it's a single-column RowExpr. + * This is quite broken, as it will cause odd inconsistencies whenever + * we get around to allowing other syntactic cases for the RHS, but + * some people insisted. + */ + if (have_parens && maref->ncolumns == 1 && + !(IsA(source, SubLink) && + ((SubLink *) source)->subLinkType == EXPR_SUBLINK) && + !IsA(source, RowExpr)) + { + RowExpr *r = makeNode(RowExpr); + + r->args = list_make1(source); + r->row_typeid = InvalidOid; + r->colnames = NIL; + r->row_format = COERCE_IMPLICIT_CAST; /* abuse */ + r->location = -1; + source = (Node *) r; + } + /* * For now, we only allow EXPR SubLinks and RowExprs as the source of * an UPDATE multiassignment. This is sufficient to cover interesting * cases; at worst, someone would have to write (SELECT * FROM expr) * to expand a composite-returning expression of another form. */ ! if (IsA(source, SubLink) && ! ((SubLink *) source)->subLinkType == EXPR_SUBLINK) { /* Relabel it as a MULTIEXPR_SUBLINK */ ! sublink = (SubLink *) source; sublink->subLinkType = MULTIEXPR_SUBLINK; /* And transform it */ sublink = (SubLink *) transformExprRecurse(pstate, *************** transformMultiAssignRef(ParseState *psta *** 1542,1552 **** */ sublink->subLinkId = list_length(pstate->p_multiassign_exprs); } ! else if (IsA(maref->source, RowExpr)) { /* Transform the RowExpr, allowing SetToDefault items */ rexpr = (RowExpr *) transformRowExpr(pstate, ! (RowExpr *) maref->source, true); /* Check it returns required number of columns */ --- 1575,1585 ---- */ sublink->subLinkId = list_length(pstate->p_multiassign_exprs); } ! else if (IsA(source, RowExpr)) { /* Transform the RowExpr, allowing SetToDefault items */ rexpr = (RowExpr *) transformRowExpr(pstate, ! (RowExpr *) source, true); /* Check it returns required number of columns */ *************** transformMultiAssignRef(ParseState *psta *** 1568,1574 **** ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression"), ! parser_errposition(pstate, exprLocation(maref->source)))); } else { --- 1601,1607 ---- ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression"), ! parser_errposition(pstate, exprLocation(source)))); } else { diff --git a/src/include/parser/gramparse.h b/src/include/parser/gramparse.h index 42e7ede..6e51081 100644 *** a/src/include/parser/gramparse.h --- b/src/include/parser/gramparse.h *************** typedef struct base_yy_extra_type *** 53,58 **** --- 53,59 ---- * State variables that belong to the grammar. */ List *parsetree; /* final parse result is delivered here */ + int in_ctre; /* handling contextually-typed row expr? */ } base_yy_extra_type; /*
Re: BUG #15237: I got "ERROR: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression"
From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> For example, suppose f(...) returns a single-column tuple result. Tom> This should be legal, if x matches the type of the single column: Tom> update ... set (x) = f(...) Tom> but if we try to do what you seem to have in mind, this would not Tom> be: Tom> update ... set (x) = (f(...)) >> The spec indeed says that this is not valid, since it would wrap an >> additional ROW() around the result, and would try and assign the row >> value to x. Tom> I think that arguing that the spec requires that is fairly shaky, I'll be more precise: the spec does not define any syntax that would match UPDATE ... SET (x) = (f(...)) where f() is a function returning a row type of degree 1 (or any other degree for that matter). Specifically, the expansions of <contextually typed row value constructor> don't allow a row-valued function to appear alone that way, while the alternative, <row value special case>, requires a <nonparenthesized value expression primary>. The spec accordingly doesn't require that the construct be rejected, it could be accepted as a (nonstandard) extension. Tom> I'd also point out that with the rule that the extra parens Tom> implicitly mean "ROW()", it's fairly unclear what should happen Tom> with Tom> update ... set (x) = ((select ...)) The spec doesn't say that parens (even extra parens) implicitly mean ROW(), what it does say is that some branches of the syntax implicitly add a ROW() and some do not. Specifically, if the thing to the right of the = is a <common value expression>, <boolean value expression>, or NULL/DEFAULT without parens, then ROW() is added. ((select scalarcol from ...)) is a <common value expression>, ((select ROW(scalarcol) from ...)) is, however, not. Tom> But I repeat that this is a bad idea and we'd regret it in the Tom> long run; treating extra parens as meaning ROW() in some cases is Tom> just a fundamentally unsound idea from both syntactic and semantic Tom> standpoints. But that's not actually the requirement. Parens are significant to the spec, but that doesn't mean they have to be significant to us in the same way as long as we end up accepting all the spec's cases (or at least the useful ones). Here is a real problem case (where "rowcol" is a column with a row type and "rowval" has that same type): UPDATE ... SET (rowcol) = (rowval) There literally isn't any way to write the above in the spec except as UPDATE ... SET (rowcol) = ROW(rowval) or UPDATE ... SET (rowcol) = (select ROW(rowval) from ...) If we don't try and handle SET (rowcol) = (rowval), then I think we can make all the spec's cases work by this rule: if the thing on the RHS of the assignment is not a row type, then treat it as a row type of degree 1. That works without needing special rules for parens, and while it accepts a lot of possibilities that the spec rejects, I don't see any problems there. Tom> Given the extremely small number of complaints since v10 came out, Tom> I think we should just stay with what we have. I tried looking at what combinations were supported by other major databases: UPDATE ... SET (cols...) = (exprs...) Supported by DB2 and SQLite, both of which allow 1 or more expressions. Not supported by MySQL, MSSQL, Oracle. UPDATE ... SET (cols...) = (select exprs...) Supported by DB2, Oracle, SQLite, for 1 or more columns. Not supported by MySQL, MSSQL. (Note that this syntax is apparently not valid in the spec except for the case of a single non-rowtype column!) UPDATE ... SET (cols...) = (DEFAULT,...) Supported by DB2, for 1 or more columns. Not supported by MySQL, MSSQL, Oracle, SQLite. UPDATE ... SET (cols...) = ROW(...) UPDATE ... SET (cols...) = nonparenthesized_row_expression UPDATE ... SET (cols...) = (select ROW(...) from ...) Not supported by anything that I could find. (HSQLDB claims to support the full standard syntax, but I don't believe it yet.) -- Andrew (irc:RhodiumToad)