Re: Support UPDATE table SET(*)=... - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Support UPDATE table SET(*)=... |
Date | |
Msg-id | 24065.1428425624@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Support UPDATE table SET(*)=... (Atri Sharma <atri.jiit@gmail.com>) |
Responses |
Re: Support UPDATE table SET(*)=...
Re: Support UPDATE table SET(*)=... Re: Support UPDATE table SET(*)=... |
List | pgsql-hackers |
Atri Sharma <atri.jiit@gmail.com> writes: > Please find attached latest version of UPDATE (*) patch.The patch > implements review comments and Tom's gripe about touching > transformTargetList is addressed now. I have added regression tests and > simplified parser representation a bit. I spent a fair amount of time cleaning this patch up to get it into committable shape, but as I was working on the documentation I started to lose enthusiasm for it, because I was having a hard time coming up with compelling examples. The originally proposed motivation was >> It solves the problem of doing UPDATE from a record variable of the same >> type as the table e.g. update foo set (*) = (select foorec.*) where ...; but it feels to me that this is not actually a very good solution to that problem --- even granting that the problem needs to be solved, a claim that still requires some justification IMO. Here is a possible use-case: regression=# create table src (f1 int, f2 text, f3 real); CREATE TABLE regression=# create table dst (f1 int, f2 text, f3 real); CREATE TABLE regression=# create rule r1 as on update to src do instead regression-# update dst set (*) = (new.*) where dst.f1 = old.f1; ERROR: number of columns does not match number of values LINE 2: update dst set (*) = (new.*) where dst.f1 = old.f1; ^ So that's annoying. You can work around it with this very unobvious (and undocumented) syntax hack: regression=# create rule r1 as on update to src do instead regression-# update dst set (*) = (select new.*) where dst.f1 = old.f1; CREATE RULE But what happens if dst has no matching row? Your data goes into the bit bucket, that's what. What you really wanted here was some kind of UPSERT. There's certainly a possible use-case for a notation like this in UPSERT, when we get around to implementing that; but I'm less than convinced that we need it in plain UPDATE. What's much worse though is that the rule that actually gets stored is: regression=# \d+ src Table "public.src" Column | Type | Modifiers | Storage | Stats target | Description --------+---------+-----------+----------+--------------+------------- f1 | integer | | plain | | f2 | text | | extended | | f3 | real | | plain | | Rules: r1 AS ON UPDATE TO src DO INSTEAD UPDATE dst SET (f1, f2, f3) = ( SELECT new.f1, new.f2, new.f3) WHERE dst.f1 = old.f1 That is, you don't actually have any protection at all against future additions of columns, which seems like it would be most of the point of using a notation like this. We could possibly address that by postponing expansion of the *-notation to rule rewrite time, but that seems like it'd be a complete disaster in terms of modularity, or even usability (since column-mismatch errors wouldn't get raised till then either). And it'd certainly be a far more invasive and complex patch than this. So I'm feeling that this may not be a good idea, or at least not a good implementation of the idea. I'm inclined to reject the patch rather than lock us into something that is not standard and doesn't really do what people would be likely to want. Attached is the updated patch that I had before arriving at this discouraging conclusion. regards, tom lane diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml index 35b0699..adb4473 100644 *** a/doc/src/sgml/ref/update.sgml --- b/doc/src/sgml/ref/update.sgml *************** PostgreSQL documentation *** 25,31 **** UPDATE [ ONLY ] <replaceable class="PARAMETER">table_name</replaceable> [ * ] [ [ AS ] <replaceable class="parameter">alias</replaceable>] SET { <replaceable class="PARAMETER">column_name</replaceable> = { <replaceable class="PARAMETER">expression</replaceable>| DEFAULT } | ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( { <replaceable class="PARAMETER">expression</replaceable>| DEFAULT } [, ...] ) | ! ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( <replaceable class="PARAMETER">sub-SELECT</replaceable>) } [, ...] [ FROM <replaceable class="PARAMETER">from_list</replaceable> ] [ WHERE <replaceable class="PARAMETER">condition</replaceable> | WHERE CURRENT OF <replaceable class="PARAMETER">cursor_name</replaceable>] --- 25,33 ---- UPDATE [ ONLY ] <replaceable class="PARAMETER">table_name</replaceable> [ * ] [ [ AS ] <replaceable class="parameter">alias</replaceable>] SET { <replaceable class="PARAMETER">column_name</replaceable> = { <replaceable class="PARAMETER">expression</replaceable>| DEFAULT } | ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( { <replaceable class="PARAMETER">expression</replaceable>| DEFAULT } [, ...] ) | ! ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( <replaceable class="PARAMETER">sub-SELECT</replaceable>) | ! (*) = ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] ) | ! (*) = ( <replaceable class="PARAMETER">sub-SELECT</replaceable> ) } [, ...] [ FROM <replaceable class="PARAMETER">from_list</replaceable> ] [ WHERE <replaceable class="PARAMETER">condition</replaceable> | WHERE CURRENT OF <replaceable class="PARAMETER">cursor_name</replaceable>] *************** UPDATE [ ONLY ] <replaceable class="PARA *** 164,169 **** --- 166,181 ---- </varlistentry> <varlistentry> + <term><literal>(*)</literal></term> + <listitem> + <para> + In the <literal>SET</> clause, <literal>(*)</literal> stands for a + parenthesized column list of all the table's columns, in order. + </para> + </listitem> + </varlistentry> + + <varlistentry> <term><replaceable class="PARAMETER">from_list</replaceable></term> <listitem> <para> *************** UPDATE summary s SET (sum_x, sum_y, avg_ *** 374,379 **** --- 386,398 ---- </para> <para> + Update complete rows from another table having the same set of columns: + <programlisting> + UPDATE dest_table d SET (*) = (SELECT * FROM src_table s WHERE s.id = d.id); + </programlisting> + </para> + + <para> Attempt to insert a new stock item along with the quantity of stock. If the item already exists, instead update the stock count of the existing item. To do this without failing the entire transaction, use savepoints: *************** UPDATE films SET kind = 'Dramatic' WHERE *** 407,413 **** This command conforms to the <acronym>SQL</acronym> standard, except that the <literal>FROM</literal> and <literal>RETURNING</> clauses are <productname>PostgreSQL</productname> extensions, as is the ability ! to use <literal>WITH</> with <command>UPDATE</>. </para> <para> --- 426,434 ---- This command conforms to the <acronym>SQL</acronym> standard, except that the <literal>FROM</literal> and <literal>RETURNING</> clauses are <productname>PostgreSQL</productname> extensions, as is the ability ! to use <literal>WITH</> with <command>UPDATE</>. Also, the <literal>(*)</> ! syntax to specify all columns as a multiple-assignment target is a ! <productname>PostgreSQL</productname> extension. </para> <para> *************** UPDATE films SET kind = 'Dramatic' WHERE *** 419,427 **** </para> <para> ! According to the standard, the source value for a parenthesized sub-list of ! column names can be any row-valued expression yielding the correct number ! of columns. <productname>PostgreSQL</productname> only allows the source value to be a parenthesized list of expressions (a row constructor) or a sub-<literal>SELECT</>. An individual column's updated value can be specified as <literal>DEFAULT</> in the row-constructor case, but not --- 440,449 ---- </para> <para> ! According to the standard, the source value for a multiple-assignment ! target (that is, a parenthesized sub-list of column names) can be any ! row-valued expression yielding the correct number of ! columns. <productname>PostgreSQL</productname> only allows the source value to be a parenthesized list of expressions (a row constructor) or a sub-<literal>SELECT</>. An individual column's updated value can be specified as <literal>DEFAULT</> in the row-constructor case, but not diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index 4a5a520..901bd14 100644 *** a/src/backend/parser/analyze.c --- b/src/backend/parser/analyze.c *************** transformUpdateStmt(ParseState *pstate, *** 1899,1906 **** --- 1899,1908 ---- { Query *qry = makeNode(Query); ParseNamespaceItem *nsitem; + Relation target_rel; RangeTblEntry *target_rte; Node *qual; + List *targetList; ListCell *origTargetList; ListCell *tl; *************** transformUpdateStmt(ParseState *pstate, *** 1919,1924 **** --- 1921,1927 ---- interpretInhOption(stmt->relation->inhOpt), true, ACL_UPDATE); + target_rel = pstate->p_target_relation; /* grab the namespace item made by setTargetTable */ nsitem = (ParseNamespaceItem *) llast(pstate->p_namespace); *************** transformUpdateStmt(ParseState *pstate, *** 1937,1943 **** nsitem->p_lateral_only = false; nsitem->p_lateral_ok = true; ! qry->targetList = transformTargetList(pstate, stmt->targetList, EXPR_KIND_UPDATE_SOURCE); qual = transformWhereClause(pstate, stmt->whereClause, --- 1940,2025 ---- nsitem->p_lateral_only = false; nsitem->p_lateral_ok = true; ! /* ! * Before starting the main processing of the target list, we must expand ! * any "(*)" multicolumn target clauses, replacing them with the actual ! * column names of the result relation. Since most UPDATEs won't use that ! * feature, make a quick check to see if it's used and avoid expending the ! * cycles to rebuild the target list if not. ! */ ! targetList = stmt->targetList; ! ! foreach(tl, targetList) ! { ! ResTarget *res = (ResTarget *) lfirst(tl); ! ! if (res->name == NULL) ! break; ! } ! if (tl) ! { ! /* Yes, we have "(*)" ... expand the targetlist */ ! List *newTargetList = NIL; ! ! foreach(tl, targetList) ! { ! ResTarget *res = (ResTarget *) lfirst(tl); ! int i, ! j, ! ncolumns; ! ! /* If not "(*)", just keep it in the list */ ! if (res->name != NULL) ! { ! newTargetList = lappend(newTargetList, res); ! continue; ! } ! ! /* ! * Expand "(*)" into actual column names. First we must count the ! * number of non-dropped target columns. ! */ ! ncolumns = 0; ! for (i = 0; i < target_rel->rd_rel->relnatts; i++) ! { ! Form_pg_attribute att = target_rel->rd_att->attrs[i]; ! ! if (att->attisdropped) ! continue; ! ! ncolumns++; ! } ! j = 0; ! for (i = 0; i < target_rel->rd_rel->relnatts; i++) ! { ! Form_pg_attribute att = target_rel->rd_att->attrs[i]; ! MultiAssignRef *r; ! ResTarget *newres; ! ! if (att->attisdropped) ! continue; ! ! /* Make a MultiAssignRef for each column */ ! r = makeNode(MultiAssignRef); ! r->source = res->val; ! r->colno = ++j; ! r->ncolumns = ncolumns; ! ! /* ... and a ResTarget */ ! newres = makeNode(ResTarget); ! newres->name = pstrdup(NameStr(att->attname)); ! newres->indirection = NIL; ! newres->val = (Node *) r; ! newres->location = res->location; ! ! newTargetList = lappend(newTargetList, newres); ! } ! } ! targetList = newTargetList; ! } ! ! /* Now we can run the main processing of the target list */ ! qry->targetList = transformTargetList(pstate, targetList, EXPR_KIND_UPDATE_SOURCE); qual = transformWhereClause(pstate, stmt->whereClause, *************** transformUpdateStmt(ParseState *pstate, *** 1956,1967 **** */ /* Prepare to assign non-conflicting resnos to resjunk attributes */ ! if (pstate->p_next_resno <= pstate->p_target_relation->rd_rel->relnatts) ! pstate->p_next_resno = pstate->p_target_relation->rd_rel->relnatts + 1; /* Prepare non-junk columns for assignment to target table */ target_rte = pstate->p_target_rangetblentry; ! origTargetList = list_head(stmt->targetList); foreach(tl, qry->targetList) { --- 2038,2049 ---- */ /* Prepare to assign non-conflicting resnos to resjunk attributes */ ! if (pstate->p_next_resno <= target_rel->rd_rel->relnatts) ! pstate->p_next_resno = target_rel->rd_rel->relnatts + 1; /* Prepare non-junk columns for assignment to target table */ target_rte = pstate->p_target_rangetblentry; ! origTargetList = list_head(targetList); foreach(tl, qry->targetList) { *************** transformUpdateStmt(ParseState *pstate, *** 1986,1999 **** origTarget = (ResTarget *) lfirst(origTargetList); Assert(IsA(origTarget, ResTarget)); ! attrno = attnameAttNum(pstate->p_target_relation, ! origTarget->name, true); if (attrno == InvalidAttrNumber) ereport(ERROR, (errcode(ERRCODE_UNDEFINED_COLUMN), errmsg("column \"%s\" of relation \"%s\" does not exist", origTarget->name, ! RelationGetRelationName(pstate->p_target_relation)), parser_errposition(pstate, origTarget->location))); updateTargetListEntry(pstate, tle, origTarget->name, --- 2068,2080 ---- origTarget = (ResTarget *) lfirst(origTargetList); Assert(IsA(origTarget, ResTarget)); ! attrno = attnameAttNum(target_rel, origTarget->name, true); if (attrno == InvalidAttrNumber) ereport(ERROR, (errcode(ERRCODE_UNDEFINED_COLUMN), errmsg("column \"%s\" of relation \"%s\" does not exist", origTarget->name, ! RelationGetRelationName(target_rel)), parser_errposition(pstate, origTarget->location))); updateTargetListEntry(pstate, tle, origTarget->name, diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 88ec83c..56d2af5 100644 *** a/src/backend/parser/gram.y --- b/src/backend/parser/gram.y *************** single_set_clause: *** 9536,9561 **** multiple_set_clause: '(' set_target_list ')' '=' ctext_row { ListCell *col_cell; - ListCell *val_cell; /* ! * Break the ctext_row apart, merge individual expressions ! * into the destination ResTargets. This is semantically ! * equivalent to, and much cheaper to process than, the ! * general case. */ ! if (list_length($2) != list_length($5)) ! ereport(ERROR, ! (errcode(ERRCODE_SYNTAX_ERROR), ! errmsg("number of columns does not match number of values"), ! parser_errposition(@5))); ! forboth(col_cell, $2, val_cell, $5) { ResTarget *res_col = (ResTarget *) lfirst(col_cell); ! Node *res_val = (Node *) lfirst(val_cell); ! res_col->val = res_val; } $$ = $2; --- 9536,9562 ---- multiple_set_clause: '(' set_target_list ')' '=' ctext_row { + int ncolumns = list_length($2); + int i = 1; ListCell *col_cell; /* ! * We create a MultiAssignRef source for each target, all ! * of them pointing to the ctext_row List. The ctext_row ! * will get split apart during parse analysis. (We could ! * split it apart here, but it's simpler to share code ! * with the "(*)" target case this way.) */ ! foreach(col_cell, $2) { ResTarget *res_col = (ResTarget *) lfirst(col_cell); ! MultiAssignRef *r = makeNode(MultiAssignRef); ! r->source = (Node *) $5; ! r->colno = i; ! r->ncolumns = ncolumns; ! res_col->val = (Node *) r; ! i++; } $$ = $2; *************** multiple_set_clause: *** 9590,9595 **** --- 9591,9635 ---- $$ = $2; } + | '(' '*' ')' '=' ctext_row + { + ResTarget *res_col = makeNode(ResTarget); + + /* + * Create a ResTarget with null name to represent "(*)". + * We don't bother with MultiAssignRef yet. + */ + res_col->name = NULL; + res_col->indirection = NIL; + res_col->val = (Node *) $5; + res_col->location = @2; + + $$ = list_make1(res_col); + } + | '(' '*' ')' '=' select_with_parens + { + SubLink *sl = makeNode(SubLink); + ResTarget *res_col = makeNode(ResTarget); + + /* First, convert bare SelectStmt into a SubLink */ + sl->subLinkType = MULTIEXPR_SUBLINK; + sl->subLinkId = 0; /* will be assigned later */ + sl->testexpr = NULL; + sl->operName = NIL; + sl->subselect = $5; + sl->location = @5; + + /* + * Create a ResTarget with null name to represent "(*)". + * We don't bother with MultiAssignRef yet. + */ + res_col->name = NULL; + res_col->indirection = NIL; + res_col->val = (Node *) sl; + res_col->location = @2; + + $$ = list_make1(res_col); + } ; set_target: diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c index f759606..c5ce95f 100644 *** a/src/backend/parser/parse_expr.c --- b/src/backend/parser/parse_expr.c *************** transformMultiAssignRef(ParseState *psta *** 1446,1451 **** --- 1446,1470 ---- /* We should only see this in first-stage processing of UPDATE tlists */ Assert(pstate->p_expr_kind == EXPR_KIND_UPDATE_SOURCE); + /* + * The grammar should only generate Lists or MULTIEXPR SubLinks as inputs + * to a MultiAssignRef. If it's a List, we need only pick out the n'th + * list element and transform that. + */ + if (IsA(maref->source, List)) + { + List *slist = (List *) maref->source; + Node *source; + + if (list_length(slist) != maref->ncolumns) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("number of columns does not match number of values"), + parser_errposition(pstate, exprLocation(maref->source)))); + source = (Node *) list_nth(slist, maref->colno - 1); + return transformExprRecurse(pstate, source); + } + /* We only need to transform the source if this is the first column */ if (maref->colno == 1) { diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 0e257ac..6456f02 100644 *** a/src/include/nodes/parsenodes.h --- b/src/include/nodes/parsenodes.h *************** typedef struct A_ArrayExpr *** 400,406 **** * * In an UPDATE target list, 'name' is the name of the destination column, * 'indirection' stores any subscripts attached to the destination, and ! * 'val' is the expression to assign. * * See A_Indirection for more info about what can appear in 'indirection'. */ --- 400,408 ---- * * In an UPDATE target list, 'name' is the name of the destination column, * 'indirection' stores any subscripts attached to the destination, and ! * 'val' is the expression to assign. If 'name' is NULL then the ResTarget ! * represents a "(*)" multi-column assignment target (and its 'val' field ! * is either a List of value expressions, or a MULTIEXPR SubLink). * * See A_Indirection for more info about what can appear in 'indirection'. */ diff --git a/src/test/regress/expected/update.out b/src/test/regress/expected/update.out index 1de2a86..554aa35 100644 *** a/src/test/regress/expected/update.out --- b/src/test/regress/expected/update.out *************** SELECT * FROM update_test; *** 129,134 **** --- 129,200 ---- | | (4 rows) + -- + -- Test (*) syntax + -- + INSERT INTO update_test VALUES (1001, 1002); + UPDATE update_test SET (*) = (SELECT a, b, 'foo' FROM update_test WHERE b = 101) + WHERE a = 1001; + SELECT * FROM update_test; + a | b | c + ----+-----+----- + 21 | 101 | + 41 | 12 | car + 42 | 12 | car + | | + 21 | 101 | foo + (5 rows) + + UPDATE update_test SET (*) = (DEFAULT, DEFAULT, 'zap') + WHERE c = 'foo'; + SELECT * FROM update_test; + a | b | c + ----+-----+----- + 21 | 101 | + 41 | 12 | car + 42 | 12 | car + | | + 10 | | zap + (5 rows) + + DELETE FROM update_test WHERE c = 'zap'; + -- error cases (too many/few columns) + UPDATE update_test SET (a, b, c) = (1, 2); + ERROR: number of columns does not match number of values + LINE 1: UPDATE update_test SET (a, b, c) = (1, 2); + ^ + UPDATE update_test SET (a, b, c) = (1, 2, 3, 4); + ERROR: number of columns does not match number of values + LINE 1: UPDATE update_test SET (a, b, c) = (1, 2, 3, 4); + ^ + UPDATE update_test SET (a, b, c) = (SELECT 1, 2 FROM update_test); + ERROR: number of columns does not match number of values + LINE 1: UPDATE update_test SET (a, b, c) = (SELECT 1, 2 FROM update_... + ^ + UPDATE update_test SET (a, b, c) = (SELECT 1, 2, 3, 4 FROM update_test); + ERROR: number of columns does not match number of values + LINE 1: UPDATE update_test SET (a, b, c) = (SELECT 1, 2, 3, 4 FROM u... + ^ + UPDATE update_test SET (*) = (1, 2); + ERROR: number of columns does not match number of values + LINE 1: UPDATE update_test SET (*) = (1, 2); + ^ + UPDATE update_test SET (*) = (1, 2, 3, 4); + ERROR: number of columns does not match number of values + LINE 1: UPDATE update_test SET (*) = (1, 2, 3, 4); + ^ + UPDATE update_test SET (*) = (SELECT 1, 2 FROM update_test); + ERROR: number of columns does not match number of values + LINE 1: UPDATE update_test SET (*) = (SELECT 1, 2 FROM update_test); + ^ + UPDATE update_test SET (*) = (SELECT 1, 2, 3, 4 FROM update_test); + ERROR: number of columns does not match number of values + LINE 1: UPDATE update_test SET (*) = (SELECT 1, 2, 3, 4 FROM update_... + ^ + -- this is a semantic error, but not a syntactic one: + UPDATE update_test SET (*) = (DEFAULT, DEFAULT, DEFAULT), c = 'zap' + WHERE c = 'foo'; + ERROR: multiple assignments to same column "c" -- if an alias for the target table is specified, don't allow references -- to the original table name UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a = 10; diff --git a/src/test/regress/sql/update.sql b/src/test/regress/sql/update.sql index e71128c..e78de79 100644 *** a/src/test/regress/sql/update.sql --- b/src/test/regress/sql/update.sql *************** UPDATE update_test SET (b,a) = (select a *** 66,71 **** --- 66,101 ---- WHERE a = 11; SELECT * FROM update_test; + -- + -- Test (*) syntax + -- + INSERT INTO update_test VALUES (1001, 1002); + + UPDATE update_test SET (*) = (SELECT a, b, 'foo' FROM update_test WHERE b = 101) + WHERE a = 1001; + + SELECT * FROM update_test; + + UPDATE update_test SET (*) = (DEFAULT, DEFAULT, 'zap') + WHERE c = 'foo'; + + SELECT * FROM update_test; + + DELETE FROM update_test WHERE c = 'zap'; + + -- error cases (too many/few columns) + UPDATE update_test SET (a, b, c) = (1, 2); + UPDATE update_test SET (a, b, c) = (1, 2, 3, 4); + UPDATE update_test SET (a, b, c) = (SELECT 1, 2 FROM update_test); + UPDATE update_test SET (a, b, c) = (SELECT 1, 2, 3, 4 FROM update_test); + UPDATE update_test SET (*) = (1, 2); + UPDATE update_test SET (*) = (1, 2, 3, 4); + UPDATE update_test SET (*) = (SELECT 1, 2 FROM update_test); + UPDATE update_test SET (*) = (SELECT 1, 2, 3, 4 FROM update_test); + -- this is a semantic error, but not a syntactic one: + UPDATE update_test SET (*) = (DEFAULT, DEFAULT, DEFAULT), c = 'zap' + WHERE c = 'foo'; + -- if an alias for the target table is specified, don't allow references -- to the original table name UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a = 10;
pgsql-hackers by date: