Thread: Rethinking plpgsql's assignment implementation
We've had complaints in the past about how plpgsql can't handle assignments to fields in arrays of records [1], that is cases like arrayvar[n].field := something; and also complaints about how plpgsql can't handle assignments to array slices [2], ie arrayvar[m:n] := something; As of commit c7aba7c14, we have another problem, namely that plpgsql's subscripted assignment only works for regular arrays; it won't work for other types that might define subscript assignment handlers. So I started to think about how to fix that, and eventually decided that what we ought to do is nuke plpgsql's array-assignment code altogether. The core code already has support for everything we want here in the context of field/element assignments in UPDATE commands; if we could get plpgsql to make use of that infrastructure instead of rolling its own, we'd be a lot better off. The hard part of that is that the core parser will only generate the structures we need (FieldStores and assignment SubscriptingRefs) within UPDATE commands. We could export the relevant functions (particularly transformAssignmentIndirection); but that won't help plpgsql very much, because it really wants to be able to run all this stuff through SPI. That means we have to have SQL syntax that can generate an expression of that form. That led me to think about introducing a new statement, say SET variable_name opt_indirection := a_expr where opt_indirection is gram.y's symbol for "field selections and/or subscripts". The idea here is that a plpgsql statement like x[2].fld := something; would be parsed using this new statement, producing an expression that uses an assignment SubscriptingRef and a FieldStore operating on a Param that gives the initial value of the array-of-composite variable "x". Then plpgsql would just evaluate this expression and assign the result to x. Problem solved. This almost works as-is, modulo annoying parse conflicts against the existing variants of SET. However there's a nasty little detail about what "variable_name" can be in plpgsql: it can be either one or two identifiers, since there might be a block label involved, eg <<mylabel>> declare x int; begin mylabel.x := ... Between that and the parse-conflict problem, I ended up with this syntax: SET n: variable_name opt_indirection := a_expr where "n" is an integer literal indicating how many dot-separated names should be taken as the base variable name. Another annoying point is that plpgsql historically has allowed fun stuff like mycount := count(*) from my_table where ...; that is, after the expression you can have all the rest of an ordinary SELECT command. That's not terribly hard to deal with, but it means that this new statement has to have all of SELECT's other options too. The other area that doesn't quite work without some kind of hack is that plpgsql's casting rules for which types can be assigned to what are far laxer than what the core parser thinks should be allowed in UPDATE. The cast has to happen within the assignment expression for this to work at all, so plpgsql can't fix it by itself. The solution I adopted was just to invent a new CoercionContext value COERCION_PLPGSQL, representing "use pl/pgsql's rules". (Basically what that means nowadays is to apply CoerceViaIO if assignment cast lookup doesn't find a cast pathway.) A happy side-effect of this approach is that it actually makes some cases faster. In particular I can measure speedups for (a) assignments to subscripted variables and (b) cases where a coercion must be performed to produce the result to be assigned. I believe the reason for this is that the patch effectively merges what had been separate expressions (subscripts or casts, respectively) into the main result-producing expression. This eliminates a nontrivial amount of overhead for plancache validity checking, execution startup, etc. Another side-effect is that the report of the statement in error cases might look different. For example, in v13 a typo in a subscript expression produces regression=# do $$ declare x int[]; begin x[!2] = 43; end $$; ERROR: operator does not exist: ! integer LINE 1: SELECT !2 ^ HINT: No operator matches the given name and argument type. You might need to add an explicit type cast. QUERY: SELECT !2 CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment With this patch, you get regression=# do $$ declare x int[]; begin x[!2] = 43; end $$; ERROR: operator does not exist: ! integer LINE 1: SET 1: x[!2] = 43 ^ HINT: No operator matches the given name and argument type. You might need to add an explicit type cast. QUERY: SET 1: x[!2] = 43 CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment It seems like a clear improvement to me that the whole plpgsql statement is now quoted, but the "SET n:" bit in front of it might confuse people, especially if we don't document this new syntax (which I'm inclined not to, since it's useless in straight SQL). On the other hand, the "SELECT" that you got with the old code was confusing to novices too. Maybe something could be done to suppress those prefixes in error reports? Seems like a matter for another patch. We could also use some other prefix --- there's nothing particularly magic about the word "SET" here, except that it already exists as a keyword --- but I didn't think of anything I liked better. This is still WIP: I've not added any new regression test cases nor looked at the docs, and there's more cleanup needed in plpgsql. But it passes check-world, so I thought I'd put it out for comments. regards, tom lane [1] https://www.postgresql.org/message-id/A3691E98-CCA5-4DEB-B43C-92AD0437E09E%40mikatiming.de [2] https://www.postgresql.org/message-id/1070.1451345954%40sss.pgh.pa.us diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c index c3ce480c8f..edea27697e 100644 --- a/src/backend/commands/functioncmds.c +++ b/src/backend/commands/functioncmds.c @@ -1628,6 +1628,7 @@ CreateCast(CreateCastStmt *stmt) case COERCION_ASSIGNMENT: castcontext = COERCION_CODE_ASSIGNMENT; break; + /* COERCION_PLPGSQL is intentionally not covered here */ case COERCION_EXPLICIT: castcontext = COERCION_CODE_EXPLICIT; break; diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 70f8b718e0..67dab37a54 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -3199,6 +3199,29 @@ _copySetOperationStmt(const SetOperationStmt *from) return newnode; } +static ExtensionSetStmt * +_copyExtensionSetStmt(const ExtensionSetStmt *from) +{ + ExtensionSetStmt *newnode = makeNode(ExtensionSetStmt); + + COPY_NODE_FIELD(target); + COPY_STRING_FIELD(name); + COPY_NODE_FIELD(indirection); + COPY_NODE_FIELD(val); + COPY_NODE_FIELD(fromClause); + COPY_NODE_FIELD(whereClause); + COPY_NODE_FIELD(groupClause); + COPY_NODE_FIELD(havingClause); + COPY_NODE_FIELD(windowClause); + COPY_NODE_FIELD(sortClause); + COPY_NODE_FIELD(limitOffset); + COPY_NODE_FIELD(limitCount); + COPY_SCALAR_FIELD(limitOption); + COPY_NODE_FIELD(lockingClause); + + return newnode; +} + static AlterTableStmt * _copyAlterTableStmt(const AlterTableStmt *from) { @@ -5220,6 +5243,9 @@ copyObjectImpl(const void *from) case T_SetOperationStmt: retval = _copySetOperationStmt(from); break; + case T_ExtensionSetStmt: + retval = _copyExtensionSetStmt(from); + break; case T_AlterTableStmt: retval = _copyAlterTableStmt(from); break; diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index 541e0e6b48..28884da7ec 100644 --- a/src/backend/nodes/equalfuncs.c +++ b/src/backend/nodes/equalfuncs.c @@ -1085,6 +1085,27 @@ _equalSetOperationStmt(const SetOperationStmt *a, const SetOperationStmt *b) return true; } +static bool +_equalExtensionSetStmt(const ExtensionSetStmt *a, const ExtensionSetStmt *b) +{ + COMPARE_NODE_FIELD(target); + COMPARE_STRING_FIELD(name); + COMPARE_NODE_FIELD(indirection); + COMPARE_NODE_FIELD(val); + COMPARE_NODE_FIELD(fromClause); + COMPARE_NODE_FIELD(whereClause); + COMPARE_NODE_FIELD(groupClause); + COMPARE_NODE_FIELD(havingClause); + COMPARE_NODE_FIELD(windowClause); + COMPARE_NODE_FIELD(sortClause); + COMPARE_NODE_FIELD(limitOffset); + COMPARE_NODE_FIELD(limitCount); + COMPARE_SCALAR_FIELD(limitOption); + COMPARE_NODE_FIELD(lockingClause); + + return true; +} + static bool _equalAlterTableStmt(const AlterTableStmt *a, const AlterTableStmt *b) { @@ -3275,6 +3296,9 @@ equal(const void *a, const void *b) case T_SetOperationStmt: retval = _equalSetOperationStmt(a, b); break; + case T_ExtensionSetStmt: + retval = _equalExtensionSetStmt(a, b); + break; case T_AlterTableStmt: retval = _equalAlterTableStmt(a, b); break; diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 963f71e99d..166fbeddb2 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -3669,6 +3669,36 @@ raw_expression_tree_walker(Node *node, return true; } break; + case T_ExtensionSetStmt: + { + ExtensionSetStmt *stmt = (ExtensionSetStmt *) node; + + if (walker(stmt->target, context)) + return true; + if (walker(stmt->indirection, context)) + return true; + if (walker(stmt->val, context)) + return true; + if (walker(stmt->fromClause, context)) + return true; + if (walker(stmt->whereClause, context)) + return true; + if (walker(stmt->groupClause, context)) + return true; + if (walker(stmt->havingClause, context)) + return true; + if (walker(stmt->windowClause, context)) + return true; + if (walker(stmt->sortClause, context)) + return true; + if (walker(stmt->limitOffset, context)) + return true; + if (walker(stmt->limitCount, context)) + return true; + if (walker(stmt->lockingClause, context)) + return true; + } + break; case T_A_Expr: { A_Expr *expr = (A_Expr *) node; diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index d78b16ed1d..ef85556fc0 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -2775,6 +2775,27 @@ _outSelectStmt(StringInfo str, const SelectStmt *node) WRITE_NODE_FIELD(rarg); } +static void +_outExtensionSetStmt(StringInfo str, const ExtensionSetStmt *node) +{ + WRITE_NODE_TYPE("EXTENSIONSET"); + + WRITE_NODE_FIELD(target); + WRITE_STRING_FIELD(name); + WRITE_NODE_FIELD(indirection); + WRITE_NODE_FIELD(val); + WRITE_NODE_FIELD(fromClause); + WRITE_NODE_FIELD(whereClause); + WRITE_NODE_FIELD(groupClause); + WRITE_NODE_FIELD(havingClause); + WRITE_NODE_FIELD(windowClause); + WRITE_NODE_FIELD(sortClause); + WRITE_NODE_FIELD(limitOffset); + WRITE_NODE_FIELD(limitCount); + WRITE_ENUM_FIELD(limitOption, LimitOption); + WRITE_NODE_FIELD(lockingClause); +} + static void _outFuncCall(StringInfo str, const FuncCall *node) { @@ -4211,6 +4232,9 @@ outNode(StringInfo str, const void *obj) case T_SelectStmt: _outSelectStmt(str, obj); break; + case T_ExtensionSetStmt: + _outExtensionSetStmt(str, obj); + break; case T_ColumnDef: _outColumnDef(str, obj); break; diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index 084e00f73d..9ef1ed4d4c 100644 --- a/src/backend/parser/analyze.c +++ b/src/backend/parser/analyze.c @@ -42,8 +42,10 @@ #include "parser/parse_param.h" #include "parser/parse_relation.h" #include "parser/parse_target.h" +#include "parser/parse_type.h" #include "parser/parsetree.h" #include "rewrite/rewriteManip.h" +#include "utils/builtins.h" #include "utils/rel.h" @@ -70,6 +72,8 @@ static Query *transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt); static List *transformReturningList(ParseState *pstate, List *returningList); static List *transformUpdateTargetList(ParseState *pstate, List *targetList); +static Query *transformExtensionSetStmt(ParseState *pstate, + ExtensionSetStmt *stmt); static Query *transformDeclareCursorStmt(ParseState *pstate, DeclareCursorStmt *stmt); static Query *transformExplainStmt(ParseState *pstate, @@ -304,6 +308,11 @@ transformStmt(ParseState *pstate, Node *parseTree) } break; + case T_ExtensionSetStmt: + result = transformExtensionSetStmt(pstate, + (ExtensionSetStmt *) parseTree); + break; + /* * Special cases */ @@ -367,6 +376,7 @@ analyze_requires_snapshot(RawStmt *parseTree) case T_DeleteStmt: case T_UpdateStmt: case T_SelectStmt: + case T_ExtensionSetStmt: result = true; break; @@ -2393,6 +2403,209 @@ transformReturningList(ParseState *pstate, List *returningList) } +/* + * transformExtensionSetStmt - + * transform an Extension SET Statement, + * SET n: name opt_indirection := a_expr ... + * + * This undocumented syntax allows PL/pgSQL to make use of the regular parser + * for assignment statements. Everything after the colon is taken verbatim + * from the PL/pgSQL assignment statement, while the integer "n" says how many + * dotted names comprise the target ColumnRef. (That part is handled by + * gram.y, so we just see a ColumnRef and a suitably-truncated indirection + * list here.) If there is no opt_indirection, the transformed statement + * looks like "SELECT a_expr ...", except the expression has been cast to + * the type of the target. With indirection, it's still a SELECT, but the + * expression will incorporate FieldStore and/or assignment SubscriptingRef + * nodes to compute a new value for a container-type variable represented by + * the target. The expression references the target as the container source. + */ +static Query * +transformExtensionSetStmt(ParseState *pstate, + ExtensionSetStmt *stmt) +{ + Query *qry = makeNode(Query); + Node *target; + Oid targettype; + int32 targettypmod; + Oid targetcollation; + TargetEntry *tle; + Oid type_id; + Node *qual; + ListCell *l; + + /* + * Transform the target reference. Typically we will get back a Param + * node, but there's no reason to be too picky about its type. + */ + target = transformExpr(pstate, (Node *) stmt->target, + EXPR_KIND_UPDATE_TARGET); + targettype = exprType(target); + targettypmod = exprTypmod(target); + targetcollation = exprCollation(target); + + /* + * The rest mostly matches transformSelectStmt, except that we needn't + * consider WITH or DISTINCT, and we build a targetlist our own way. + */ + + qry->commandType = CMD_SELECT; + pstate->p_is_insert = false; + + /* make FOR UPDATE/FOR SHARE info available to addRangeTableEntry */ + pstate->p_locking_clause = stmt->lockingClause; + + /* make WINDOW info available for window functions, too */ + pstate->p_windowdefs = stmt->windowClause; + + /* process the FROM clause */ + transformFromClause(pstate, stmt->fromClause); + + /* Transform the assignment source (cf. transformUpdateTargetList) */ + tle = transformTargetEntry(pstate, + stmt->val, + NULL, + EXPR_KIND_SELECT_TARGET, + NULL, + false); + + /* + * This next bit is similar to transformAssignedExpr; the key difference + * is we use COERCION_PLPGSQL not COERCION_ASSIGNMENT. + */ + type_id = exprType((Node *) tle->expr); + + pstate->p_expr_kind = EXPR_KIND_UPDATE_TARGET; + + if (stmt->indirection) + { + tle->expr = (Expr *) + transformAssignmentIndirection(pstate, + target, + stmt->name, + false, + targettype, + targettypmod, + targetcollation, + COERCION_PLPGSQL, + stmt->indirection, + list_head(stmt->indirection), + (Node *) tle->expr, + exprLocation(target)); + } + else if (targettype != type_id && + (targettype == RECORDOID || ISCOMPLEX(targettype)) && + (type_id == RECORDOID || ISCOMPLEX(type_id))) + { + /* + * Hack: do not let coerce_to_target_type() deal with inconsistent + * composite types. Just pass the expression result through as-is, + * and let the PL/pgSQL executor do the conversion its way. This is + * rather bogus, but it's needed for backwards compatibility. + */ + } + else + { + /* + * For normal non-qualified target column, do type checking and + * coercion. + */ + Node *orig_expr = (Node *) tle->expr; + + tle->expr = (Expr *) + coerce_to_target_type(pstate, + orig_expr, type_id, + targettype, targettypmod, + COERCION_PLPGSQL, + COERCE_IMPLICIT_CAST, + -1); + /* With COERCION_PLPGSQL, this error is probably unreachable */ + if (tle->expr == NULL) + ereport(ERROR, + (errcode(ERRCODE_DATATYPE_MISMATCH), + errmsg("variable \"%s\" is of type %s" + " but expression is of type %s", + stmt->name, + format_type_be(targettype), + format_type_be(type_id)), + errhint("You will need to rewrite or cast the expression."), + parser_errposition(pstate, exprLocation(orig_expr)))); + } + + pstate->p_expr_kind = EXPR_KIND_NONE; + + qry->targetList = list_make1(tle); + + /* transform WHERE */ + qual = transformWhereClause(pstate, stmt->whereClause, + EXPR_KIND_WHERE, "WHERE"); + + /* initial processing of HAVING clause is much like WHERE clause */ + qry->havingQual = transformWhereClause(pstate, stmt->havingClause, + EXPR_KIND_HAVING, "HAVING"); + + /* + * Transform sorting/grouping stuff. Do ORDER BY first because both + * transformGroupClause and transformDistinctClause need the results. Note + * that these functions can also change the targetList, so it's passed to + * them by reference. + */ + qry->sortClause = transformSortClause(pstate, + stmt->sortClause, + &qry->targetList, + EXPR_KIND_ORDER_BY, + false /* allow SQL92 rules */ ); + + qry->groupClause = transformGroupClause(pstate, + stmt->groupClause, + &qry->groupingSets, + &qry->targetList, + qry->sortClause, + EXPR_KIND_GROUP_BY, + false /* allow SQL92 rules */ ); + + /* No DISTINCT clause */ + qry->distinctClause = NIL; + qry->hasDistinctOn = false; + + /* transform LIMIT */ + qry->limitOffset = transformLimitClause(pstate, stmt->limitOffset, + EXPR_KIND_OFFSET, "OFFSET", + stmt->limitOption); + qry->limitCount = transformLimitClause(pstate, stmt->limitCount, + EXPR_KIND_LIMIT, "LIMIT", + stmt->limitOption); + qry->limitOption = stmt->limitOption; + + /* transform window clauses after we have seen all window functions */ + qry->windowClause = transformWindowDefinitions(pstate, + pstate->p_windowdefs, + &qry->targetList); + + qry->rtable = pstate->p_rtable; + qry->jointree = makeFromExpr(pstate->p_joinlist, qual); + + qry->hasSubLinks = pstate->p_hasSubLinks; + qry->hasWindowFuncs = pstate->p_hasWindowFuncs; + qry->hasTargetSRFs = pstate->p_hasTargetSRFs; + qry->hasAggs = pstate->p_hasAggs; + + foreach(l, stmt->lockingClause) + { + transformLockingClause(pstate, qry, + (LockingClause *) lfirst(l), false); + } + + assign_query_collations(pstate, qry); + + /* this must be done after collations, for reliable comparison of exprs */ + if (pstate->p_hasAggs || qry->groupClause || qry->groupingSets || qry->havingQual) + parseCheckAggregates(pstate, qry); + + return qry; +} + + /* * transformDeclareCursorStmt - * transform a DECLARE CURSOR Statement diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 8f341ac006..5279d23b7b 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -275,7 +275,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); DropCastStmt DropRoleStmt DropdbStmt DropTableSpaceStmt DropTransformStmt - DropUserMappingStmt ExplainStmt FetchStmt + DropUserMappingStmt ExplainStmt ExtensionSetStmt FetchStmt GrantStmt GrantRoleStmt ImportForeignSchemaStmt IndexStmt InsertStmt ListenStmt LoadStmt LockStmt NotifyStmt ExplainableStmt PreparableStmt CreateFunctionStmt AlterFunctionStmt ReindexStmt RemoveAggrStmt @@ -535,7 +535,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type <str> ColId ColLabel BareColLabel %type <str> NonReservedWord NonReservedWord_or_Sconst %type <str> var_name type_function_name param_name -%type <str> createdb_opt_name +%type <str> createdb_opt_name extension_set_target %type <node> var_value zone_value %type <rolespec> auth_ident RoleSpec opt_granted_by @@ -916,6 +916,7 @@ stmt : | DropdbStmt | ExecuteStmt | ExplainStmt + | ExtensionSetStmt | FetchStmt | GrantStmt | GrantRoleStmt @@ -11041,6 +11042,73 @@ set_target_list: ; +/***************************************************************************** + * + * QUERY: + * SET n: name opt_indirection := a_expr ... + * + * This undocumented syntax allows PL/pgSQL to make use of the regular parser + * for its assignment statements. The stuff after the a_expr should match + * what can appear after the targetlist in SELECT. + *****************************************************************************/ + +ExtensionSetStmt: SET Iconst ':' extension_set_target opt_indirection + extension_set_assignment a_expr + from_clause where_clause + group_clause having_clause window_clause + opt_sort_clause opt_select_limit opt_for_locking_clause + { + ExtensionSetStmt *n = makeNode(ExtensionSetStmt); + ColumnRef *cref = makeNode(ColumnRef); + int nnames = $2; + List *indirection = $5; + + cref->fields = list_make1(makeString($4)); + cref->location = @4; + while (--nnames > 0 && indirection != NIL) + { + Node *ind = (Node *) linitial(indirection); + + if (!IsA(ind, String)) + elog(ERROR, "invalid name count in extension SET"); + cref->fields = lappend(cref->fields, ind); + indirection = list_delete_first(indirection); + } + n->target = cref; + n->name = $4; + n->indirection = check_indirection(indirection, yyscanner); + n->val = $7; + n->fromClause = $8; + n->whereClause = $9; + n->groupClause = $10; + n->havingClause = $11; + n->windowClause = $12; + n->sortClause = $13; + if ($14) + { + n->limitOffset = $14->limitOffset; + n->limitCount = $14->limitCount; + if (!n->sortClause && + $14->limitOption == LIMIT_OPTION_WITH_TIES) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("WITH TIES cannot be specified without ORDER BY clause"))); + n->limitOption = $14->limitOption; + } + n->lockingClause = $15; + $$ = (Node *) n; + } + ; + +extension_set_target: ColId { $$ = $1; } + | PARAM { $$ = psprintf("$%d", $1); } + ; + +extension_set_assignment: COLON_EQUALS + | '=' + ; + + /***************************************************************************** * * QUERY: diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c index da6c3ae4b5..462e2af74e 100644 --- a/src/backend/parser/parse_coerce.c +++ b/src/backend/parser/parse_coerce.c @@ -2815,6 +2815,14 @@ find_coercion_pathway(Oid targetTypeId, Oid sourceTypeId, } } + /* + * When parsing PL/pgSQL assignments, allow an I/O cast to be used + * whenever no normal coercion is available. + */ + if (result == COERCION_PATH_NONE && + ccontext == COERCION_PLPGSQL) + result = COERCION_PATH_COERCEVIAIO; + return result; } diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c index 3dda8e2847..55a9a808da 100644 --- a/src/backend/parser/parse_target.c +++ b/src/backend/parser/parse_target.c @@ -34,23 +34,13 @@ static void markTargetListOrigin(ParseState *pstate, TargetEntry *tle, Var *var, int levelsup); -static Node *transformAssignmentIndirection(ParseState *pstate, - Node *basenode, - const char *targetName, - bool targetIsSubscripting, - Oid targetTypeId, - int32 targetTypMod, - Oid targetCollation, - List *indirection, - ListCell *indirection_cell, - Node *rhs, - int location); static Node *transformAssignmentSubscripts(ParseState *pstate, Node *basenode, const char *targetName, Oid targetTypeId, int32 targetTypMod, Oid targetCollation, + CoercionContext ccontext, List *subscripts, bool isSlice, List *indirection, @@ -558,6 +548,7 @@ transformAssignedExpr(ParseState *pstate, attrtype, attrtypmod, attrcollation, + COERCION_ASSIGNMENT, indirection, list_head(indirection), (Node *) expr, @@ -648,9 +639,9 @@ updateTargetListEntry(ParseState *pstate, * operation. * * In the initial call, basenode is a Var for the target column in UPDATE, - * or a null Const of the target's type in INSERT. In recursive calls, - * basenode is NULL, indicating that a substitute node should be consed up if - * needed. + * or a null Const of the target's type in INSERT, or a Param for the target + * variable in extension SET operations. In recursive calls, basenode is + * NULL, indicating that a substitute node should be consed up if needed. * * targetName is the name of the field or subfield we're assigning to, and * targetIsSubscripting is true if we're subscripting it. These are just for @@ -660,6 +651,10 @@ updateTargetListEntry(ParseState *pstate, * collation of the object to be assigned to (initially the target column, * later some subobject). * + * ccontext is the coercion level to use while coercing the "rhs". For + * normal statements it'll be COERCION_ASSIGNMENT, but extension SET uses + * a special value. + * * indirection is the list of indirection nodes, and indirection_cell is the * start of the sublist remaining to process. When it's NULL, we're done * recursing and can just coerce and return the RHS. @@ -672,7 +667,7 @@ updateTargetListEntry(ParseState *pstate, * might want to decorate indirection cells with their own location info, * in which case the location argument could probably be dropped.) */ -static Node * +Node * transformAssignmentIndirection(ParseState *pstate, Node *basenode, const char *targetName, @@ -680,6 +675,7 @@ transformAssignmentIndirection(ParseState *pstate, Oid targetTypeId, int32 targetTypMod, Oid targetCollation, + CoercionContext ccontext, List *indirection, ListCell *indirection_cell, Node *rhs, @@ -752,6 +748,7 @@ transformAssignmentIndirection(ParseState *pstate, targetTypeId, targetTypMod, targetCollation, + ccontext, subscripts, isSlice, indirection, @@ -804,6 +801,7 @@ transformAssignmentIndirection(ParseState *pstate, fieldTypeId, fieldTypMod, fieldCollation, + ccontext, indirection, lnext(indirection, i), rhs, @@ -840,6 +838,7 @@ transformAssignmentIndirection(ParseState *pstate, targetTypeId, targetTypMod, targetCollation, + ccontext, subscripts, isSlice, indirection, @@ -853,7 +852,7 @@ transformAssignmentIndirection(ParseState *pstate, result = coerce_to_target_type(pstate, rhs, exprType(rhs), targetTypeId, targetTypMod, - COERCION_ASSIGNMENT, + ccontext, COERCE_IMPLICIT_CAST, -1); if (result == NULL) @@ -893,6 +892,7 @@ transformAssignmentSubscripts(ParseState *pstate, Oid targetTypeId, int32 targetTypMod, Oid targetCollation, + CoercionContext ccontext, List *subscripts, bool isSlice, List *indirection, @@ -946,6 +946,7 @@ transformAssignmentSubscripts(ParseState *pstate, typeNeeded, typmodNeeded, collationNeeded, + ccontext, indirection, next_indirection, rhs, @@ -969,7 +970,7 @@ transformAssignmentSubscripts(ParseState *pstate, result = coerce_to_target_type(pstate, result, resulttype, targetTypeId, targetTypMod, - COERCION_ASSIGNMENT, + ccontext, COERCE_IMPLICIT_CAST, -1); /* can fail if we had int2vector/oidvector, but not for true domains */ diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c index a42ead7d69..a2e3b332c7 100644 --- a/src/backend/tcop/utility.c +++ b/src/backend/tcop/utility.c @@ -2313,6 +2313,10 @@ CreateCommandTag(Node *parsetree) tag = CMDTAG_SELECT; break; + case T_ExtensionSetStmt: + tag = CMDTAG_SELECT; + break; + /* utility statements --- same whether raw or cooked */ case T_TransactionStmt: { @@ -3181,6 +3185,10 @@ GetCommandLogLevel(Node *parsetree) lev = LOGSTMT_ALL; break; + case T_ExtensionSetStmt: + lev = LOGSTMT_ALL; + break; + /* utility statements --- same whether raw or cooked */ case T_TransactionStmt: lev = LOGSTMT_ALL; diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index 3684f87a88..273b8181e4 100644 --- a/src/include/nodes/nodes.h +++ b/src/include/nodes/nodes.h @@ -314,6 +314,7 @@ typedef enum NodeTag T_DeleteStmt, T_UpdateStmt, T_SelectStmt, + T_ExtensionSetStmt, T_AlterTableStmt, T_AlterTableCmd, T_AlterDomainStmt, diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 48a79a7657..4d073f862d 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -1675,6 +1675,41 @@ typedef struct SetOperationStmt } SetOperationStmt; +/* ---------------------- + * Extension SET Statement + * + * Like SelectStmt, this is transformed into a SELECT Query. + * However, the targetlist of the result looks more like an UPDATE. + * ---------------------- + */ +typedef struct ExtensionSetStmt +{ + NodeTag type; + + /* These fields represent "target opt_indirection := value" */ + ColumnRef *target; /* ColumnRef for the target variable */ + char *name; /* initial column name */ + List *indirection; /* subscripts and field names, or NIL */ + Node *val; /* the value expression to assign */ + + /* + * Historically, PL/pgSQL has allowed an assignment to have any decoration + * that could occur after the targetlist of SELECT, so we preserve that + * capability. + */ + List *fromClause; /* the FROM clause */ + Node *whereClause; /* WHERE qualification */ + List *groupClause; /* GROUP BY clauses */ + Node *havingClause; /* HAVING conditional-expression */ + List *windowClause; /* WINDOW window_name AS (...), ... */ + List *sortClause; /* sort clause (a list of SortBy's) */ + Node *limitOffset; /* # of result tuples to skip */ + Node *limitCount; /* # of result tuples to return */ + LimitOption limitOption; /* limit type */ + List *lockingClause; /* FOR UPDATE (list of LockingClause's) */ +} ExtensionSetStmt; + + /***************************************************************************** * Other Statements (no optimizations required) * diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h index dd85908fe2..71a58804dc 100644 --- a/src/include/nodes/primnodes.h +++ b/src/include/nodes/primnodes.h @@ -457,6 +457,7 @@ typedef enum CoercionContext { COERCION_IMPLICIT, /* coercion in context of expression */ COERCION_ASSIGNMENT, /* coercion in context of assignment */ + COERCION_PLPGSQL, /* if no assignment cast, use CoerceViaIO */ COERCION_EXPLICIT /* explicit cast operation */ } CoercionContext; diff --git a/src/include/parser/parse_target.h b/src/include/parser/parse_target.h index 7039df29cb..6f5741f824 100644 --- a/src/include/parser/parse_target.h +++ b/src/include/parser/parse_target.h @@ -36,6 +36,18 @@ extern void updateTargetListEntry(ParseState *pstate, TargetEntry *tle, char *colname, int attrno, List *indirection, int location); +extern Node *transformAssignmentIndirection(ParseState *pstate, + Node *basenode, + const char *targetName, + bool targetIsSubscripting, + Oid targetTypeId, + int32 targetTypMod, + Oid targetCollation, + CoercionContext ccontext, + List *indirection, + ListCell *indirection_cell, + Node *rhs, + int location); extern List *checkInsertTargets(ParseState *pstate, List *cols, List **attrnos); extern TupleDesc expandRecordVariable(ParseState *pstate, Var *var, diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index ccbc50fc45..22fe037f90 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -8005,10 +8005,14 @@ get_cast_hashentry(PLpgSQL_execstate *estate, placeholder->collation = get_typcollation(srctype); /* - * Apply coercion. We use ASSIGNMENT coercion because that's the - * closest match to plpgsql's historical behavior; in particular, - * EXPLICIT coercion would allow silent truncation to a destination - * varchar/bpchar's length, which we do not want. + * Apply coercion. We use the special coercion context + * COERCION_PLPGSQL to match plpgsql's historical behavior, namely + * that any cast not available at ASSIGNMENT level will be implemented + * as an I/O coercion. (It's somewhat dubious that we prefer I/O + * coercion over cast pathways that exist at EXPLICIT level. Changing + * that would cause assorted minor behavioral differences though, and + * a user who wants the explicit-cast behavior can always write an + * explicit cast.) * * If source type is UNKNOWN, coerce_to_target_type will fail (it only * expects to see that for Const input nodes), so don't call it; we'll @@ -8021,7 +8025,7 @@ get_cast_hashentry(PLpgSQL_execstate *estate, cast_expr = coerce_to_target_type(NULL, (Node *) placeholder, srctype, dsttype, dsttypmod, - COERCION_ASSIGNMENT, + COERCION_PLPGSQL, COERCE_IMPLICIT_CAST, -1); @@ -8029,7 +8033,8 @@ get_cast_hashentry(PLpgSQL_execstate *estate, * If there's no cast path according to the parser, fall back to using * an I/O coercion; this is semantically dubious but matches plpgsql's * historical behavior. We would need something of the sort for - * UNKNOWN literals in any case. + * UNKNOWN literals in any case. (This is probably now only reachable + * in the case where srctype is UNKNOWN/RECORD.) */ if (cast_expr == NULL) { @@ -8338,7 +8343,8 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) return; /* - * Top level of expression must be a simple FuncExpr or OpExpr. + * Top level of expression must be a simple FuncExpr, OpExpr, or + * SubscriptingRef. */ if (IsA(expr->expr_simple_expr, FuncExpr)) { @@ -8354,6 +8360,33 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) funcid = opexpr->opfuncid; fargs = opexpr->args; } + else if (IsA(expr->expr_simple_expr, SubscriptingRef)) + { + SubscriptingRef *sbsref = (SubscriptingRef *) expr->expr_simple_expr; + + /* We only trust standard varlena arrays to be safe */ + if (get_typsubscript(sbsref->refcontainertype, NULL) != + F_ARRAY_SUBSCRIPT_HANDLER) + return; + + /* refexpr can be a simple Param, otherwise must not contain target */ + if (!(sbsref->refexpr && IsA(sbsref->refexpr, Param)) && + contains_target_param((Node *) sbsref->refexpr, &target_dno)) + return; + + /* the other subexpressions must not contain target */ + if (contains_target_param((Node *) sbsref->refupperindexpr, + &target_dno) || + contains_target_param((Node *) sbsref->reflowerindexpr, + &target_dno) || + contains_target_param((Node *) sbsref->refassgnexpr, + &target_dno)) + return; + + /* OK, we can pass target as a read-write parameter */ + expr->rwparam = target_dno; + return; + } else return; diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index 8227bf0449..561b0dec59 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -949,16 +949,21 @@ stmt_call : K_CALL } ; -stmt_assign : assign_var assign_operator expr_until_semi +stmt_assign : T_DATUM { PLpgSQL_stmt_assign *new; + int nnames = $1.ident ? 1 : list_length($1.idents); + check_assignable($1.datum, @1); new = palloc0(sizeof(PLpgSQL_stmt_assign)); new->cmd_type = PLPGSQL_STMT_ASSIGN; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - new->varno = $1->dno; - new->expr = $3; + new->varno = $1.datum->dno; + /* Push back the head name to include it in the stmt */ + plpgsql_push_back_token(T_DATUM); + new->expr = read_sql_stmt(psprintf("SET %d: ", + nnames)); $$ = (PLpgSQL_stmt *)new; }
On 12/11/20 12:21, Tom Lane wrote: > solution I adopted was just to invent a new CoercionContext value > COERCION_PLPGSQL, representing "use pl/pgsql's rules". (Basically > what that means nowadays is to apply CoerceViaIO if assignment cast > lookup doesn't find a cast pathway.) That seems like a rule that might be of use in other PLs or extensions; could it have a more generic name, COERCION_FALLBACK or something? > is now quoted, but the "SET n:" bit in front of it might confuse people, > especially if we don't document this new syntax (which I'm inclined not > to, since it's useless in straight SQL). If it's true that the only choices for n: are 1: or 2:, maybe it would look less confusing in an error message to, hmm, decree that this specialized SET form /always/ takes a two-component name, but accept something special like ROUTINE.x (or UNNAMED.x or NULL.x or something) for the case where there isn't a qualifying label in the plpgsql source? It's still a strange arbitrary creation, but might give more of a hint of its meaning if it crops up in an error message somewhere. Regards, -Chap
Hi
It is great. I expected much more work.
pá 11. 12. 2020 v 18:21 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
We've had complaints in the past about how plpgsql can't handle
assignments to fields in arrays of records [1], that is cases like
arrayvar[n].field := something;
and also complaints about how plpgsql can't handle assignments
to array slices [2], ie
arrayvar[m:n] := something;
As of commit c7aba7c14, we have another problem, namely that
plpgsql's subscripted assignment only works for regular arrays;
it won't work for other types that might define subscript
assignment handlers.
So I started to think about how to fix that, and eventually
decided that what we ought to do is nuke plpgsql's array-assignment
code altogether. The core code already has support for everything
we want here in the context of field/element assignments in UPDATE
commands; if we could get plpgsql to make use of that infrastructure
instead of rolling its own, we'd be a lot better off.
The hard part of that is that the core parser will only generate
the structures we need (FieldStores and assignment SubscriptingRefs)
within UPDATE commands. We could export the relevant functions
(particularly transformAssignmentIndirection); but that won't help
plpgsql very much, because it really wants to be able to run all this
stuff through SPI. That means we have to have SQL syntax that can
generate an expression of that form.
That led me to think about introducing a new statement, say
SET variable_name opt_indirection := a_expr
SQL/PSM (ANSI SQL) defines SET var = expr
If you introduce a new statement - LET, then it can be less confusing for users, and this statement can be the foundation for schema variables. With this statement the implementation of schema variables is significantly simpler.
Regards
Pavel
where opt_indirection is gram.y's symbol for "field selections and/or
subscripts". The idea here is that a plpgsql statement like
x[2].fld := something;
would be parsed using this new statement, producing an expression
that uses an assignment SubscriptingRef and a FieldStore operating
on a Param that gives the initial value of the array-of-composite
variable "x". Then plpgsql would just evaluate this expression and
assign the result to x. Problem solved.
This almost works as-is, modulo annoying parse conflicts against the
existing variants of SET. However there's a nasty little detail
about what "variable_name" can be in plpgsql: it can be either one or
two identifiers, since there might be a block label involved, eg
<<mylabel>> declare x int; begin mylabel.x := ...
Between that and the parse-conflict problem, I ended up
with this syntax:
SET n: variable_name opt_indirection := a_expr
where "n" is an integer literal indicating how many dot-separated names
should be taken as the base variable name. Another annoying point is
that plpgsql historically has allowed fun stuff like
mycount := count(*) from my_table where ...;
that is, after the expression you can have all the rest of an ordinary
SELECT command. That's not terribly hard to deal with, but it means
that this new statement has to have all of SELECT's other options too.
The other area that doesn't quite work without some kind of hack is
that plpgsql's casting rules for which types can be assigned to what
are far laxer than what the core parser thinks should be allowed in
UPDATE. The cast has to happen within the assignment expression
for this to work at all, so plpgsql can't fix it by itself. The
solution I adopted was just to invent a new CoercionContext value
COERCION_PLPGSQL, representing "use pl/pgsql's rules". (Basically
what that means nowadays is to apply CoerceViaIO if assignment cast
lookup doesn't find a cast pathway.)
A happy side-effect of this approach is that it actually makes
some cases faster. In particular I can measure speedups for
(a) assignments to subscripted variables and (b) cases where a
coercion must be performed to produce the result to be assigned.
I believe the reason for this is that the patch effectively
merges what had been separate expressions (subscripts or casts,
respectively) into the main result-producing expression. This
eliminates a nontrivial amount of overhead for plancache validity
checking, execution startup, etc.
Another side-effect is that the report of the statement in error
cases might look different. For example, in v13 a typo in a
subscript expression produces
regression=# do $$ declare x int[]; begin x[!2] = 43; end $$;
ERROR: operator does not exist: ! integer
LINE 1: SELECT !2
^
HINT: No operator matches the given name and argument type. You might need to add an explicit type cast.
QUERY: SELECT !2
CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment
With this patch, you get
regression=# do $$ declare x int[]; begin x[!2] = 43; end $$;
ERROR: operator does not exist: ! integer
LINE 1: SET 1: x[!2] = 43
^
HINT: No operator matches the given name and argument type. You might need to add an explicit type cast.
QUERY: SET 1: x[!2] = 43
CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment
It seems like a clear improvement to me that the whole plpgsql statement
is now quoted, but the "SET n:" bit in front of it might confuse people,
especially if we don't document this new syntax (which I'm inclined not
to, since it's useless in straight SQL). On the other hand, the
"SELECT" that you got with the old code was confusing to novices too.
Maybe something could be done to suppress those prefixes in error
reports? Seems like a matter for another patch. We could also use
some other prefix --- there's nothing particularly magic about the
word "SET" here, except that it already exists as a keyword --- but
I didn't think of anything I liked better.
This is still WIP: I've not added any new regression test cases
nor looked at the docs, and there's more cleanup needed in plpgsql.
But it passes check-world, so I thought I'd put it out for comments.
regards, tom lane
[1] https://www.postgresql.org/message-id/A3691E98-CCA5-4DEB-B43C-92AD0437E09E%40mikatiming.de
[2] https://www.postgresql.org/message-id/1070.1451345954%40sss.pgh.pa.us
Chapman Flack <chap@anastigmatix.net> writes: > On 12/11/20 12:21, Tom Lane wrote: >> solution I adopted was just to invent a new CoercionContext value >> COERCION_PLPGSQL, representing "use pl/pgsql's rules". (Basically >> what that means nowadays is to apply CoerceViaIO if assignment cast >> lookup doesn't find a cast pathway.) > That seems like a rule that might be of use in other PLs or extensions; > could it have a more generic name, COERCION_FALLBACK or something? I'm not wedded to that name, but I doubt that it's semantics that we really want to encourage anyone else to use. In particular, the fact that it's not a superset of COERCION_EXPLICIT is pretty darn weird, with little except backwards compatibility to recommend it. >> is now quoted, but the "SET n:" bit in front of it might confuse people, >> especially if we don't document this new syntax (which I'm inclined not >> to, since it's useless in straight SQL). > If it's true that the only choices for n: are 1: or 2:, maybe it would look > less confusing in an error message to, hmm, decree that this specialized SET > form /always/ takes a two-component name, but accept something special like > ROUTINE.x (or UNNAMED.x or NULL.x or something) for the case where there > isn't a qualifying label in the plpgsql source? As the patch stands, it's still using the RECFIELD code paths, which means that there could be three-component target variable names (label.variable.field). If we were to get rid of that and expect top-level field assignment to also be handled by this new mechanism, then maybe your idea could be made to work. But I have not tried to implement that here, as I don't see how to make it work for RECORD-type variables (where the names and types of the fields aren't determinate). In any case, that approach still involves inserting some query text that the user didn't write, so I'm not sure how much confusion it'd remove. The "SET n:" business at least looks like it's some weird prefix comparable to "LINE n:", so while people wouldn't understand it I think they'd easily see it as something the system prefixed to their query. Looking a bit ahead, it's not too hard to imagine plpgsql wishing to pass other sorts of annotations through SPI and down to the core parser. Maybe we should think about a more general way to do that in an out-of-band, not-visible-in-the-query-text fashion. regards, tom lane
I wrote: > In any case, that approach still involves inserting some query text > that the user didn't write, so I'm not sure how much confusion it'd > remove. The "SET n:" business at least looks like it's some weird > prefix comparable to "LINE n:", so while people wouldn't understand > it I think they'd easily see it as something the system prefixed > to their query. > Looking a bit ahead, it's not too hard to imagine plpgsql wishing > to pass other sorts of annotations through SPI and down to the core > parser. Maybe we should think about a more general way to do that > in an out-of-band, not-visible-in-the-query-text fashion. I have an idea (no code written yet) about this. After looking around, it seems like the ParserSetupHook mechanism is plenty for anything we might want an extension to be able to change in the behavior of parse analysis. The hooks that we currently allow that to set affect only the interpretation of variable names and $N parameter symbols, but we could surely add much more in that line as needed. What we lack is any good way for an extension to control the behavior of raw_parser() (i.e., gram.y). Currently, plpgsql prefixes "SELECT " to expressions it might want to parse, and now my current patch proposes to prefix something else to get a different grammar behavior. Another example of a very similar problem is typeStringToTypeName(), which prefixes a string it expects to be a type name with "SELECT NULL::", and then has to do a bunch of kluges to deal with the underspecification involved in that. Based on these examples, we need some sort of "overall goal" option for the raw parser, but maybe not more than that --- other things you might want tend to fall into the parse analysis side of things. So my idea here is to add a parsing-mode option to raw_parser(), which would be an enum with values like "normal SQL statement", "expression only", "type name", "plpgsql assignment statement". The problem I had with not knowing how many dotted names to absorb at the start of an assignment statement could be finessed by inventing "assignment1", "assignment2", and "assignment3" parsing modes; that's a little bit ugly but not enough to make me think we need a wider API. As to how it could actually work, I'm noticing that raw_parser starts out by initializing yyextra's lookahead buffer to empty. For the parsing modes other than "normal SQL statement", it could instead inject a lookahead token that is a code that cannot be generated by the regular lexer. Then gram.y could have productions like EXPRESSION_MODE a_expr { ... generate parse tree ... } where EXPRESSION_MODE is one of these special tokens. And now we have something that will parse an a_expr, and only an a_expr, and we don't need any special "SELECT " or any other prefix in the user-visible source string. Similarly for the other special parsing modes. Essentially, this is a way of having a few distinct parsers that share a common base of productions, without the bloat and code maintenance issues of building actually-distinct parsers. A small problem with this is that the images of these special productions in ECPG would be dead code so far as ECPG is concerned. For the use-cases I can foresee, there wouldn't be enough special productions for that to be a deal-breaker. But we could probably teach the ECPG grammar-building scripts to filter out these productions if it ever got to be annoying. regards, tom lane
I wrote: > So my idea here is to add a parsing-mode option to raw_parser(), > which would be an enum with values like "normal SQL statement", > "expression only", "type name", "plpgsql assignment statement". Here's a fleshed-out patch series that attacks things that way. I'm a lot better pleased with this than with my original approach. 0001 creates the basic infrastructure for "raw parse modes", and as proof of concept simplifies typeStringToTypeName(). There's a minor functional improvement there, which is that we can now use the core parser's error cursor position, so instead of regression=# do $$ declare x int[23/] ; begin end $$; ERROR: syntax error at or near "/" LINE 1: do $$ declare x int[23/] ; begin end $$; ^ CONTEXT: invalid type name "int[23/] " you get regression=# do $$ declare x int[23/] ; begin end $$; ERROR: syntax error at or near "/" LINE 1: do $$ declare x int[23/] ; begin end $$; ^ CONTEXT: invalid type name "int[23/] " It's possible we could dispense with the error context callback in typeStringToTypeName altogether, but I've not experimented much. 0002 tackles the next problem, which is to make this feature accessible through SPI. There are a couple of possibly-controversial choices here. Following the principle that we should avoid changing documented SPI interfaces, we need a new version of SPI_prepare to pass RawParseMode through. This'll be the fourth one :-(, so I decided it was time to try to make a definition that can stay API-compatible through future changes. So it takes a struct of options, and I added a promise that zeroing the struct is enough to guarantee forward compatibility through future additions. This leaves both of the previous iterations, SPI_prepare_cursor and SPI_prepare_params, unused anywhere in the core code. I suppose we can't kill them (codesearch.debian.net knows of some external uses) but I propose to mark them deprecated, with an eye to at least removing their documentation someday. I did not want to add a RawParseMode parameter to pg_parse_query(), because that would have affected a larger number of unrelated modules, and it would not have been great from a header-inclusion footprint standpoint either. So I chose to pass down the mode from SPI by having it just call raw_parser() directly instead of going through pg_parse_query(). Perhaps this is a modularity violation, or perhaps there's somebody who really wants the extra tracing overhead in pg_parse_query() to apply to SPI queries. I'm open to discussing whether this should be done differently. (However, having made these two patches, I'm now wondering whether there is any rhyme or reason to the existing state of affairs with some callers going through pg_parse_query() while others use raw_parser() directly. It's hard to knock making a different choice in spi.c unless we have a coherent policy about which to use where.) Next, 0003 invents a raw parse mode for plpgsql expressions (which, in some contexts, can be pretty nearly whole SELECT statements), and uses that to get plpgsql out of the business of prefixing "SELECT " to user-written text. I would not have bothered with this as a standalone fix, but I think it does make for less-confusing error messages --- we've definitely had novices ask "where'd this SELECT come from?" in the past. (I cheated a bit on PERFORM, though. Unlike other places, it needs to allow UNION, so it can't use the same restricted syntax.) 0004 then reimplements plpgsql assignment. This is essentially the same patch I submitted before, but redesigned to work with the infrastructure from 0001-0003. 0005 adds documentation and test cases. It also fixes a couple of pre-existing problems that the plpgsql parser had with assigning to sub-fields of record fields, which I discovered while making the tests. Finally, 0006 removes plpgsql's ARRAYELEM datum type, on the grounds that we don't need it anymore. This might be a little controversial too, because there was still one way to reach the code: GET DIAGNOSTICS with an array element as target would do so. However, that seems like a pretty weird corner case. Reviewing the git history, I find that I added support for that in commit 55caaaeba; but a check of the associated discussion shows that there was no actual user request for that, I'd just done it because it was easy and seemed more symmetric. The amount of code involved here seems way more than is justified by that one case, so I think we should just take it out and lose the "feature". (I did think about whether GET DIAGNOSTICS could be reimplemented on top of the new infrastructure, but it wouldn't be easy because we don't have a SQL-expression representation of the GET DIAGNOSTICS values. Moreover, going in that direction would add an expression evaluation, making GET DIAGNOSTICS slower. So I think we should just drop it.) regards, tom lane diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 1fa9f19f08..9e7f1590b5 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -12095,7 +12095,7 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd, * parse_analyze() or the rewriter, but instead we need to pass them * through parse_utilcmd.c to make them ready for execution. */ - raw_parsetree_list = raw_parser(cmd); + raw_parsetree_list = raw_parser(cmd, RAW_PARSE_DEFAULT); querytree_list = NIL; foreach(list_item, raw_parsetree_list) { diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 8f341ac006..88c76dd985 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -723,6 +723,15 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); */ %token NOT_LA NULLS_LA WITH_LA +/* + * The grammar likewise thinks these tokens are keywords, but they are never + * generated by the scanner. Rather, they can be injected by parser.c as + * the initial token of the string (using the lookahead-token mechanism + * implemented there). This provides a way to tell the grammar to parse + * something other than the usual list of SQL commands. + */ +%token MODE_TYPE_NAME + /* Precedence: lowest to highest */ %nonassoc SET /* see relation_expr_opt_alias */ @@ -787,11 +796,18 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); /* * The target production for the whole parse. + * + * Ordinarily we parse a list of statements, but if we see one of the + * special MODE_XXX symbols as first token, we parse something else. */ stmtblock: stmtmulti { pg_yyget_extra(yyscanner)->parsetree = $1; } + | MODE_TYPE_NAME Typename + { + pg_yyget_extra(yyscanner)->parsetree = list_make1($2); + } ; /* diff --git a/src/backend/parser/parse_type.c b/src/backend/parser/parse_type.c index 2709f6f9c7..633774b733 100644 --- a/src/backend/parser/parse_type.c +++ b/src/backend/parser/parse_type.c @@ -719,13 +719,6 @@ pts_error_callback(void *arg) const char *str = (const char *) arg; errcontext("invalid type name \"%s\"", str); - - /* - * Currently we just suppress any syntax error position report, rather - * than transforming to an "internal query" error. It's unlikely that a - * type name is complex enough to need positioning. - */ - errposition(0); } /* @@ -737,11 +730,7 @@ pts_error_callback(void *arg) TypeName * typeStringToTypeName(const char *str) { - StringInfoData buf; List *raw_parsetree_list; - SelectStmt *stmt; - ResTarget *restarget; - TypeCast *typecast; TypeName *typeName; ErrorContextCallback ptserrcontext; @@ -749,9 +738,6 @@ typeStringToTypeName(const char *str) if (strspn(str, " \t\n\r\f") == strlen(str)) goto fail; - initStringInfo(&buf); - appendStringInfo(&buf, "SELECT NULL::%s", str); - /* * Setup error traceback support in case of ereport() during parse */ @@ -760,58 +746,18 @@ typeStringToTypeName(const char *str) ptserrcontext.previous = error_context_stack; error_context_stack = &ptserrcontext; - raw_parsetree_list = raw_parser(buf.data); + raw_parsetree_list = raw_parser(str, RAW_PARSE_TYPE_NAME); error_context_stack = ptserrcontext.previous; - /* - * Make sure we got back exactly what we expected and no more; paranoia is - * justified since the string might contain anything. - */ - if (list_length(raw_parsetree_list) != 1) - goto fail; - stmt = (SelectStmt *) linitial_node(RawStmt, raw_parsetree_list)->stmt; - if (stmt == NULL || - !IsA(stmt, SelectStmt) || - stmt->distinctClause != NIL || - stmt->intoClause != NULL || - stmt->fromClause != NIL || - stmt->whereClause != NULL || - stmt->groupClause != NIL || - stmt->havingClause != NULL || - stmt->windowClause != NIL || - stmt->valuesLists != NIL || - stmt->sortClause != NIL || - stmt->limitOffset != NULL || - stmt->limitCount != NULL || - stmt->lockingClause != NIL || - stmt->withClause != NULL || - stmt->op != SETOP_NONE) - goto fail; - if (list_length(stmt->targetList) != 1) - goto fail; - restarget = (ResTarget *) linitial(stmt->targetList); - if (restarget == NULL || - !IsA(restarget, ResTarget) || - restarget->name != NULL || - restarget->indirection != NIL) - goto fail; - typecast = (TypeCast *) restarget->val; - if (typecast == NULL || - !IsA(typecast, TypeCast) || - typecast->arg == NULL || - !IsA(typecast->arg, A_Const)) - goto fail; + /* We should get back exactly one TypeName node. */ + Assert(list_length(raw_parsetree_list) == 1); + typeName = linitial_node(TypeName, raw_parsetree_list); - typeName = typecast->typeName; - if (typeName == NULL || - !IsA(typeName, TypeName)) - goto fail; + /* The grammar allows SETOF in TypeName, but we don't want that here. */ if (typeName->setof) goto fail; - pfree(buf.data); - return typeName; fail: diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c index be86eb37fe..b04be3bca5 100644 --- a/src/backend/parser/parser.c +++ b/src/backend/parser/parser.c @@ -35,11 +35,11 @@ static char *str_udeescape(const char *str, char escape, * raw_parser * Given a query in string form, do lexical and grammatical analysis. * - * Returns a list of raw (un-analyzed) parse trees. The immediate elements - * of the list are always RawStmt nodes. + * Returns a list of raw (un-analyzed) parse trees. The contents of the + * list have the form required by the specified RawParseMode. */ List * -raw_parser(const char *str) +raw_parser(const char *str, RawParseMode mode) { core_yyscan_t yyscanner; base_yy_extra_type yyextra; @@ -49,8 +49,22 @@ raw_parser(const char *str) yyscanner = scanner_init(str, &yyextra.core_yy_extra, &ScanKeywords, ScanKeywordTokens); - /* base_yylex() only needs this much initialization */ - yyextra.have_lookahead = false; + /* base_yylex() only needs us to initialize the lookahead token, if any */ + if (mode == RAW_PARSE_DEFAULT) + yyextra.have_lookahead = false; + else + { + /* this array is indexed by RawParseMode enum */ + static const int mode_token[] = { + 0, /* RAW_PARSE_DEFAULT */ + MODE_TYPE_NAME /* RAW_PARSE_TYPE_NAME */ + }; + + yyextra.have_lookahead = true; + yyextra.lookahead_token = mode_token[mode]; + yyextra.lookahead_yylloc = 0; + yyextra.lookahead_end = NULL; + } /* initialize the bison parser */ parser_init(&yyextra); @@ -104,7 +118,8 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner) cur_token = yyextra->lookahead_token; lvalp->core_yystype = yyextra->lookahead_yylval; *llocp = yyextra->lookahead_yylloc; - *(yyextra->lookahead_end) = yyextra->lookahead_hold_char; + if (yyextra->lookahead_end) + *(yyextra->lookahead_end) = yyextra->lookahead_hold_char; yyextra->have_lookahead = false; } else diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 3679799e50..64c06f615b 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -634,7 +634,7 @@ pg_parse_query(const char *query_string) if (log_parser_stats) ResetUsage(); - raw_parsetree_list = raw_parser(query_string); + raw_parsetree_list = raw_parser(query_string, RAW_PARSE_DEFAULT); if (log_parser_stats) ShowUsage("PARSER STATISTICS"); diff --git a/src/include/parser/parser.h b/src/include/parser/parser.h index 3bdeeb8b0b..ebbcb788cc 100644 --- a/src/include/parser/parser.h +++ b/src/include/parser/parser.h @@ -18,6 +18,24 @@ #include "nodes/parsenodes.h" +/* + * RawParseMode determines the form of the string that raw_parser() accepts: + * + * RAW_PARSE_DEFAULT: parse a semicolon-separated list of SQL commands, + * and return a List of RawStmt nodes. + * + * RAW_PARSE_TYPE_NAME: parse a type name, and return a one-element List + * containing a TypeName node. + * + * ... more to come ... + */ +typedef enum +{ + RAW_PARSE_DEFAULT = 0, + RAW_PARSE_TYPE_NAME +} RawParseMode; + +/* Values for the backslash_quote GUC */ typedef enum { BACKSLASH_QUOTE_OFF, @@ -32,7 +50,7 @@ extern PGDLLIMPORT bool standard_conforming_strings; /* Primary entry point for the raw parsing functions */ -extern List *raw_parser(const char *str); +extern List *raw_parser(const char *str, RawParseMode mode); /* Utility functions exported by gram.y (perhaps these should be elsewhere) */ extern List *SystemFuncName(char *name); diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index 8227bf0449..26732d875e 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -3661,7 +3661,7 @@ check_sql_expr(const char *stmt, int location, int leaderlen) error_context_stack = &syntax_errcontext; oldCxt = MemoryContextSwitchTo(plpgsql_compile_tmp_cxt); - (void) raw_parser(stmt); + (void) raw_parser(stmt, RAW_PARSE_DEFAULT); MemoryContextSwitchTo(oldCxt); /* Restore former ereport callback */ diff --git a/doc/src/sgml/spi.sgml b/doc/src/sgml/spi.sgml index 7752de0a4d..f0dcba946e 100644 --- a/doc/src/sgml/spi.sgml +++ b/doc/src/sgml/spi.sgml @@ -1105,6 +1105,11 @@ SPIPlanPtr SPI_prepare_cursor(const char * <parameter>command</parameter>, int < for the <structfield>options</structfield> field of <structname>DeclareCursorStmt</structname>. <function>SPI_prepare</function> always takes the cursor options as zero. </para> + + <para> + This function is now deprecated in favor + of <function>SPI_prepare_extended</function>. + </para> </refsect1> <refsect1> @@ -1176,6 +1181,122 @@ SPIPlanPtr SPI_prepare_cursor(const char * <parameter>command</parameter>, int < <!-- *********************************************** --> +<refentry id="spi-spi-prepare-extended"> + <indexterm><primary>SPI_prepare_extended</primary></indexterm> + + <refmeta> + <refentrytitle>SPI_prepare_extended</refentrytitle> + <manvolnum>3</manvolnum> + </refmeta> + + <refnamediv> + <refname>SPI_prepare_extended</refname> + <refpurpose>prepare a statement, without executing it yet</refpurpose> + </refnamediv> + + <refsynopsisdiv> +<synopsis> +SPIPlanPtr SPI_prepare_extended(const char * <parameter>command</parameter>, + const SPIPrepareOptions * <parameter>options</parameter>) +</synopsis> + </refsynopsisdiv> + + <refsect1> + <title>Description</title> + + <para> + <function>SPI_prepare_extended</function> creates and returns a prepared + statement for the specified command, but doesn't execute the command. + This function is equivalent to <function>SPI_prepare</function>, + with the addition that the caller can specify options to control + the parsing of external parameter references, as well as other facets + of query parsing and planning. + </para> + </refsect1> + + <refsect1> + <title>Arguments</title> + + <variablelist> + <varlistentry> + <term><literal>const char * <parameter>command</parameter></literal></term> + <listitem> + <para> + command string + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>const SPIPrepareOptions * <parameter>options</parameter></literal></term> + <listitem> + <para> + struct containing optional arguments + </para> + </listitem> + </varlistentry> + </variablelist> + + <para> + Callers should always zero out the entire <parameter>options</parameter> + struct, then fill whichever fields they want to set. This ensures forward + compatibility of code, since any fields that are added to the struct in + future will be defined to behave backwards-compatibly if they are zero. + The currently available <parameter>options</parameter> fields are: + </para> + + <variablelist> + <varlistentry> + <term><literal>ParserSetupHook <parameter>parserSetup</parameter></literal></term> + <listitem> + <para> + Parser hook setup function + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>void * <parameter>parserSetupArg</parameter></literal></term> + <listitem> + <para> + pass-through argument for <parameter>parserSetup</parameter> + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>RawParseMode <parameter>parseMode</parameter></literal></term> + <listitem> + <para> + mode for raw parsing; <literal>RAW_PARSE_DEFAULT</literal> (zero) + produces default behavior + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>int <parameter>cursorOptions</parameter></literal></term> + <listitem> + <para> + integer bit mask of cursor options; zero produces default behavior + </para> + </listitem> + </varlistentry> + </variablelist> + </refsect1> + + <refsect1> + <title>Return Value</title> + + <para> + <function>SPI_prepare_extended</function> has the same return conventions as + <function>SPI_prepare</function>. + </para> + </refsect1> +</refentry> + +<!-- *********************************************** --> + <refentry id="spi-spi-prepare-params"> <indexterm><primary>SPI_prepare_params</primary></indexterm> @@ -1208,6 +1329,11 @@ SPIPlanPtr SPI_prepare_params(const char * <parameter>command</parameter>, with the addition that the caller can specify parser hook functions to control the parsing of external parameter references. </para> + + <para> + This function is now deprecated in favor + of <function>SPI_prepare_extended</function>. + </para> </refsect1> <refsect1> diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 055ebb77ae..3acf88e67c 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -508,6 +508,7 @@ SPI_execute(const char *src, bool read_only, long tcount) memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = CURSOR_OPT_PARALLEL_OK; _SPI_prepare_oneshot_plan(src, &plan); @@ -681,6 +682,7 @@ SPI_execute_with_args(const char *src, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = CURSOR_OPT_PARALLEL_OK; plan.nargs = nargs; plan.argtypes = argtypes; @@ -726,6 +728,7 @@ SPI_execute_with_receiver(const char *src, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = CURSOR_OPT_PARALLEL_OK; if (params) { @@ -768,6 +771,7 @@ SPI_prepare_cursor(const char *src, int nargs, Oid *argtypes, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = cursorOptions; plan.nargs = nargs; plan.argtypes = argtypes; @@ -784,6 +788,42 @@ SPI_prepare_cursor(const char *src, int nargs, Oid *argtypes, return result; } +SPIPlanPtr +SPI_prepare_extended(const char *src, + const SPIPrepareOptions *options) +{ + _SPI_plan plan; + SPIPlanPtr result; + + if (src == NULL || options == NULL) + { + SPI_result = SPI_ERROR_ARGUMENT; + return NULL; + } + + SPI_result = _SPI_begin_call(true); + if (SPI_result < 0) + return NULL; + + memset(&plan, 0, sizeof(_SPI_plan)); + plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = options->parseMode; + plan.cursor_options = options->cursorOptions; + plan.nargs = 0; + plan.argtypes = NULL; + plan.parserSetup = options->parserSetup; + plan.parserSetupArg = options->parserSetupArg; + + _SPI_prepare_plan(src, &plan); + + /* copy plan to procedure context */ + result = _SPI_make_plan_non_temp(&plan); + + _SPI_end_call(true); + + return result; +} + SPIPlanPtr SPI_prepare_params(const char *src, ParserSetupHook parserSetup, @@ -805,6 +845,7 @@ SPI_prepare_params(const char *src, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = cursorOptions; plan.nargs = 0; plan.argtypes = NULL; @@ -1340,6 +1381,7 @@ SPI_cursor_open_with_args(const char *name, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = cursorOptions; plan.nargs = nargs; plan.argtypes = argtypes; @@ -1400,6 +1442,7 @@ SPI_cursor_parse_open_with_paramlist(const char *name, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = cursorOptions; if (params) { @@ -2036,7 +2079,8 @@ spi_printtup(TupleTableSlot *slot, DestReceiver *self) * Parse and analyze a querystring. * * At entry, plan->argtypes and plan->nargs (or alternatively plan->parserSetup - * and plan->parserSetupArg) must be valid, as must plan->cursor_options. + * and plan->parserSetupArg) must be valid, as must plan->parse_mode and + * plan->cursor_options. * * Results are stored into *plan (specifically, plan->plancache_list). * Note that the result data is all in CurrentMemoryContext or child contexts @@ -2063,7 +2107,7 @@ _SPI_prepare_plan(const char *src, SPIPlanPtr plan) /* * Parse the request string into a list of raw parse trees. */ - raw_parsetree_list = pg_parse_query(src); + raw_parsetree_list = raw_parser(src, plan->parse_mode); /* * Do parse analysis and rule rewrite for each raw parsetree, storing the @@ -2168,7 +2212,7 @@ _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan) /* * Parse the request string into a list of raw parse trees. */ - raw_parsetree_list = pg_parse_query(src); + raw_parsetree_list = raw_parser(src, plan->parse_mode); /* * Construct plancache entries, but don't do parse analysis yet. @@ -2866,6 +2910,7 @@ _SPI_make_plan_non_temp(SPIPlanPtr plan) newplan = (SPIPlanPtr) palloc0(sizeof(_SPI_plan)); newplan->magic = _SPI_PLAN_MAGIC; newplan->plancxt = plancxt; + newplan->parse_mode = plan->parse_mode; newplan->cursor_options = plan->cursor_options; newplan->nargs = plan->nargs; if (plan->nargs > 0) @@ -2930,6 +2975,7 @@ _SPI_save_plan(SPIPlanPtr plan) newplan = (SPIPlanPtr) palloc0(sizeof(_SPI_plan)); newplan->magic = _SPI_PLAN_MAGIC; newplan->plancxt = plancxt; + newplan->parse_mode = plan->parse_mode; newplan->cursor_options = plan->cursor_options; newplan->nargs = plan->nargs; if (plan->nargs > 0) diff --git a/src/include/executor/spi.h b/src/include/executor/spi.h index 896ec0a2ad..8ff4236c8f 100644 --- a/src/include/executor/spi.h +++ b/src/include/executor/spi.h @@ -15,7 +15,7 @@ #include "commands/trigger.h" #include "lib/ilist.h" -#include "nodes/parsenodes.h" +#include "parser/parser.h" #include "utils/portal.h" @@ -33,6 +33,15 @@ typedef struct SPITupleTable SubTransactionId subid; /* subxact in which tuptable was created */ } SPITupleTable; +/* Optional arguments for SPI_prepare_extended */ +typedef struct SPIPrepareOptions +{ + ParserSetupHook parserSetup; + void *parserSetupArg; + RawParseMode parseMode; + int cursorOptions; +} SPIPrepareOptions; + /* Plans are opaque structs for standard users of SPI */ typedef struct _SPI_plan *SPIPlanPtr; @@ -113,6 +122,8 @@ extern int SPI_execute_with_receiver(const char *src, extern SPIPlanPtr SPI_prepare(const char *src, int nargs, Oid *argtypes); extern SPIPlanPtr SPI_prepare_cursor(const char *src, int nargs, Oid *argtypes, int cursorOptions); +extern SPIPlanPtr SPI_prepare_extended(const char *src, + const SPIPrepareOptions *options); extern SPIPlanPtr SPI_prepare_params(const char *src, ParserSetupHook parserSetup, void *parserSetupArg, diff --git a/src/include/executor/spi_priv.h b/src/include/executor/spi_priv.h index 6220928bd3..9c99aeb84e 100644 --- a/src/include/executor/spi_priv.h +++ b/src/include/executor/spi_priv.h @@ -95,6 +95,7 @@ typedef struct _SPI_plan bool no_snapshots; /* let the caller handle the snapshots */ List *plancache_list; /* one CachedPlanSource per parsetree */ MemoryContext plancxt; /* Context containing _SPI_plan and data */ + RawParseMode parse_mode; /* raw_parser() mode */ int cursor_options; /* Cursor options used for planning */ int nargs; /* number of plan arguments */ Oid *argtypes; /* Argument types (NULL if nargs is 0) */ diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index ccbc50fc45..86f5e9fd24 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -4170,6 +4170,7 @@ exec_prepare_plan(PLpgSQL_execstate *estate, bool keepplan) { SPIPlanPtr plan; + SPIPrepareOptions options; /* * The grammar can't conveniently set expr->func while building the parse @@ -4180,12 +4181,14 @@ exec_prepare_plan(PLpgSQL_execstate *estate, /* * Generate and save the plan */ - plan = SPI_prepare_params(expr->query, - (ParserSetupHook) plpgsql_parser_setup, - (void *) expr, - cursorOptions); + memset(&options, 0, sizeof(options)); + options.parserSetup = (ParserSetupHook) plpgsql_parser_setup; + options.parserSetupArg = (void *) expr; + options.parseMode = RAW_PARSE_DEFAULT; + options.cursorOptions = cursorOptions; + plan = SPI_prepare_extended(expr->query, &options); if (plan == NULL) - elog(ERROR, "SPI_prepare_params failed for \"%s\": %s", + elog(ERROR, "SPI_prepare_extended failed for \"%s\": %s", expr->query, SPI_result_code_string(SPI_result)); if (keepplan) SPI_keepplan(plan); diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 3acf88e67c..f529707458 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -51,6 +51,12 @@ static _SPI_connection *_SPI_current = NULL; static int _SPI_stack_depth = 0; /* allocated size of _SPI_stack */ static int _SPI_connected = -1; /* current stack index */ +typedef struct SPICallbackArg +{ + const char *query; + RawParseMode mode; +} SPICallbackArg; + static Portal SPI_cursor_open_internal(const char *name, SPIPlanPtr plan, ParamListInfo paramLI, bool read_only); @@ -1479,6 +1485,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan, Snapshot snapshot; MemoryContext oldcontext; Portal portal; + SPICallbackArg spicallbackarg; ErrorContextCallback spierrcontext; /* @@ -1533,8 +1540,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan, * Setup error traceback support for ereport(), in case GetCachedPlan * throws an error. */ + spicallbackarg.query = plansource->query_string; + spicallbackarg.mode = plan->parse_mode; spierrcontext.callback = _SPI_error_callback; - spierrcontext.arg = unconstify(char *, plansource->query_string); + spierrcontext.arg = &spicallbackarg; spierrcontext.previous = error_context_stack; error_context_stack = &spierrcontext; @@ -1952,6 +1961,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan) { CachedPlanSource *plansource; CachedPlan *cplan; + SPICallbackArg spicallbackarg; ErrorContextCallback spierrcontext; Assert(plan->magic == _SPI_PLAN_MAGIC); @@ -1966,8 +1976,10 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan) plansource = (CachedPlanSource *) linitial(plan->plancache_list); /* Setup error traceback support for ereport() */ + spicallbackarg.query = plansource->query_string; + spicallbackarg.mode = plan->parse_mode; spierrcontext.callback = _SPI_error_callback; - spierrcontext.arg = unconstify(char *, plansource->query_string); + spierrcontext.arg = &spicallbackarg; spierrcontext.previous = error_context_stack; error_context_stack = &spierrcontext; @@ -2094,13 +2106,16 @@ _SPI_prepare_plan(const char *src, SPIPlanPtr plan) List *raw_parsetree_list; List *plancache_list; ListCell *list_item; + SPICallbackArg spicallbackarg; ErrorContextCallback spierrcontext; /* * Setup error traceback support for ereport() */ + spicallbackarg.query = src; + spicallbackarg.mode = plan->parse_mode; spierrcontext.callback = _SPI_error_callback; - spierrcontext.arg = unconstify(char *, src); + spierrcontext.arg = &spicallbackarg; spierrcontext.previous = error_context_stack; error_context_stack = &spierrcontext; @@ -2199,13 +2214,16 @@ _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan) List *raw_parsetree_list; List *plancache_list; ListCell *list_item; + SPICallbackArg spicallbackarg; ErrorContextCallback spierrcontext; /* * Setup error traceback support for ereport() */ + spicallbackarg.query = src; + spicallbackarg.mode = plan->parse_mode; spierrcontext.callback = _SPI_error_callback; - spierrcontext.arg = unconstify(char *, src); + spierrcontext.arg = &spicallbackarg; spierrcontext.previous = error_context_stack; error_context_stack = &spierrcontext; @@ -2263,6 +2281,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI, SPITupleTable *my_tuptable = NULL; int res = 0; bool pushed_active_snap = false; + SPICallbackArg spicallbackarg; ErrorContextCallback spierrcontext; CachedPlan *cplan = NULL; ListCell *lc1; @@ -2270,8 +2289,10 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI, /* * Setup error traceback support for ereport() */ + spicallbackarg.query = NULL; /* we'll fill this below */ + spicallbackarg.mode = plan->parse_mode; spierrcontext.callback = _SPI_error_callback; - spierrcontext.arg = NULL; /* we'll fill this below */ + spierrcontext.arg = &spicallbackarg; spierrcontext.previous = error_context_stack; error_context_stack = &spierrcontext; @@ -2318,7 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI, List *stmt_list; ListCell *lc2; - spierrcontext.arg = unconstify(char *, plansource->query_string); + spicallbackarg.query = plansource->query_string; /* * If this is a one-shot plan, we still need to do parse analysis. @@ -2722,7 +2743,8 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount) static void _SPI_error_callback(void *arg) { - const char *query = (const char *) arg; + SPICallbackArg *carg = (SPICallbackArg *) arg; + const char *query = carg->query; int syntaxerrposition; if (query == NULL) /* in case arg wasn't set yet */ @@ -2740,7 +2762,18 @@ _SPI_error_callback(void *arg) internalerrquery(query); } else - errcontext("SQL statement \"%s\"", query); + { + /* Use the parse mode to decide how to describe the query */ + switch (carg->mode) + { + case RAW_PARSE_PLPGSQL_EXPR: + errcontext("SQL expression \"%s\"", query); + break; + default: + errcontext("SQL statement \"%s\"", query); + break; + } + } } /* diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 88c76dd985..4c58b46651 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -294,6 +294,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type <node> select_no_parens select_with_parens select_clause simple_select values_clause + PLpgSQL_Expr %type <node> alter_column_default opclass_item opclass_drop alter_using %type <ival> add_drop opt_asc_desc opt_nulls_order @@ -731,6 +732,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); * something other than the usual list of SQL commands. */ %token MODE_TYPE_NAME +%token MODE_PLPGSQL_EXPR /* Precedence: lowest to highest */ @@ -808,6 +810,11 @@ stmtblock: stmtmulti { pg_yyget_extra(yyscanner)->parsetree = list_make1($2); } + | MODE_PLPGSQL_EXPR PLpgSQL_Expr + { + pg_yyget_extra(yyscanner)->parsetree = + list_make1(makeRawStmt($2, 0)); + } ; /* @@ -15022,6 +15029,47 @@ role_list: RoleSpec { $$ = lappend($1, $3); } ; + +/***************************************************************************** + * + * PL/pgSQL extensions + * + * You'd think a PL/pgSQL "expression" should be just an a_expr, but + * historically it can include just about anything that can follow SELECT. + * Therefore the returned struct is a SelectStmt. + *****************************************************************************/ + +PLpgSQL_Expr: opt_target_list + from_clause where_clause + group_clause having_clause window_clause + opt_sort_clause opt_select_limit opt_for_locking_clause + { + SelectStmt *n = makeNode(SelectStmt); + + n->targetList = $1; + n->fromClause = $2; + n->whereClause = $3; + n->groupClause = $4; + n->havingClause = $5; + n->windowClause = $6; + n->sortClause = $7; + if ($8) + { + n->limitOffset = $8->limitOffset; + n->limitCount = $8->limitCount; + if (!n->sortClause && + $8->limitOption == LIMIT_OPTION_WITH_TIES) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("WITH TIES cannot be specified without ORDER BY clause"))); + n->limitOption = $8->limitOption; + } + n->lockingClause = $9; + $$ = (Node *) n; + } + ; + + /* * Name classification hierarchy. * diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c index b04be3bca5..71df8ef022 100644 --- a/src/backend/parser/parser.c +++ b/src/backend/parser/parser.c @@ -57,7 +57,8 @@ raw_parser(const char *str, RawParseMode mode) /* this array is indexed by RawParseMode enum */ static const int mode_token[] = { 0, /* RAW_PARSE_DEFAULT */ - MODE_TYPE_NAME /* RAW_PARSE_TYPE_NAME */ + MODE_TYPE_NAME, /* RAW_PARSE_TYPE_NAME */ + MODE_PLPGSQL_EXPR /* RAW_PARSE_PLPGSQL_EXPR */ }; yyextra.have_lookahead = true; diff --git a/src/include/parser/parser.h b/src/include/parser/parser.h index ebbcb788cc..3b7dab17ef 100644 --- a/src/include/parser/parser.h +++ b/src/include/parser/parser.h @@ -27,12 +27,14 @@ * RAW_PARSE_TYPE_NAME: parse a type name, and return a one-element List * containing a TypeName node. * - * ... more to come ... + * RAW_PARSE_PLPGSQL_EXPR: parse a PL/pgSQL expression, and return + * a one-element List containing a RawStmt node. */ typedef enum { RAW_PARSE_DEFAULT = 0, - RAW_PARSE_TYPE_NAME + RAW_PARSE_TYPE_NAME, + RAW_PARSE_PLPGSQL_EXPR } RawParseMode; /* Values for the backslash_quote GUC */ diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl index 52ba7dfa0c..2b254c2b77 100644 --- a/src/interfaces/ecpg/preproc/parse.pl +++ b/src/interfaces/ecpg/preproc/parse.pl @@ -70,7 +70,8 @@ my %replace_types = ( 'ColId' => 'ignore', 'type_function_name' => 'ignore', 'ColLabel' => 'ignore', - 'Sconst' => 'ignore',); + 'Sconst' => 'ignore', + 'PLpgSQL_Expr' => 'ignore',); # these replace_line commands excise certain keywords from the core keyword # lists. Be sure to account for these in ColLabel and related productions. diff --git a/src/pl/plpgsql/src/expected/plpgsql_record.out b/src/pl/plpgsql/src/expected/plpgsql_record.out index cf6089cbb2..52207e9b10 100644 --- a/src/pl/plpgsql/src/expected/plpgsql_record.out +++ b/src/pl/plpgsql/src/expected/plpgsql_record.out @@ -188,7 +188,7 @@ NOTICE: r1.q1 = <NULL> NOTICE: r1.q2 = <NULL> NOTICE: r1 = <NULL> ERROR: record "r1" has no field "nosuchfield" -CONTEXT: SQL statement "SELECT r1.nosuchfield" +CONTEXT: SQL expression "r1.nosuchfield" PL/pgSQL function inline_code_block line 7 at RAISE -- records, not so much do $$ @@ -202,7 +202,7 @@ end$$; NOTICE: r1 = <NULL> ERROR: record "r1" is not assigned yet DETAIL: The tuple structure of a not-yet-assigned record is indeterminate. -CONTEXT: SQL statement "SELECT r1.f1" +CONTEXT: SQL expression "r1.f1" PL/pgSQL function inline_code_block line 5 at RAISE -- but OK if you assign first do $$ @@ -220,7 +220,7 @@ NOTICE: r1.f1 = 1 NOTICE: r1.f2 = 2 NOTICE: r1 = (1,2) ERROR: record "r1" has no field "nosuchfield" -CONTEXT: SQL statement "SELECT r1.nosuchfield" +CONTEXT: SQL expression "r1.nosuchfield" PL/pgSQL function inline_code_block line 9 at RAISE -- check repeated assignments to composite fields create table some_table (id int, data text); @@ -431,7 +431,7 @@ create function getf3(x mutable) returns int language plpgsql as $$ begin return x.f3; end $$; select getf3(null::mutable); -- doesn't work yet ERROR: record "x" has no field "f3" -CONTEXT: SQL statement "SELECT x.f3" +CONTEXT: SQL expression "x.f3" PL/pgSQL function getf3(mutable) line 1 at RETURN alter table mutable add column f3 int; select getf3(null::mutable); -- now it works diff --git a/src/pl/plpgsql/src/expected/plpgsql_varprops.out b/src/pl/plpgsql/src/expected/plpgsql_varprops.out index 18f03d75b4..3801dccc95 100644 --- a/src/pl/plpgsql/src/expected/plpgsql_varprops.out +++ b/src/pl/plpgsql/src/expected/plpgsql_varprops.out @@ -76,7 +76,7 @@ begin raise notice 'x = %', x; end$$; ERROR: division by zero -CONTEXT: SQL statement "SELECT 1/0" +CONTEXT: SQL expression "1/0" PL/pgSQL function inline_code_block line 3 during statement block local variable initialization do $$ declare x bigint[] := array[1,3,5]; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index 86f5e9fd24..f56dcd0e79 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -4184,7 +4184,7 @@ exec_prepare_plan(PLpgSQL_execstate *estate, memset(&options, 0, sizeof(options)); options.parserSetup = (ParserSetupHook) plpgsql_parser_setup; options.parserSetupArg = (void *) expr; - options.parseMode = RAW_PARSE_DEFAULT; + options.parseMode = expr->parseMode; options.cursorOptions = cursorOptions; plan = SPI_prepare_extended(expr->query, &options); if (plan == NULL) diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index 26732d875e..3b36220d73 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -51,7 +51,6 @@ typedef struct { int location; - int leaderlen; } sql_error_callback_arg; #define parser_errposition(pos) plpgsql_scanner_errposition(pos) @@ -67,7 +66,7 @@ static PLpgSQL_expr *read_sql_construct(int until, int until2, int until3, const char *expected, - const char *sqlstart, + RawParseMode parsemode, bool isexpression, bool valid_sql, bool trim, @@ -78,7 +77,7 @@ static PLpgSQL_expr *read_sql_expression(int until, static PLpgSQL_expr *read_sql_expression2(int until, int until2, const char *expected, int *endtoken); -static PLpgSQL_expr *read_sql_stmt(const char *sqlstart); +static PLpgSQL_expr *read_sql_stmt(void); static PLpgSQL_type *read_datatype(int tok); static PLpgSQL_stmt *make_execsql_stmt(int firsttoken, int location); static PLpgSQL_stmt_fetch *read_fetch_direction(void); @@ -99,8 +98,8 @@ static PLpgSQL_row *read_into_scalar_list(char *initial_name, static PLpgSQL_row *make_scalar_list1(char *initial_name, PLpgSQL_datum *initial_datum, int lineno, int location); -static void check_sql_expr(const char *stmt, int location, - int leaderlen); +static void check_sql_expr(const char *stmt, + RawParseMode parseMode, int location); static void plpgsql_sql_error_callback(void *arg); static PLpgSQL_type *parse_datatype(const char *string, int location); static void check_labels(const char *start_label, @@ -540,7 +539,7 @@ decl_statement : decl_varname decl_const decl_datatype decl_collate decl_notnull { PLpgSQL_var *new; PLpgSQL_expr *curname_def; - char buf[1024]; + char buf[NAMEDATALEN * 2 + 64]; char *cp1; char *cp2; @@ -557,9 +556,9 @@ decl_statement : decl_varname decl_const decl_datatype decl_collate decl_notnull curname_def = palloc0(sizeof(PLpgSQL_expr)); - strcpy(buf, "SELECT "); + /* Note: refname has been truncated to NAMEDATALEN */ cp1 = new->refname; - cp2 = buf + strlen(buf); + cp2 = buf; /* * Don't trust standard_conforming_strings here; * it might change before we use the string. @@ -575,6 +574,7 @@ decl_statement : decl_varname decl_const decl_datatype decl_collate decl_notnull } strcpy(cp2, "'::pg_catalog.refcursor"); curname_def->query = pstrdup(buf); + curname_def->parseMode = RAW_PARSE_PLPGSQL_EXPR; new->default_val = curname_def; new->cursor_explicit_expr = $7; @@ -602,7 +602,7 @@ opt_scrollable : decl_cursor_query : { - $$ = read_sql_stmt(""); + $$ = read_sql_stmt(); } ; @@ -904,15 +904,37 @@ proc_stmt : pl_block ';' { $$ = $1; } ; -stmt_perform : K_PERFORM expr_until_semi +stmt_perform : K_PERFORM { PLpgSQL_stmt_perform *new; + int startloc; new = palloc0(sizeof(PLpgSQL_stmt_perform)); new->cmd_type = PLPGSQL_STMT_PERFORM; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - new->expr = $2; + plpgsql_push_back_token(K_PERFORM); + + /* + * Since PERFORM isn't legal SQL, we have to cheat to + * the extent of substituting "SELECT" for "PERFORM" + * in the parsed text. It does not seem worth + * inventing a separate parse mode for this one case. + * We can't do syntax-checking until after we make the + * substitution. + */ + new->expr = read_sql_construct(';', 0, 0, ";", + RAW_PARSE_DEFAULT, + false, false, true, + &startloc, NULL); + /* overwrite "perform" ... */ + memcpy(new->expr->query, " SELECT", 7); + /* left-justify to get rid of the leading space */ + memmove(new->expr->query, new->expr->query + 1, + strlen(new->expr->query)); + /* offset syntax error position to account for that */ + check_sql_expr(new->expr->query, new->expr->parseMode, + startloc + 1); $$ = (PLpgSQL_stmt *)new; } @@ -926,7 +948,8 @@ stmt_call : K_CALL new->cmd_type = PLPGSQL_STMT_CALL; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - new->expr = read_sql_stmt("CALL "); + plpgsql_push_back_token(K_CALL); + new->expr = read_sql_stmt(); new->is_call = true; $$ = (PLpgSQL_stmt *)new; @@ -941,7 +964,8 @@ stmt_call : K_CALL new->cmd_type = PLPGSQL_STMT_CALL; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - new->expr = read_sql_stmt("DO "); + plpgsql_push_back_token(K_DO); + new->expr = read_sql_stmt(); new->is_call = false; $$ = (PLpgSQL_stmt *)new; @@ -1452,16 +1476,16 @@ for_control : for_variable K_IN /* * Read tokens until we see either a ".." - * or a LOOP. The text we read may not - * necessarily be a well-formed SQL - * statement, so we need to invoke - * read_sql_construct directly. + * or a LOOP. The text we read may be either + * an expression or a whole SQL statement, so + * we need to invoke read_sql_construct directly, + * and tell it not to check syntax yet. */ expr1 = read_sql_construct(DOT_DOT, K_LOOP, 0, "LOOP", - "SELECT ", + RAW_PARSE_DEFAULT, true, false, true, @@ -1476,8 +1500,13 @@ for_control : for_variable K_IN PLpgSQL_var *fvar; PLpgSQL_stmt_fori *new; - /* Check first expression is well-formed */ - check_sql_expr(expr1->query, expr1loc, 7); + /* + * Relabel first expression as an expression; + * then we can check its syntax. + */ + expr1->parseMode = RAW_PARSE_PLPGSQL_EXPR; + check_sql_expr(expr1->query, expr1->parseMode, + expr1loc); /* Read and check the second one */ expr2 = read_sql_expression2(K_LOOP, K_BY, @@ -1522,12 +1551,8 @@ for_control : for_variable K_IN else { /* - * No "..", so it must be a query loop. We've - * prefixed an extra SELECT to the query text, - * so we need to remove that before performing - * syntax checking. + * No "..", so it must be a query loop. */ - char *tmp_query; PLpgSQL_stmt_fors *new; if (reverse) @@ -1536,12 +1561,9 @@ for_control : for_variable K_IN errmsg("cannot specify REVERSE in query FOR loop"), parser_errposition(tokloc))); - Assert(strncmp(expr1->query, "SELECT ", 7) == 0); - tmp_query = pstrdup(expr1->query + 7); - pfree(expr1->query); - expr1->query = tmp_query; - - check_sql_expr(expr1->query, expr1loc, 0); + /* Check syntax as a regular query */ + check_sql_expr(expr1->query, expr1->parseMode, + expr1loc); new = palloc0(sizeof(PLpgSQL_stmt_fors)); new->cmd_type = PLPGSQL_STMT_FORS; @@ -1870,7 +1892,7 @@ stmt_raise : K_RAISE expr = read_sql_construct(',', ';', K_USING, ", or ; or USING", - "SELECT ", + RAW_PARSE_PLPGSQL_EXPR, true, true, true, NULL, &tok); new->params = lappend(new->params, expr); @@ -2001,7 +2023,7 @@ stmt_dynexecute : K_EXECUTE expr = read_sql_construct(K_INTO, K_USING, ';', "INTO or USING or ;", - "SELECT ", + RAW_PARSE_PLPGSQL_EXPR, true, true, true, NULL, &endtoken); @@ -2040,7 +2062,7 @@ stmt_dynexecute : K_EXECUTE { expr = read_sql_construct(',', ';', K_INTO, ", or ; or INTO", - "SELECT ", + RAW_PARSE_PLPGSQL_EXPR, true, true, true, NULL, &endtoken); new->params = lappend(new->params, expr); @@ -2122,7 +2144,7 @@ stmt_open : K_OPEN cursor_variable else { plpgsql_push_back_token(tok); - new->query = read_sql_stmt(""); + new->query = read_sql_stmt(); } } else @@ -2246,8 +2268,8 @@ stmt_set : K_SET new->cmd_type = PLPGSQL_STMT_SET; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - - new->expr = read_sql_stmt("SET "); + plpgsql_push_back_token(K_SET); + new->expr = read_sql_stmt(); $$ = (PLpgSQL_stmt *)new; } @@ -2259,7 +2281,8 @@ stmt_set : K_SET new->cmd_type = PLPGSQL_STMT_SET; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - new->expr = read_sql_stmt("RESET "); + plpgsql_push_back_token(K_RESET); + new->expr = read_sql_stmt(); $$ = (PLpgSQL_stmt *)new; } @@ -2656,7 +2679,8 @@ static PLpgSQL_expr * read_sql_expression(int until, const char *expected) { return read_sql_construct(until, 0, 0, expected, - "SELECT ", true, true, true, NULL, NULL); + RAW_PARSE_PLPGSQL_EXPR, + true, true, true, NULL, NULL); } /* Convenience routine to read an expression with two possible terminators */ @@ -2665,15 +2689,17 @@ read_sql_expression2(int until, int until2, const char *expected, int *endtoken) { return read_sql_construct(until, until2, 0, expected, - "SELECT ", true, true, true, NULL, endtoken); + RAW_PARSE_PLPGSQL_EXPR, + true, true, true, NULL, endtoken); } /* Convenience routine to read a SQL statement that must end with ';' */ static PLpgSQL_expr * -read_sql_stmt(const char *sqlstart) +read_sql_stmt(void) { return read_sql_construct(';', 0, 0, ";", - sqlstart, false, true, true, NULL, NULL); + RAW_PARSE_DEFAULT, + false, true, true, NULL, NULL); } /* @@ -2683,9 +2709,9 @@ read_sql_stmt(const char *sqlstart) * until2: token code for alternate terminator (pass 0 if none) * until3: token code for another alternate terminator (pass 0 if none) * expected: text to use in complaining that terminator was not found - * sqlstart: text to prefix to the accumulated SQL text + * parsemode: raw_parser() mode to use * isexpression: whether to say we're reading an "expression" or a "statement" - * valid_sql: whether to check the syntax of the expr (prefixed with sqlstart) + * valid_sql: whether to check the syntax of the expr * trim: trim trailing whitespace * startloc: if not NULL, location of first token is stored at *startloc * endtoken: if not NULL, ending token is stored at *endtoken @@ -2696,7 +2722,7 @@ read_sql_construct(int until, int until2, int until3, const char *expected, - const char *sqlstart, + RawParseMode parsemode, bool isexpression, bool valid_sql, bool trim, @@ -2711,7 +2737,6 @@ read_sql_construct(int until, PLpgSQL_expr *expr; initStringInfo(&ds); - appendStringInfoString(&ds, sqlstart); /* special lookup mode for identifiers within the SQL text */ save_IdentifierLookup = plpgsql_IdentifierLookup; @@ -2787,6 +2812,7 @@ read_sql_construct(int until, expr = palloc0(sizeof(PLpgSQL_expr)); expr->query = pstrdup(ds.data); + expr->parseMode = parsemode; expr->plan = NULL; expr->paramnos = NULL; expr->rwparam = -1; @@ -2794,7 +2820,7 @@ read_sql_construct(int until, pfree(ds.data); if (valid_sql) - check_sql_expr(expr->query, startlocation, strlen(sqlstart)); + check_sql_expr(expr->query, expr->parseMode, startlocation); return expr; } @@ -3033,13 +3059,14 @@ make_execsql_stmt(int firsttoken, int location) expr = palloc0(sizeof(PLpgSQL_expr)); expr->query = pstrdup(ds.data); + expr->parseMode = RAW_PARSE_DEFAULT; expr->plan = NULL; expr->paramnos = NULL; expr->rwparam = -1; expr->ns = plpgsql_ns_top(); pfree(ds.data); - check_sql_expr(expr->query, location, 0); + check_sql_expr(expr->query, expr->parseMode, location); execsql = palloc(sizeof(PLpgSQL_stmt_execsql)); execsql->cmd_type = PLPGSQL_STMT_EXECSQL; @@ -3382,7 +3409,7 @@ make_return_query_stmt(int location) { /* ordinary static query */ plpgsql_push_back_token(tok); - new->query = read_sql_stmt(""); + new->query = read_sql_stmt(); } else { @@ -3637,13 +3664,12 @@ make_scalar_list1(char *initial_name, * borders. So it is best to bail out as early as we can. * * It is assumed that "stmt" represents a copy of the function source text - * beginning at offset "location", with leader text of length "leaderlen" - * (typically "SELECT ") prefixed to the source text. We use this assumption - * to transpose any error cursor position back to the function source text. + * beginning at offset "location". We use this assumption to transpose + * any error cursor position back to the function source text. * If no error cursor is provided, we'll just point at "location". */ static void -check_sql_expr(const char *stmt, int location, int leaderlen) +check_sql_expr(const char *stmt, RawParseMode parseMode, int location) { sql_error_callback_arg cbarg; ErrorContextCallback syntax_errcontext; @@ -3653,7 +3679,6 @@ check_sql_expr(const char *stmt, int location, int leaderlen) return; cbarg.location = location; - cbarg.leaderlen = leaderlen; syntax_errcontext.callback = plpgsql_sql_error_callback; syntax_errcontext.arg = &cbarg; @@ -3661,7 +3686,7 @@ check_sql_expr(const char *stmt, int location, int leaderlen) error_context_stack = &syntax_errcontext; oldCxt = MemoryContextSwitchTo(plpgsql_compile_tmp_cxt); - (void) raw_parser(stmt, RAW_PARSE_DEFAULT); + (void) raw_parser(stmt, parseMode); MemoryContextSwitchTo(oldCxt); /* Restore former ereport callback */ @@ -3686,12 +3711,12 @@ plpgsql_sql_error_callback(void *arg) * Note we are dealing with 1-based character numbers at this point. */ errpos = geterrposition(); - if (errpos > cbarg->leaderlen) + if (errpos > 0) { int myerrpos = getinternalerrposition(); if (myerrpos > 0) /* safety check */ - internalerrposition(myerrpos + errpos - cbarg->leaderlen - 1); + internalerrposition(myerrpos + errpos - 1); } /* In any case, flush errposition --- we want internalerrposition only */ @@ -3717,7 +3742,6 @@ parse_datatype(const char *string, int location) ErrorContextCallback syntax_errcontext; cbarg.location = location; - cbarg.leaderlen = 0; syntax_errcontext.callback = plpgsql_sql_error_callback; syntax_errcontext.arg = &cbarg; @@ -3780,7 +3804,6 @@ read_cursor_args(PLpgSQL_var *cursor, int until) int argc; char **argv; StringInfoData ds; - char *sqlstart = "SELECT "; bool any_named = false; tok = yylex(); @@ -3881,12 +3904,12 @@ read_cursor_args(PLpgSQL_var *cursor, int until) */ item = read_sql_construct(',', ')', 0, ",\" or \")", - sqlstart, + RAW_PARSE_PLPGSQL_EXPR, true, true, false, /* do not trim */ NULL, &endtoken); - argv[argpos] = item->query + strlen(sqlstart); + argv[argpos] = item->query; if (endtoken == ')' && !(argc == row->nfields - 1)) ereport(ERROR, @@ -3905,7 +3928,6 @@ read_cursor_args(PLpgSQL_var *cursor, int until) /* Make positional argument list */ initStringInfo(&ds); - appendStringInfoString(&ds, sqlstart); for (argc = 0; argc < row->nfields; argc++) { Assert(argv[argc] != NULL); @@ -3921,10 +3943,10 @@ read_cursor_args(PLpgSQL_var *cursor, int until) if (argc < row->nfields - 1) appendStringInfoString(&ds, ", "); } - appendStringInfoChar(&ds, ';'); expr = palloc0(sizeof(PLpgSQL_expr)); expr->query = pstrdup(ds.data); + expr->parseMode = RAW_PARSE_PLPGSQL_EXPR; expr->plan = NULL; expr->paramnos = NULL; expr->rwparam = -1; @@ -4097,14 +4119,14 @@ make_case(int location, PLpgSQL_expr *t_expr, PLpgSQL_expr *expr = cwt->expr; StringInfoData ds; - /* copy expression query without SELECT keyword (expr->query + 7) */ - Assert(strncmp(expr->query, "SELECT ", 7) == 0); + /* We expect to have expressions not statements */ + Assert(expr->parseMode == RAW_PARSE_PLPGSQL_EXPR); - /* And do the string hacking */ + /* Do the string hacking */ initStringInfo(&ds); - appendStringInfo(&ds, "SELECT \"%s\" IN (%s)", - varname, expr->query + 7); + appendStringInfo(&ds, "\"%s\" IN (%s)", + varname, expr->query); pfree(expr->query); expr->query = pstrdup(ds.data); diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index 0c3d30fb13..d152a4354b 100644 --- a/src/pl/plpgsql/src/plpgsql.h +++ b/src/pl/plpgsql/src/plpgsql.h @@ -218,8 +218,9 @@ typedef struct PLpgSQL_type */ typedef struct PLpgSQL_expr { - char *query; - SPIPlanPtr plan; + char *query; /* query string, verbatim from function body */ + RawParseMode parseMode; /* raw_parser() mode to use */ + SPIPlanPtr plan; /* plan, or NULL if not made yet */ Bitmapset *paramnos; /* all dnos referenced by this query */ int rwparam; /* dno of read/write param, or -1 if none */ diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out index d0a6b630b8..6ae1b6e3d4 100644 --- a/src/test/regress/expected/plpgsql.out +++ b/src/test/regress/expected/plpgsql.out @@ -1761,10 +1761,10 @@ select f1(42) as int, f1(4.5) as num; select f1(point(3,4)); -- fail for lack of + operator ERROR: operator does not exist: point + integer -LINE 1: SELECT x + 1 - ^ +LINE 1: x + 1 + ^ HINT: No operator matches the given name and argument types. You might need to add explicit type casts. -QUERY: SELECT x + 1 +QUERY: x + 1 CONTEXT: PL/pgSQL function f1(anyelement) line 3 at RETURN drop function f1(x anyelement); create function f1(x anyelement) returns anyarray as $$ @@ -2361,7 +2361,7 @@ begin end $$ language plpgsql; select namedparmcursor_test7(); ERROR: division by zero -CONTEXT: SQL statement "SELECT 42/0 AS p1, 77 AS p2;" +CONTEXT: SQL expression "42/0 AS p1, 77 AS p2" PL/pgSQL function namedparmcursor_test7() line 6 at OPEN -- check that line comments work correctly within the argument list (there -- is some special handling of this case in the code: the newline after the @@ -2574,9 +2574,9 @@ end; $$ language plpgsql; -- blocks select excpt_test1(); ERROR: column "sqlstate" does not exist -LINE 1: SELECT sqlstate - ^ -QUERY: SELECT sqlstate +LINE 1: sqlstate + ^ +QUERY: sqlstate CONTEXT: PL/pgSQL function excpt_test1() line 3 at RAISE create function excpt_test2() returns void as $$ begin @@ -2589,9 +2589,9 @@ end; $$ language plpgsql; -- should fail select excpt_test2(); ERROR: column "sqlstate" does not exist -LINE 1: SELECT sqlstate - ^ -QUERY: SELECT sqlstate +LINE 1: sqlstate + ^ +QUERY: sqlstate CONTEXT: PL/pgSQL function excpt_test2() line 5 at RAISE create function excpt_test3() returns void as $$ begin @@ -4467,11 +4467,11 @@ end $$; select fail(); ERROR: division by zero -CONTEXT: SQL statement "SELECT 1/0" +CONTEXT: SQL expression "1/0" PL/pgSQL function fail() line 3 at RETURN select fail(); ERROR: division by zero -CONTEXT: SQL statement "SELECT 1/0" +CONTEXT: SQL expression "1/0" PL/pgSQL function fail() line 3 at RETURN drop function fail(); -- Test handling of string literals. @@ -4497,10 +4497,10 @@ HINT: Use the escape string syntax for backslashes, e.g., E'\\'. select strtest(); NOTICE: foo\bar!baz WARNING: nonstandard use of \\ in a string literal -LINE 1: SELECT 'foo\\bar\041baz' - ^ +LINE 1: 'foo\\bar\041baz' + ^ HINT: Use the escape string syntax for backslashes, e.g., E'\\'. -QUERY: SELECT 'foo\\bar\041baz' +QUERY: 'foo\\bar\041baz' strtest ------------- foo\bar!baz @@ -5621,9 +5621,9 @@ ALTER TABLE alter_table_under_transition_tables UPDATE alter_table_under_transition_tables SET id = id; ERROR: column "name" does not exist -LINE 1: SELECT (SELECT string_agg(id || '=' || name, ',') FROM d) - ^ -QUERY: SELECT (SELECT string_agg(id || '=' || name, ',') FROM d) +LINE 1: (SELECT string_agg(id || '=' || name, ',') FROM d) + ^ +QUERY: (SELECT string_agg(id || '=' || name, ',') FROM d) CONTEXT: PL/pgSQL function alter_table_under_transition_tables_upd_func() line 3 at RAISE -- -- Test multiple reference to a transition table diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c index c3ce480c8f..edea27697e 100644 --- a/src/backend/commands/functioncmds.c +++ b/src/backend/commands/functioncmds.c @@ -1628,6 +1628,7 @@ CreateCast(CreateCastStmt *stmt) case COERCION_ASSIGNMENT: castcontext = COERCION_CODE_ASSIGNMENT; break; + /* COERCION_PLPGSQL is intentionally not covered here */ case COERCION_EXPLICIT: castcontext = COERCION_CODE_EXPLICIT; break; diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index f529707458..eb0d9f51fb 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -2769,6 +2769,11 @@ _SPI_error_callback(void *arg) case RAW_PARSE_PLPGSQL_EXPR: errcontext("SQL expression \"%s\"", query); break; + case RAW_PARSE_PLPGSQL_ASSIGN1: + case RAW_PARSE_PLPGSQL_ASSIGN2: + case RAW_PARSE_PLPGSQL_ASSIGN3: + errcontext("PL/pgSQL assignment \"%s\"", query); + break; default: errcontext("SQL statement \"%s\"", query); break; diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 70f8b718e0..990daf1c98 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -3199,6 +3199,20 @@ _copySetOperationStmt(const SetOperationStmt *from) return newnode; } +static PLAssignStmt * +_copyPLAssignStmt(const PLAssignStmt *from) +{ + PLAssignStmt *newnode = makeNode(PLAssignStmt); + + COPY_STRING_FIELD(name); + COPY_NODE_FIELD(indirection); + COPY_SCALAR_FIELD(nnames); + COPY_NODE_FIELD(val); + COPY_LOCATION_FIELD(location); + + return newnode; +} + static AlterTableStmt * _copyAlterTableStmt(const AlterTableStmt *from) { @@ -5220,6 +5234,9 @@ copyObjectImpl(const void *from) case T_SetOperationStmt: retval = _copySetOperationStmt(from); break; + case T_PLAssignStmt: + retval = _copyPLAssignStmt(from); + break; case T_AlterTableStmt: retval = _copyAlterTableStmt(from); break; diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index 541e0e6b48..7cbd31273d 100644 --- a/src/backend/nodes/equalfuncs.c +++ b/src/backend/nodes/equalfuncs.c @@ -1085,6 +1085,18 @@ _equalSetOperationStmt(const SetOperationStmt *a, const SetOperationStmt *b) return true; } +static bool +_equalPLAssignStmt(const PLAssignStmt *a, const PLAssignStmt *b) +{ + COMPARE_STRING_FIELD(name); + COMPARE_NODE_FIELD(indirection); + COMPARE_SCALAR_FIELD(nnames); + COMPARE_NODE_FIELD(val); + COMPARE_LOCATION_FIELD(location); + + return true; +} + static bool _equalAlterTableStmt(const AlterTableStmt *a, const AlterTableStmt *b) { @@ -3275,6 +3287,9 @@ equal(const void *a, const void *b) case T_SetOperationStmt: retval = _equalSetOperationStmt(a, b); break; + case T_PLAssignStmt: + retval = _equalPLAssignStmt(a, b); + break; case T_AlterTableStmt: retval = _equalAlterTableStmt(a, b); break; diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 963f71e99d..f66d0e5fd5 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -3669,6 +3669,16 @@ raw_expression_tree_walker(Node *node, return true; } break; + case T_PLAssignStmt: + { + PLAssignStmt *stmt = (PLAssignStmt *) node; + + if (walker(stmt->indirection, context)) + return true; + if (walker(stmt->val, context)) + return true; + } + break; case T_A_Expr: { A_Expr *expr = (A_Expr *) node; diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index d78b16ed1d..4871702f02 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -2775,6 +2775,18 @@ _outSelectStmt(StringInfo str, const SelectStmt *node) WRITE_NODE_FIELD(rarg); } +static void +_outPLAssignStmt(StringInfo str, const PLAssignStmt *node) +{ + WRITE_NODE_TYPE("PLASSIGN"); + + WRITE_STRING_FIELD(name); + WRITE_NODE_FIELD(indirection); + WRITE_INT_FIELD(nnames); + WRITE_NODE_FIELD(val); + WRITE_LOCATION_FIELD(location); +} + static void _outFuncCall(StringInfo str, const FuncCall *node) { @@ -4211,6 +4223,9 @@ outNode(StringInfo str, const void *obj) case T_SelectStmt: _outSelectStmt(str, obj); break; + case T_PLAssignStmt: + _outPLAssignStmt(str, obj); + break; case T_ColumnDef: _outColumnDef(str, obj); break; diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index 084e00f73d..bf2824c00f 100644 --- a/src/backend/parser/analyze.c +++ b/src/backend/parser/analyze.c @@ -42,8 +42,10 @@ #include "parser/parse_param.h" #include "parser/parse_relation.h" #include "parser/parse_target.h" +#include "parser/parse_type.h" #include "parser/parsetree.h" #include "rewrite/rewriteManip.h" +#include "utils/builtins.h" #include "utils/rel.h" @@ -70,6 +72,8 @@ static Query *transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt); static List *transformReturningList(ParseState *pstate, List *returningList); static List *transformUpdateTargetList(ParseState *pstate, List *targetList); +static Query *transformPLAssignStmt(ParseState *pstate, + PLAssignStmt *stmt); static Query *transformDeclareCursorStmt(ParseState *pstate, DeclareCursorStmt *stmt); static Query *transformExplainStmt(ParseState *pstate, @@ -304,6 +308,11 @@ transformStmt(ParseState *pstate, Node *parseTree) } break; + case T_PLAssignStmt: + result = transformPLAssignStmt(pstate, + (PLAssignStmt *) parseTree); + break; + /* * Special cases */ @@ -367,6 +376,7 @@ analyze_requires_snapshot(RawStmt *parseTree) case T_DeleteStmt: case T_UpdateStmt: case T_SelectStmt: + case T_PLAssignStmt: result = true; break; @@ -2393,6 +2403,236 @@ transformReturningList(ParseState *pstate, List *returningList) } +/* + * transformPLAssignStmt - + * transform a PL/pgSQL assignment statement + * + * If there is no opt_indirection, the transformed statement looks like + * "SELECT a_expr ...", except the expression has been cast to the type of + * the target. With indirection, it's still a SELECT, but the expression will + * incorporate FieldStore and/or assignment SubscriptingRef nodes to compute a + * new value for a container-type variable represented by the target. The + * expression references the target as the container source. + */ +static Query * +transformPLAssignStmt(ParseState *pstate, PLAssignStmt *stmt) +{ + Query *qry = makeNode(Query); + ColumnRef *cref = makeNode(ColumnRef); + List *indirection = stmt->indirection; + int nnames = stmt->nnames; + SelectStmt *sstmt = stmt->val; + Node *target; + Oid targettype; + int32 targettypmod; + Oid targetcollation; + List *tlist; + TargetEntry *tle; + Oid type_id; + Node *qual; + ListCell *l; + + /* + * First, construct a ColumnRef for the target variable. If the target + * has more than one dotted name, we have to pull the extra names out of + * the indirection list. + */ + cref->fields = list_make1(makeString(stmt->name)); + cref->location = stmt->location; + if (nnames > 1) + { + /* avoid munging the raw parsetree */ + indirection = list_copy(indirection); + while (--nnames > 0 && indirection != NIL) + { + Node *ind = (Node *) linitial(indirection); + + if (!IsA(ind, String)) + elog(ERROR, "invalid name count in PLAssignStmt"); + cref->fields = lappend(cref->fields, ind); + indirection = list_delete_first(indirection); + } + } + + /* + * Transform the target reference. Typically we will get back a Param + * node, but there's no reason to be too picky about its type. + */ + target = transformExpr(pstate, (Node *) cref, + EXPR_KIND_UPDATE_TARGET); + targettype = exprType(target); + targettypmod = exprTypmod(target); + targetcollation = exprCollation(target); + + /* + * The rest mostly matches transformSelectStmt, except that we needn't + * consider WITH or DISTINCT, and we build a targetlist our own way. + */ + qry->commandType = CMD_SELECT; + pstate->p_is_insert = false; + + /* make FOR UPDATE/FOR SHARE info available to addRangeTableEntry */ + pstate->p_locking_clause = sstmt->lockingClause; + + /* make WINDOW info available for window functions, too */ + pstate->p_windowdefs = sstmt->windowClause; + + /* process the FROM clause */ + transformFromClause(pstate, sstmt->fromClause); + + /* initially transform the targetlist as if in SELECT */ + tlist = transformTargetList(pstate, sstmt->targetList, + EXPR_KIND_SELECT_TARGET); + + /* we should have exactly one targetlist item */ + if (list_length(tlist) != 1) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg_plural("assignment source returned %d column", + "assignment source returned %d columns", + list_length(tlist), + list_length(tlist)))); + + tle = linitial_node(TargetEntry, tlist); + + /* + * This next bit is similar to transformAssignedExpr; the key difference + * is we use COERCION_PLPGSQL not COERCION_ASSIGNMENT. + */ + type_id = exprType((Node *) tle->expr); + + pstate->p_expr_kind = EXPR_KIND_UPDATE_TARGET; + + if (indirection) + { + tle->expr = (Expr *) + transformAssignmentIndirection(pstate, + target, + stmt->name, + false, + targettype, + targettypmod, + targetcollation, + indirection, + list_head(indirection), + (Node *) tle->expr, + COERCION_PLPGSQL, + exprLocation(target)); + } + else if (targettype != type_id && + (targettype == RECORDOID || ISCOMPLEX(targettype)) && + (type_id == RECORDOID || ISCOMPLEX(type_id))) + { + /* + * Hack: do not let coerce_to_target_type() deal with inconsistent + * composite types. Just pass the expression result through as-is, + * and let the PL/pgSQL executor do the conversion its way. This is + * rather bogus, but it's needed for backwards compatibility. + */ + } + else + { + /* + * For normal non-qualified target column, do type checking and + * coercion. + */ + Node *orig_expr = (Node *) tle->expr; + + tle->expr = (Expr *) + coerce_to_target_type(pstate, + orig_expr, type_id, + targettype, targettypmod, + COERCION_PLPGSQL, + COERCE_IMPLICIT_CAST, + -1); + /* With COERCION_PLPGSQL, this error is probably unreachable */ + if (tle->expr == NULL) + ereport(ERROR, + (errcode(ERRCODE_DATATYPE_MISMATCH), + errmsg("variable \"%s\" is of type %s" + " but expression is of type %s", + stmt->name, + format_type_be(targettype), + format_type_be(type_id)), + errhint("You will need to rewrite or cast the expression."), + parser_errposition(pstate, exprLocation(orig_expr)))); + } + + pstate->p_expr_kind = EXPR_KIND_NONE; + + qry->targetList = list_make1(tle); + + /* transform WHERE */ + qual = transformWhereClause(pstate, sstmt->whereClause, + EXPR_KIND_WHERE, "WHERE"); + + /* initial processing of HAVING clause is much like WHERE clause */ + qry->havingQual = transformWhereClause(pstate, sstmt->havingClause, + EXPR_KIND_HAVING, "HAVING"); + + /* + * Transform sorting/grouping stuff. Do ORDER BY first because both + * transformGroupClause and transformDistinctClause need the results. Note + * that these functions can also change the targetList, so it's passed to + * them by reference. + */ + qry->sortClause = transformSortClause(pstate, + sstmt->sortClause, + &qry->targetList, + EXPR_KIND_ORDER_BY, + false /* allow SQL92 rules */ ); + + qry->groupClause = transformGroupClause(pstate, + sstmt->groupClause, + &qry->groupingSets, + &qry->targetList, + qry->sortClause, + EXPR_KIND_GROUP_BY, + false /* allow SQL92 rules */ ); + + /* No DISTINCT clause */ + Assert(!sstmt->distinctClause); + qry->distinctClause = NIL; + qry->hasDistinctOn = false; + + /* transform LIMIT */ + qry->limitOffset = transformLimitClause(pstate, sstmt->limitOffset, + EXPR_KIND_OFFSET, "OFFSET", + sstmt->limitOption); + qry->limitCount = transformLimitClause(pstate, sstmt->limitCount, + EXPR_KIND_LIMIT, "LIMIT", + sstmt->limitOption); + qry->limitOption = sstmt->limitOption; + + /* transform window clauses after we have seen all window functions */ + qry->windowClause = transformWindowDefinitions(pstate, + pstate->p_windowdefs, + &qry->targetList); + + qry->rtable = pstate->p_rtable; + qry->jointree = makeFromExpr(pstate->p_joinlist, qual); + + qry->hasSubLinks = pstate->p_hasSubLinks; + qry->hasWindowFuncs = pstate->p_hasWindowFuncs; + qry->hasTargetSRFs = pstate->p_hasTargetSRFs; + qry->hasAggs = pstate->p_hasAggs; + + foreach(l, sstmt->lockingClause) + { + transformLockingClause(pstate, qry, + (LockingClause *) lfirst(l), false); + } + + assign_query_collations(pstate, qry); + + /* this must be done after collations, for reliable comparison of exprs */ + if (pstate->p_hasAggs || qry->groupClause || qry->groupingSets || qry->havingQual) + parseCheckAggregates(pstate, qry); + + return qry; +} + + /* * transformDeclareCursorStmt - * transform a DECLARE CURSOR Statement diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 4c58b46651..9101435f7b 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -294,7 +294,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type <node> select_no_parens select_with_parens select_clause simple_select values_clause - PLpgSQL_Expr + PLpgSQL_Expr PLAssignStmt %type <node> alter_column_default opclass_item opclass_drop alter_using %type <ival> add_drop opt_asc_desc opt_nulls_order @@ -536,7 +536,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type <str> ColId ColLabel BareColLabel %type <str> NonReservedWord NonReservedWord_or_Sconst %type <str> var_name type_function_name param_name -%type <str> createdb_opt_name +%type <str> createdb_opt_name plassign_target %type <node> var_value zone_value %type <rolespec> auth_ident RoleSpec opt_granted_by @@ -733,6 +733,9 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); */ %token MODE_TYPE_NAME %token MODE_PLPGSQL_EXPR +%token MODE_PLPGSQL_ASSIGN1 +%token MODE_PLPGSQL_ASSIGN2 +%token MODE_PLPGSQL_ASSIGN3 /* Precedence: lowest to highest */ @@ -815,6 +818,27 @@ stmtblock: stmtmulti pg_yyget_extra(yyscanner)->parsetree = list_make1(makeRawStmt($2, 0)); } + | MODE_PLPGSQL_ASSIGN1 PLAssignStmt + { + PLAssignStmt *n = (PLAssignStmt *) $2; + n->nnames = 1; + pg_yyget_extra(yyscanner)->parsetree = + list_make1(makeRawStmt((Node *) n, 0)); + } + | MODE_PLPGSQL_ASSIGN2 PLAssignStmt + { + PLAssignStmt *n = (PLAssignStmt *) $2; + n->nnames = 2; + pg_yyget_extra(yyscanner)->parsetree = + list_make1(makeRawStmt((Node *) n, 0)); + } + | MODE_PLPGSQL_ASSIGN3 PLAssignStmt + { + PLAssignStmt *n = (PLAssignStmt *) $2; + n->nnames = 3; + pg_yyget_extra(yyscanner)->parsetree = + list_make1(makeRawStmt((Node *) n, 0)); + } ; /* @@ -15069,6 +15093,31 @@ PLpgSQL_Expr: opt_target_list } ; +/* + * PL/pgSQL Assignment statement: name opt_indirection := PLpgSQL_Expr + */ + +PLAssignStmt: plassign_target opt_indirection plassign_equals PLpgSQL_Expr + { + PLAssignStmt *n = makeNode(PLAssignStmt); + + n->name = $1; + n->indirection = check_indirection($2, yyscanner); + /* nnames will be filled by calling production */ + n->val = (SelectStmt *) $4; + n->location = @1; + $$ = (Node *) n; + } + ; + +plassign_target: ColId { $$ = $1; } + | PARAM { $$ = psprintf("$%d", $1); } + ; + +plassign_equals: COLON_EQUALS + | '=' + ; + /* * Name classification hierarchy. diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c index da6c3ae4b5..462e2af74e 100644 --- a/src/backend/parser/parse_coerce.c +++ b/src/backend/parser/parse_coerce.c @@ -2815,6 +2815,14 @@ find_coercion_pathway(Oid targetTypeId, Oid sourceTypeId, } } + /* + * When parsing PL/pgSQL assignments, allow an I/O cast to be used + * whenever no normal coercion is available. + */ + if (result == COERCION_PATH_NONE && + ccontext == COERCION_PLPGSQL) + result = COERCION_PATH_COERCEVIAIO; + return result; } diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c index 3dda8e2847..8f2b0f18e0 100644 --- a/src/backend/parser/parse_target.c +++ b/src/backend/parser/parse_target.c @@ -34,17 +34,6 @@ static void markTargetListOrigin(ParseState *pstate, TargetEntry *tle, Var *var, int levelsup); -static Node *transformAssignmentIndirection(ParseState *pstate, - Node *basenode, - const char *targetName, - bool targetIsSubscripting, - Oid targetTypeId, - int32 targetTypMod, - Oid targetCollation, - List *indirection, - ListCell *indirection_cell, - Node *rhs, - int location); static Node *transformAssignmentSubscripts(ParseState *pstate, Node *basenode, const char *targetName, @@ -56,6 +45,7 @@ static Node *transformAssignmentSubscripts(ParseState *pstate, List *indirection, ListCell *next_indirection, Node *rhs, + CoercionContext ccontext, int location); static List *ExpandColumnRefStar(ParseState *pstate, ColumnRef *cref, bool make_target_entry); @@ -561,6 +551,7 @@ transformAssignedExpr(ParseState *pstate, indirection, list_head(indirection), (Node *) expr, + COERCION_ASSIGNMENT, location); } else @@ -642,15 +633,15 @@ updateTargetListEntry(ParseState *pstate, /* * Process indirection (field selection or subscripting) of the target - * column in INSERT/UPDATE. This routine recurses for multiple levels - * of indirection --- but note that several adjacent A_Indices nodes in - * the indirection list are treated as a single multidimensional subscript + * column in INSERT/UPDATE/assignment. This routine recurses for multiple + * levels of indirection --- but note that several adjacent A_Indices nodes + * in the indirection list are treated as a single multidimensional subscript * operation. * * In the initial call, basenode is a Var for the target column in UPDATE, - * or a null Const of the target's type in INSERT. In recursive calls, - * basenode is NULL, indicating that a substitute node should be consed up if - * needed. + * or a null Const of the target's type in INSERT, or a Param for the target + * variable in PL/pgSQL assignment. In recursive calls, basenode is NULL, + * indicating that a substitute node should be consed up if needed. * * targetName is the name of the field or subfield we're assigning to, and * targetIsSubscripting is true if we're subscripting it. These are just for @@ -667,12 +658,16 @@ updateTargetListEntry(ParseState *pstate, * rhs is the already-transformed value to be assigned; note it has not been * coerced to any particular type. * + * ccontext is the coercion level to use while coercing the rhs. For + * normal statements it'll be COERCION_ASSIGNMENT, but PL/pgSQL uses + * a special value. + * * location is the cursor error position for any errors. (Note: this points * to the head of the target clause, eg "foo" in "foo.bar[baz]". Later we * might want to decorate indirection cells with their own location info, * in which case the location argument could probably be dropped.) */ -static Node * +Node * transformAssignmentIndirection(ParseState *pstate, Node *basenode, const char *targetName, @@ -683,6 +678,7 @@ transformAssignmentIndirection(ParseState *pstate, List *indirection, ListCell *indirection_cell, Node *rhs, + CoercionContext ccontext, int location) { Node *result; @@ -757,6 +753,7 @@ transformAssignmentIndirection(ParseState *pstate, indirection, i, rhs, + ccontext, location); } @@ -807,6 +804,7 @@ transformAssignmentIndirection(ParseState *pstate, indirection, lnext(indirection, i), rhs, + ccontext, location); /* and build a FieldStore node */ @@ -845,6 +843,7 @@ transformAssignmentIndirection(ParseState *pstate, indirection, NULL, rhs, + ccontext, location); } @@ -853,7 +852,7 @@ transformAssignmentIndirection(ParseState *pstate, result = coerce_to_target_type(pstate, rhs, exprType(rhs), targetTypeId, targetTypMod, - COERCION_ASSIGNMENT, + ccontext, COERCE_IMPLICIT_CAST, -1); if (result == NULL) @@ -898,6 +897,7 @@ transformAssignmentSubscripts(ParseState *pstate, List *indirection, ListCell *next_indirection, Node *rhs, + CoercionContext ccontext, int location) { Node *result; @@ -949,6 +949,7 @@ transformAssignmentSubscripts(ParseState *pstate, indirection, next_indirection, rhs, + ccontext, location); /* @@ -969,7 +970,7 @@ transformAssignmentSubscripts(ParseState *pstate, result = coerce_to_target_type(pstate, result, resulttype, targetTypeId, targetTypMod, - COERCION_ASSIGNMENT, + ccontext, COERCE_IMPLICIT_CAST, -1); /* can fail if we had int2vector/oidvector, but not for true domains */ diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c index 71df8ef022..291706c593 100644 --- a/src/backend/parser/parser.c +++ b/src/backend/parser/parser.c @@ -58,7 +58,10 @@ raw_parser(const char *str, RawParseMode mode) static const int mode_token[] = { 0, /* RAW_PARSE_DEFAULT */ MODE_TYPE_NAME, /* RAW_PARSE_TYPE_NAME */ - MODE_PLPGSQL_EXPR /* RAW_PARSE_PLPGSQL_EXPR */ + MODE_PLPGSQL_EXPR, /* RAW_PARSE_PLPGSQL_EXPR */ + MODE_PLPGSQL_ASSIGN1, /* RAW_PARSE_PLPGSQL_ASSIGN1 */ + MODE_PLPGSQL_ASSIGN2, /* RAW_PARSE_PLPGSQL_ASSIGN2 */ + MODE_PLPGSQL_ASSIGN3 /* RAW_PARSE_PLPGSQL_ASSIGN3 */ }; yyextra.have_lookahead = true; diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c index a42ead7d69..fdcabe6a48 100644 --- a/src/backend/tcop/utility.c +++ b/src/backend/tcop/utility.c @@ -2313,6 +2313,10 @@ CreateCommandTag(Node *parsetree) tag = CMDTAG_SELECT; break; + case T_PLAssignStmt: + tag = CMDTAG_SELECT; + break; + /* utility statements --- same whether raw or cooked */ case T_TransactionStmt: { @@ -3181,6 +3185,10 @@ GetCommandLogLevel(Node *parsetree) lev = LOGSTMT_ALL; break; + case T_PLAssignStmt: + lev = LOGSTMT_ALL; + break; + /* utility statements --- same whether raw or cooked */ case T_TransactionStmt: lev = LOGSTMT_ALL; diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index 3684f87a88..2d445d03db 100644 --- a/src/include/nodes/nodes.h +++ b/src/include/nodes/nodes.h @@ -314,6 +314,7 @@ typedef enum NodeTag T_DeleteStmt, T_UpdateStmt, T_SelectStmt, + T_PLAssignStmt, T_AlterTableStmt, T_AlterTableCmd, T_AlterDomainStmt, diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 48a79a7657..8e09a457ab 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -1675,6 +1675,25 @@ typedef struct SetOperationStmt } SetOperationStmt; +/* ---------------------- + * PL/pgSQL Assignment Statement + * + * Like SelectStmt, this is transformed into a SELECT Query. + * However, the targetlist of the result looks more like an UPDATE. + * ---------------------- + */ +typedef struct PLAssignStmt +{ + NodeTag type; + + char *name; /* initial column name */ + List *indirection; /* subscripts and field names, if any */ + int nnames; /* number of names to use in ColumnRef */ + SelectStmt *val; /* the PL/pgSQL expression to assign */ + int location; /* name's token location, or -1 if unknown */ +} PLAssignStmt; + + /***************************************************************************** * Other Statements (no optimizations required) * diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h index dd85908fe2..71a58804dc 100644 --- a/src/include/nodes/primnodes.h +++ b/src/include/nodes/primnodes.h @@ -457,6 +457,7 @@ typedef enum CoercionContext { COERCION_IMPLICIT, /* coercion in context of expression */ COERCION_ASSIGNMENT, /* coercion in context of assignment */ + COERCION_PLPGSQL, /* if no assignment cast, use CoerceViaIO */ COERCION_EXPLICIT /* explicit cast operation */ } CoercionContext; diff --git a/src/include/parser/parse_target.h b/src/include/parser/parse_target.h index 7039df29cb..5ce8c3666e 100644 --- a/src/include/parser/parse_target.h +++ b/src/include/parser/parse_target.h @@ -36,6 +36,18 @@ extern void updateTargetListEntry(ParseState *pstate, TargetEntry *tle, char *colname, int attrno, List *indirection, int location); +extern Node *transformAssignmentIndirection(ParseState *pstate, + Node *basenode, + const char *targetName, + bool targetIsSubscripting, + Oid targetTypeId, + int32 targetTypMod, + Oid targetCollation, + List *indirection, + ListCell *indirection_cell, + Node *rhs, + CoercionContext ccontext, + int location); extern List *checkInsertTargets(ParseState *pstate, List *cols, List **attrnos); extern TupleDesc expandRecordVariable(ParseState *pstate, Var *var, diff --git a/src/include/parser/parser.h b/src/include/parser/parser.h index 3b7dab17ef..ac89f2b93f 100644 --- a/src/include/parser/parser.h +++ b/src/include/parser/parser.h @@ -29,12 +29,19 @@ * * RAW_PARSE_PLPGSQL_EXPR: parse a PL/pgSQL expression, and return * a one-element List containing a RawStmt node. + * + * RAW_PARSE_PLPGSQL_ASSIGNn: parse a PL/pgSQL assignment statement, + * and return a one-element List containing a RawStmt node. "n" + * gives the number of dotted names comprising the target ColumnRef. */ typedef enum { RAW_PARSE_DEFAULT = 0, RAW_PARSE_TYPE_NAME, - RAW_PARSE_PLPGSQL_EXPR + RAW_PARSE_PLPGSQL_EXPR, + RAW_PARSE_PLPGSQL_ASSIGN1, + RAW_PARSE_PLPGSQL_ASSIGN2, + RAW_PARSE_PLPGSQL_ASSIGN3 } RawParseMode; /* Values for the backslash_quote GUC */ diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl index 2b254c2b77..cea6dd6517 100644 --- a/src/interfaces/ecpg/preproc/parse.pl +++ b/src/interfaces/ecpg/preproc/parse.pl @@ -71,7 +71,10 @@ my %replace_types = ( 'type_function_name' => 'ignore', 'ColLabel' => 'ignore', 'Sconst' => 'ignore', - 'PLpgSQL_Expr' => 'ignore',); + 'PLpgSQL_Expr' => 'ignore', + 'PLAssignStmt' => 'ignore', + 'plassign_target' => 'ignore', + 'plassign_equals' => 'ignore',); # these replace_line commands excise certain keywords from the core keyword # lists. Be sure to account for these in ColLabel and related productions. diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index f56dcd0e79..cb5c7f9fea 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -8008,10 +8008,14 @@ get_cast_hashentry(PLpgSQL_execstate *estate, placeholder->collation = get_typcollation(srctype); /* - * Apply coercion. We use ASSIGNMENT coercion because that's the - * closest match to plpgsql's historical behavior; in particular, - * EXPLICIT coercion would allow silent truncation to a destination - * varchar/bpchar's length, which we do not want. + * Apply coercion. We use the special coercion context + * COERCION_PLPGSQL to match plpgsql's historical behavior, namely + * that any cast not available at ASSIGNMENT level will be implemented + * as an I/O coercion. (It's somewhat dubious that we prefer I/O + * coercion over cast pathways that exist at EXPLICIT level. Changing + * that would cause assorted minor behavioral differences though, and + * a user who wants the explicit-cast behavior can always write an + * explicit cast.) * * If source type is UNKNOWN, coerce_to_target_type will fail (it only * expects to see that for Const input nodes), so don't call it; we'll @@ -8024,7 +8028,7 @@ get_cast_hashentry(PLpgSQL_execstate *estate, cast_expr = coerce_to_target_type(NULL, (Node *) placeholder, srctype, dsttype, dsttypmod, - COERCION_ASSIGNMENT, + COERCION_PLPGSQL, COERCE_IMPLICIT_CAST, -1); @@ -8032,7 +8036,8 @@ get_cast_hashentry(PLpgSQL_execstate *estate, * If there's no cast path according to the parser, fall back to using * an I/O coercion; this is semantically dubious but matches plpgsql's * historical behavior. We would need something of the sort for - * UNKNOWN literals in any case. + * UNKNOWN literals in any case. (This is probably now only reachable + * in the case where srctype is UNKNOWN/RECORD.) */ if (cast_expr == NULL) { @@ -8341,7 +8346,8 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) return; /* - * Top level of expression must be a simple FuncExpr or OpExpr. + * Top level of expression must be a simple FuncExpr, OpExpr, or + * SubscriptingRef. */ if (IsA(expr->expr_simple_expr, FuncExpr)) { @@ -8357,6 +8363,33 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) funcid = opexpr->opfuncid; fargs = opexpr->args; } + else if (IsA(expr->expr_simple_expr, SubscriptingRef)) + { + SubscriptingRef *sbsref = (SubscriptingRef *) expr->expr_simple_expr; + + /* We only trust standard varlena arrays to be safe */ + if (get_typsubscript(sbsref->refcontainertype, NULL) != + F_ARRAY_SUBSCRIPT_HANDLER) + return; + + /* refexpr can be a simple Param, otherwise must not contain target */ + if (!(sbsref->refexpr && IsA(sbsref->refexpr, Param)) && + contains_target_param((Node *) sbsref->refexpr, &target_dno)) + return; + + /* the other subexpressions must not contain target */ + if (contains_target_param((Node *) sbsref->refupperindexpr, + &target_dno) || + contains_target_param((Node *) sbsref->reflowerindexpr, + &target_dno) || + contains_target_param((Node *) sbsref->refassgnexpr, + &target_dno)) + return; + + /* OK, we can pass target as a read-write parameter */ + expr->rwparam = target_dno; + return; + } else return; diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index 3b36220d73..051544a3b4 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -973,16 +973,40 @@ stmt_call : K_CALL } ; -stmt_assign : assign_var assign_operator expr_until_semi +stmt_assign : T_DATUM { PLpgSQL_stmt_assign *new; + RawParseMode pmode; + /* see how many names identify the datum */ + switch ($1.ident ? 1 : list_length($1.idents)) + { + case 1: + pmode = RAW_PARSE_PLPGSQL_ASSIGN1; + break; + case 2: + pmode = RAW_PARSE_PLPGSQL_ASSIGN2; + break; + case 3: + pmode = RAW_PARSE_PLPGSQL_ASSIGN3; + break; + default: + elog(ERROR, "unexpected number of names"); + pmode = 0; /* keep compiler quiet */ + } + + check_assignable($1.datum, @1); new = palloc0(sizeof(PLpgSQL_stmt_assign)); new->cmd_type = PLPGSQL_STMT_ASSIGN; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - new->varno = $1->dno; - new->expr = $3; + new->varno = $1.datum->dno; + /* Push back the head name to include it in the stmt */ + plpgsql_push_back_token(T_DATUM); + new->expr = read_sql_construct(';', 0, 0, ";", + pmode, + false, true, true, + NULL, NULL); $$ = (PLpgSQL_stmt *)new; } diff --git a/contrib/hstore/expected/hstore.out b/contrib/hstore/expected/hstore.out index fdcc3920ce..64a3272b9c 100644 --- a/contrib/hstore/expected/hstore.out +++ b/contrib/hstore/expected/hstore.out @@ -1583,6 +1583,10 @@ select f2 from test_json_agg; "d"=>NULL, "x"=>"xyzzy" (3 rows) +-- Test subscripting in plpgsql +do $$ declare h hstore; +begin h['a'] := 'b'; raise notice 'h = %, h[a] = %', h, h['a']; end $$; +NOTICE: h = "a"=>"b", h[a] = b -- Check the hstore_hash() and hstore_hash_extended() function explicitly. SELECT v as value, hstore_hash(v)::bit(32) as standard, hstore_hash_extended(v, 0)::bit(32) as extended0, diff --git a/contrib/hstore/sql/hstore.sql b/contrib/hstore/sql/hstore.sql index 8d96e30403..a59db66b0a 100644 --- a/contrib/hstore/sql/hstore.sql +++ b/contrib/hstore/sql/hstore.sql @@ -372,6 +372,10 @@ select f2['d':'e'] from test_json_agg; -- error update test_json_agg set f2['d'] = f2['e'], f2['x'] = 'xyzzy'; select f2 from test_json_agg; +-- Test subscripting in plpgsql +do $$ declare h hstore; +begin h['a'] := 'b'; raise notice 'h = %, h[a] = %', h, h['a']; end $$; + -- Check the hstore_hash() and hstore_hash_extended() function explicitly. SELECT v as value, hstore_hash(v)::bit(32) as standard, hstore_hash_extended(v, 0)::bit(32) as extended0, diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml index 9ec168b0c4..32c466eaa5 100644 --- a/doc/src/sgml/plpgsql.sgml +++ b/doc/src/sgml/plpgsql.sgml @@ -946,8 +946,8 @@ PREPARE <replaceable>statement_name</replaceable>(integer, integer) AS SELECT $1 database engine. The expression must yield a single value (possibly a row value, if the variable is a row or record variable). The target variable can be a simple variable (optionally qualified with a block - name), a field of a row or record variable, or an element of an array - that is a simple variable or field. Equal (<literal>=</literal>) can be + name), a field of a row or record target, or an element or slice of + an array target. Equal (<literal>=</literal>) can be used instead of PL/SQL-compliant <literal>:=</literal>. </para> @@ -968,8 +968,25 @@ PREPARE <replaceable>statement_name</replaceable>(integer, integer) AS SELECT $1 <programlisting> tax := subtotal * 0.06; my_record.user_id := 20; +my_array[j] := 20; +my_array[1:3] := array[1,2,3]; +complex_array[n].realpart = 12.3; </programlisting> </para> + + <para> + It's useful to know that what follows the assignment operator is + essentially treated as a <literal>SELECT</literal> command; as long + as it returns a single row and column, it will work. Thus for example + one can write something like +<programlisting> +total_sales := sum(quantity) from sales; +</programlisting> + This provides an effect similar to the single-row <literal>SELECT + ... INTO</literal> syntax described in + <xref linkend="plpgsql-statements-sql-onerow"/>. However, that syntax + is more portable. + </para> </sect2> <sect2 id="plpgsql-statements-sql-noresult"> diff --git a/src/pl/plpgsql/src/Makefile b/src/pl/plpgsql/src/Makefile index 193df8a010..9946abbc1d 100644 --- a/src/pl/plpgsql/src/Makefile +++ b/src/pl/plpgsql/src/Makefile @@ -32,7 +32,7 @@ DATA = plpgsql.control plpgsql--1.0.sql REGRESS_OPTS = --dbname=$(PL_TESTDB) -REGRESS = plpgsql_call plpgsql_control plpgsql_copy plpgsql_domain \ +REGRESS = plpgsql_array plpgsql_call plpgsql_control plpgsql_copy plpgsql_domain \ plpgsql_record plpgsql_cache plpgsql_simple plpgsql_transaction \ plpgsql_trap plpgsql_trigger plpgsql_varprops diff --git a/src/pl/plpgsql/src/expected/plpgsql_array.out b/src/pl/plpgsql/src/expected/plpgsql_array.out new file mode 100644 index 0000000000..5f28b4f685 --- /dev/null +++ b/src/pl/plpgsql/src/expected/plpgsql_array.out @@ -0,0 +1,94 @@ +-- +-- Tests for PL/pgSQL handling of array variables +-- +-- We also check arrays of composites here, so this has some overlap +-- with the plpgsql_record tests. +-- +create type complex as (r float8, i float8); +create type quadarray as (c1 complex[], c2 complex); +do $$ declare a int[]; +begin a := array[1,2]; a[3] := 4; raise notice 'a = %', a; end$$; +NOTICE: a = {1,2,4} +do $$ declare a int[]; +begin a[3] := 4; raise notice 'a = %', a; end$$; +NOTICE: a = [3:3]={4} +do $$ declare a int[]; +begin a[1][4] := 4; raise notice 'a = %', a; end$$; +NOTICE: a = [1:1][4:4]={{4}} +do $$ declare a int[]; +begin a[1] := 23::text; raise notice 'a = %', a; end$$; -- lax typing +NOTICE: a = {23} +do $$ declare a int[]; +begin a := array[1,2]; a[2:3] := array[3,4]; raise notice 'a = %', a; end$$; +NOTICE: a = {1,3,4} +do $$ declare a int[]; +begin a := array[1,2]; a[2] := a[2] + 1; raise notice 'a = %', a; end$$; +NOTICE: a = {1,3} +do $$ declare a int[]; +begin a[1:2] := array[3,4]; raise notice 'a = %', a; end$$; +NOTICE: a = {3,4} +do $$ declare a int[]; +begin a[1:2] := 4; raise notice 'a = %', a; end$$; -- error +ERROR: malformed array literal: "4" +DETAIL: Array value must start with "{" or dimension information. +CONTEXT: PL/pgSQL function inline_code_block line 2 at assignment +do $$ declare a complex[]; +begin a[1] := (1,2); a[1].i := 11; raise notice 'a = %', a; end$$; +NOTICE: a = {"(1,11)"} +do $$ declare a complex[]; +begin a[1].i := 11; raise notice 'a = %, a[1].i = %', a, a[1].i; end$$; +NOTICE: a = {"(,11)"}, a[1].i = 11 +-- perhaps this ought to work, but for now it doesn't: +do $$ declare a complex[]; +begin a[1:2].i := array[11,12]; raise notice 'a = %', a; end$$; +ERROR: cannot assign to field "i" of column "a" because its type complex[] is not a composite type +LINE 1: a[1:2].i := array[11,12] + ^ +QUERY: a[1:2].i := array[11,12] +CONTEXT: PL/pgSQL function inline_code_block line 2 at assignment +do $$ declare a quadarray; +begin a.c1[1].i := 11; raise notice 'a = %, a.c1[1].i = %', a, a.c1[1].i; end$$; +NOTICE: a = ("{""(,11)""}",), a.c1[1].i = 11 +do $$ declare a int[]; +begin a := array_agg(x) from (values(1),(2),(3)) v(x); raise notice 'a = %', a; end$$; +NOTICE: a = {1,2,3} +create temp table onecol as select array[1,2] as f1; +do $$ declare a int[]; +begin a := f1 from onecol; raise notice 'a = %', a; end$$; +NOTICE: a = {1,2} +do $$ declare a int[]; +begin a := * from onecol for update; raise notice 'a = %', a; end$$; +NOTICE: a = {1,2} +-- error cases: +do $$ declare a int[]; +begin a := from onecol; raise notice 'a = %', a; end$$; +ERROR: assignment source returned 0 columns +CONTEXT: PL/pgSQL assignment "a := from onecol" +PL/pgSQL function inline_code_block line 2 at assignment +do $$ declare a int[]; +begin a := f1, f1 from onecol; raise notice 'a = %', a; end$$; +ERROR: assignment source returned 2 columns +CONTEXT: PL/pgSQL assignment "a := f1, f1 from onecol" +PL/pgSQL function inline_code_block line 2 at assignment +insert into onecol values(array[11]); +do $$ declare a int[]; +begin a := f1 from onecol; raise notice 'a = %', a; end$$; +ERROR: query "a := f1 from onecol" returned more than one row +CONTEXT: PL/pgSQL function inline_code_block line 2 at assignment +do $$ declare a int[]; +begin a := f1 from onecol limit 1; raise notice 'a = %', a; end$$; +NOTICE: a = {1,2} +do $$ declare a real; +begin a[1] := 2; raise notice 'a = %', a; end$$; +ERROR: cannot subscript type real because it does not support subscripting +LINE 1: a[1] := 2 + ^ +QUERY: a[1] := 2 +CONTEXT: PL/pgSQL function inline_code_block line 2 at assignment +do $$ declare a complex; +begin a.r[1] := 2; raise notice 'a = %', a; end$$; +ERROR: cannot subscript type double precision because it does not support subscripting +LINE 1: a.r[1] := 2 + ^ +QUERY: a.r[1] := 2 +CONTEXT: PL/pgSQL function inline_code_block line 2 at assignment diff --git a/src/pl/plpgsql/src/expected/plpgsql_record.out b/src/pl/plpgsql/src/expected/plpgsql_record.out index 52207e9b10..6e835c0751 100644 --- a/src/pl/plpgsql/src/expected/plpgsql_record.out +++ b/src/pl/plpgsql/src/expected/plpgsql_record.out @@ -3,6 +3,7 @@ -- create type two_int4s as (f1 int4, f2 int4); create type two_int8s as (q1 int8, q2 int8); +create type nested_int8s as (c1 two_int8s, c2 two_int8s); -- base-case return of a composite type create function retc(int) returns two_int8s language plpgsql as $$ begin return row($1,1)::two_int8s; end $$; @@ -82,6 +83,88 @@ begin end$$; NOTICE: c4 = (1,2) NOTICE: c8 = (1,2) +do $$ declare c two_int8s; d nested_int8s; +begin + c := row(1,2); + d := row(c, row(c.q1, c.q2+1)); + raise notice 'c = %, d = %', c, d; + c.q1 := 10; + d.c1 := row(11,12); + d.c2.q2 := 42; + raise notice 'c = %, d = %', c, d; + raise notice 'c.q1 = %, d.c2 = %', c.q1, d.c2; + raise notice '(d).c2.q2 = %', (d).c2.q2; -- doesn't work without parens + raise notice '(d.c2).q2 = %', (d.c2).q2; -- doesn't work without parens +end$$; +NOTICE: c = (1,2), d = ("(1,2)","(1,3)") +NOTICE: c = (10,2), d = ("(11,12)","(1,42)") +NOTICE: c.q1 = 10, d.c2 = (1,42) +NOTICE: (d).c2.q2 = 42 +NOTICE: (d.c2).q2 = 42 +-- block-qualified naming +do $$ <<b>> declare c two_int8s; d nested_int8s; +begin + b.c := row(1,2); + b.d := row(b.c, row(b.c.q1, b.c.q2+1)); + raise notice 'b.c = %, b.d = %', b.c, b.d; + b.c.q1 := 10; + b.d.c1 := row(11,12); + b.d.c2.q2 := 42; + raise notice 'b.c = %, b.d = %', b.c, b.d; + raise notice 'b.c.q1 = %, b.d.c2 = %', b.c.q1, b.d.c2; + raise notice '(b.d).c2.q2 = %', (b.d).c2.q2; -- doesn't work without parens + raise notice '(b.d.c2).q2 = %', (b.d.c2).q2; -- doesn't work without parens +end$$; +NOTICE: b.c = (1,2), b.d = ("(1,2)","(1,3)") +NOTICE: b.c = (10,2), b.d = ("(11,12)","(1,42)") +NOTICE: b.c.q1 = 10, b.d.c2 = (1,42) +NOTICE: (b.d).c2.q2 = 42 +NOTICE: (b.d.c2).q2 = 42 +-- error cases +do $$ declare c two_int8s; begin c.x = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "c.x = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ declare c nested_int8s; begin c.x = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "c.x = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ declare c nested_int8s; begin c.x.q1 = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "c.x.q1 = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ declare c nested_int8s; begin c.c2.x = 1; end $$; +ERROR: cannot assign to field "x" of column "c" because there is no such column in data type two_int8s +LINE 1: c.c2.x = 1 + ^ +QUERY: c.c2.x = 1 +CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment +do $$ declare c nested_int8s; begin d.c2.x = 1; end $$; +ERROR: "d.c2.x" is not a known variable +LINE 1: do $$ declare c nested_int8s; begin d.c2.x = 1; end $$; + ^ +do $$ <<b>> declare c two_int8s; begin b.c.x = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "b.c.x = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ <<b>> declare c nested_int8s; begin b.c.x = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "b.c.x = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ <<b>> declare c nested_int8s; begin b.c.x.q1 = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "b.c.x.q1 = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ <<b>> declare c nested_int8s; begin b.c.c2.x = 1; end $$; +ERROR: cannot assign to field "x" of column "b" because there is no such column in data type two_int8s +LINE 1: b.c.c2.x = 1 + ^ +QUERY: b.c.c2.x = 1 +CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment +do $$ <<b>> declare c nested_int8s; begin b.d.c2.x = 1; end $$; +ERROR: "b.d.c2" is not a known variable +LINE 1: do $$ <<b>> declare c nested_int8s; begin b.d.c2.x = 1; end ... + ^ -- check passing composite result to another function create function getq1(two_int8s) returns int8 language plpgsql as $$ declare r two_int8s; begin r := $1; return r.q1; end $$; diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c index b610b28d70..0225f5911d 100644 --- a/src/pl/plpgsql/src/pl_comp.c +++ b/src/pl/plpgsql/src/pl_comp.c @@ -1456,7 +1456,8 @@ plpgsql_parse_dblword(char *word1, char *word2, /* * We should do nothing in DECLARE sections. In SQL expressions, we * really only need to make sure that RECFIELD datums are created when - * needed. + * needed. In all the cases handled by this function, returning a T_DATUM + * with a two-word idents string is the right thing. */ if (plpgsql_IdentifierLookup != IDENTIFIER_LOOKUP_DECLARE) { @@ -1530,40 +1531,53 @@ plpgsql_parse_tripword(char *word1, char *word2, char *word3, List *idents; int nnames; - idents = list_make3(makeString(word1), - makeString(word2), - makeString(word3)); - /* - * We should do nothing in DECLARE sections. In SQL expressions, we - * really only need to make sure that RECFIELD datums are created when - * needed. + * We should do nothing in DECLARE sections. In SQL expressions, we need + * to make sure that RECFIELD datums are created when needed, and we need + * to be careful about how many names are reported as belonging to the + * T_DATUM: the third word could be a sub-field reference, which we don't + * care about here. */ if (plpgsql_IdentifierLookup != IDENTIFIER_LOOKUP_DECLARE) { /* - * Do a lookup in the current namespace stack. Must find a qualified + * Do a lookup in the current namespace stack. Must find a record * reference, else ignore. */ ns = plpgsql_ns_lookup(plpgsql_ns_top(), false, word1, word2, word3, &nnames); - if (ns != NULL && nnames == 2) + if (ns != NULL) { switch (ns->itemtype) { case PLPGSQL_NSTYPE_REC: { - /* - * words 1/2 are a record name, so third word could be - * a field in this record. - */ PLpgSQL_rec *rec; PLpgSQL_recfield *new; rec = (PLpgSQL_rec *) (plpgsql_Datums[ns->itemno]); - new = plpgsql_build_recfield(rec, word3); - + if (nnames == 1) + { + /* + * First word is a record name, so second word + * could be a field in this record (and the third, + * a sub-field). We build a RECFIELD datum + * whether it is or not --- any error will be + * detected later. + */ + new = plpgsql_build_recfield(rec, word2); + idents = list_make2(makeString(word1), + makeString(word2)); + } + else + { + /* Block-qualified reference to record variable. */ + new = plpgsql_build_recfield(rec, word3); + idents = list_make3(makeString(word1), + makeString(word2), + makeString(word3)); + } wdatum->datum = (PLpgSQL_datum *) new; wdatum->ident = NULL; wdatum->quoted = false; /* not used */ @@ -1578,6 +1592,9 @@ plpgsql_parse_tripword(char *word1, char *word2, char *word3, } /* Nothing found */ + idents = list_make3(makeString(word1), + makeString(word2), + makeString(word3)); cword->idents = idents; return false; } diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index 051544a3b4..dece013e23 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -2004,7 +2004,7 @@ loop_body : proc_sect K_END K_LOOP opt_label ';' * variable. (The composite case is probably a syntax error, but we'll let * the core parser decide that.) Normally, we should assume that such a * word is a SQL statement keyword that isn't also a plpgsql keyword. - * However, if the next token is assignment or '[', it can't be a valid + * However, if the next token is assignment or '[' or '.', it can't be a valid * SQL statement, and what we're probably looking at is an intended variable * assignment. Give an appropriate complaint for that, instead of letting * the core parser throw an unhelpful "syntax error". @@ -2023,7 +2023,8 @@ stmt_execsql : K_IMPORT tok = yylex(); plpgsql_push_back_token(tok); - if (tok == '=' || tok == COLON_EQUALS || tok == '[') + if (tok == '=' || tok == COLON_EQUALS || + tok == '[' || tok == '.') word_is_not_variable(&($1), @1); $$ = make_execsql_stmt(T_WORD, @1); } @@ -2033,7 +2034,8 @@ stmt_execsql : K_IMPORT tok = yylex(); plpgsql_push_back_token(tok); - if (tok == '=' || tok == COLON_EQUALS || tok == '[') + if (tok == '=' || tok == COLON_EQUALS || + tok == '[' || tok == '.') cword_is_not_variable(&($1), @1); $$ = make_execsql_stmt(T_CWORD, @1); } diff --git a/src/pl/plpgsql/src/sql/plpgsql_array.sql b/src/pl/plpgsql/src/sql/plpgsql_array.sql new file mode 100644 index 0000000000..4c3f26be10 --- /dev/null +++ b/src/pl/plpgsql/src/sql/plpgsql_array.sql @@ -0,0 +1,79 @@ +-- +-- Tests for PL/pgSQL handling of array variables +-- +-- We also check arrays of composites here, so this has some overlap +-- with the plpgsql_record tests. +-- + +create type complex as (r float8, i float8); +create type quadarray as (c1 complex[], c2 complex); + +do $$ declare a int[]; +begin a := array[1,2]; a[3] := 4; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a[3] := 4; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a[1][4] := 4; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a[1] := 23::text; raise notice 'a = %', a; end$$; -- lax typing + +do $$ declare a int[]; +begin a := array[1,2]; a[2:3] := array[3,4]; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a := array[1,2]; a[2] := a[2] + 1; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a[1:2] := array[3,4]; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a[1:2] := 4; raise notice 'a = %', a; end$$; -- error + +do $$ declare a complex[]; +begin a[1] := (1,2); a[1].i := 11; raise notice 'a = %', a; end$$; + +do $$ declare a complex[]; +begin a[1].i := 11; raise notice 'a = %, a[1].i = %', a, a[1].i; end$$; + +-- perhaps this ought to work, but for now it doesn't: +do $$ declare a complex[]; +begin a[1:2].i := array[11,12]; raise notice 'a = %', a; end$$; + +do $$ declare a quadarray; +begin a.c1[1].i := 11; raise notice 'a = %, a.c1[1].i = %', a, a.c1[1].i; end$$; + +do $$ declare a int[]; +begin a := array_agg(x) from (values(1),(2),(3)) v(x); raise notice 'a = %', a; end$$; + +create temp table onecol as select array[1,2] as f1; + +do $$ declare a int[]; +begin a := f1 from onecol; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a := * from onecol for update; raise notice 'a = %', a; end$$; + +-- error cases: + +do $$ declare a int[]; +begin a := from onecol; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a := f1, f1 from onecol; raise notice 'a = %', a; end$$; + +insert into onecol values(array[11]); + +do $$ declare a int[]; +begin a := f1 from onecol; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a := f1 from onecol limit 1; raise notice 'a = %', a; end$$; + +do $$ declare a real; +begin a[1] := 2; raise notice 'a = %', a; end$$; + +do $$ declare a complex; +begin a.r[1] := 2; raise notice 'a = %', a; end$$; diff --git a/src/pl/plpgsql/src/sql/plpgsql_record.sql b/src/pl/plpgsql/src/sql/plpgsql_record.sql index 128846e610..be10f00b1e 100644 --- a/src/pl/plpgsql/src/sql/plpgsql_record.sql +++ b/src/pl/plpgsql/src/sql/plpgsql_record.sql @@ -4,6 +4,7 @@ create type two_int4s as (f1 int4, f2 int4); create type two_int8s as (q1 int8, q2 int8); +create type nested_int8s as (c1 two_int8s, c2 two_int8s); -- base-case return of a composite type create function retc(int) returns two_int8s language plpgsql as @@ -59,6 +60,47 @@ begin raise notice 'c8 = %', c8; end$$; +do $$ declare c two_int8s; d nested_int8s; +begin + c := row(1,2); + d := row(c, row(c.q1, c.q2+1)); + raise notice 'c = %, d = %', c, d; + c.q1 := 10; + d.c1 := row(11,12); + d.c2.q2 := 42; + raise notice 'c = %, d = %', c, d; + raise notice 'c.q1 = %, d.c2 = %', c.q1, d.c2; + raise notice '(d).c2.q2 = %', (d).c2.q2; -- doesn't work without parens + raise notice '(d.c2).q2 = %', (d.c2).q2; -- doesn't work without parens +end$$; + +-- block-qualified naming +do $$ <<b>> declare c two_int8s; d nested_int8s; +begin + b.c := row(1,2); + b.d := row(b.c, row(b.c.q1, b.c.q2+1)); + raise notice 'b.c = %, b.d = %', b.c, b.d; + b.c.q1 := 10; + b.d.c1 := row(11,12); + b.d.c2.q2 := 42; + raise notice 'b.c = %, b.d = %', b.c, b.d; + raise notice 'b.c.q1 = %, b.d.c2 = %', b.c.q1, b.d.c2; + raise notice '(b.d).c2.q2 = %', (b.d).c2.q2; -- doesn't work without parens + raise notice '(b.d.c2).q2 = %', (b.d.c2).q2; -- doesn't work without parens +end$$; + +-- error cases +do $$ declare c two_int8s; begin c.x = 1; end $$; +do $$ declare c nested_int8s; begin c.x = 1; end $$; +do $$ declare c nested_int8s; begin c.x.q1 = 1; end $$; +do $$ declare c nested_int8s; begin c.c2.x = 1; end $$; +do $$ declare c nested_int8s; begin d.c2.x = 1; end $$; +do $$ <<b>> declare c two_int8s; begin b.c.x = 1; end $$; +do $$ <<b>> declare c nested_int8s; begin b.c.x = 1; end $$; +do $$ <<b>> declare c nested_int8s; begin b.c.x.q1 = 1; end $$; +do $$ <<b>> declare c nested_int8s; begin b.c.c2.x = 1; end $$; +do $$ <<b>> declare c nested_int8s; begin b.d.c2.x = 1; end $$; + -- check passing composite result to another function create function getq1(two_int8s) returns int8 language plpgsql as $$ declare r two_int8s; begin r := $1; return r.q1; end $$; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index cb5c7f9fea..5c1db1dcfb 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -1311,12 +1311,11 @@ copy_plpgsql_datums(PLpgSQL_execstate *estate, case PLPGSQL_DTYPE_ROW: case PLPGSQL_DTYPE_RECFIELD: - case PLPGSQL_DTYPE_ARRAYELEM: /* * These datum records are read-only at runtime, so no need to - * copy them (well, RECFIELD and ARRAYELEM contain cached - * data, but we'd just as soon centralize the caching anyway). + * copy them (well, RECFIELD contains cached data, but we'd + * just as soon centralize the caching anyway). */ outdatum = indatum; break; @@ -4138,9 +4137,6 @@ plpgsql_estate_setup(PLpgSQL_execstate *estate, * * NB: the result of the evaluation is no longer valid after this is done, * unless it is a pass-by-value datatype. - * - * NB: if you change this code, see also the hacks in exec_assign_value's - * PLPGSQL_DTYPE_ARRAYELEM case for partial cleanup after subscript evals. * ---------- */ static void @@ -5290,198 +5286,6 @@ exec_assign_value(PLpgSQL_execstate *estate, break; } - case PLPGSQL_DTYPE_ARRAYELEM: - { - /* - * Target is an element of an array - */ - PLpgSQL_arrayelem *arrayelem; - int nsubscripts; - int i; - PLpgSQL_expr *subscripts[MAXDIM]; - int subscriptvals[MAXDIM]; - Datum oldarraydatum, - newarraydatum, - coerced_value; - bool oldarrayisnull; - Oid parenttypoid; - int32 parenttypmod; - SPITupleTable *save_eval_tuptable; - MemoryContext oldcontext; - - /* - * We need to do subscript evaluation, which might require - * evaluating general expressions; and the caller might have - * done that too in order to prepare the input Datum. We have - * to save and restore the caller's SPI_execute result, if - * any. - */ - save_eval_tuptable = estate->eval_tuptable; - estate->eval_tuptable = NULL; - - /* - * To handle constructs like x[1][2] := something, we have to - * be prepared to deal with a chain of arrayelem datums. Chase - * back to find the base array datum, and save the subscript - * expressions as we go. (We are scanning right to left here, - * but want to evaluate the subscripts left-to-right to - * minimize surprises.) Note that arrayelem is left pointing - * to the leftmost arrayelem datum, where we will cache the - * array element type data. - */ - nsubscripts = 0; - do - { - arrayelem = (PLpgSQL_arrayelem *) target; - if (nsubscripts >= MAXDIM) - ereport(ERROR, - (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), - errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)", - nsubscripts + 1, MAXDIM))); - subscripts[nsubscripts++] = arrayelem->subscript; - target = estate->datums[arrayelem->arrayparentno]; - } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); - - /* Fetch current value of array datum */ - exec_eval_datum(estate, target, - &parenttypoid, &parenttypmod, - &oldarraydatum, &oldarrayisnull); - - /* Update cached type data if necessary */ - if (arrayelem->parenttypoid != parenttypoid || - arrayelem->parenttypmod != parenttypmod) - { - Oid arraytypoid; - int32 arraytypmod = parenttypmod; - int16 arraytyplen; - Oid elemtypoid; - int16 elemtyplen; - bool elemtypbyval; - char elemtypalign; - - /* If target is domain over array, reduce to base type */ - arraytypoid = getBaseTypeAndTypmod(parenttypoid, - &arraytypmod); - - /* ... and identify the element type */ - elemtypoid = get_element_type(arraytypoid); - if (!OidIsValid(elemtypoid)) - ereport(ERROR, - (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("subscripted object is not an array"))); - - /* Collect needed data about the types */ - arraytyplen = get_typlen(arraytypoid); - - get_typlenbyvalalign(elemtypoid, - &elemtyplen, - &elemtypbyval, - &elemtypalign); - - /* Now safe to update the cached data */ - arrayelem->parenttypoid = parenttypoid; - arrayelem->parenttypmod = parenttypmod; - arrayelem->arraytypoid = arraytypoid; - arrayelem->arraytypmod = arraytypmod; - arrayelem->arraytyplen = arraytyplen; - arrayelem->elemtypoid = elemtypoid; - arrayelem->elemtyplen = elemtyplen; - arrayelem->elemtypbyval = elemtypbyval; - arrayelem->elemtypalign = elemtypalign; - } - - /* - * Evaluate the subscripts, switch into left-to-right order. - * Like the expression built by ExecInitSubscriptingRef(), - * complain if any subscript is null. - */ - for (i = 0; i < nsubscripts; i++) - { - bool subisnull; - - subscriptvals[i] = - exec_eval_integer(estate, - subscripts[nsubscripts - 1 - i], - &subisnull); - if (subisnull) - ereport(ERROR, - (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), - errmsg("array subscript in assignment must not be null"))); - - /* - * Clean up in case the subscript expression wasn't - * simple. We can't do exec_eval_cleanup, but we can do - * this much (which is safe because the integer subscript - * value is surely pass-by-value), and we must do it in - * case the next subscript expression isn't simple either. - */ - if (estate->eval_tuptable != NULL) - SPI_freetuptable(estate->eval_tuptable); - estate->eval_tuptable = NULL; - } - - /* Now we can restore caller's SPI_execute result if any. */ - Assert(estate->eval_tuptable == NULL); - estate->eval_tuptable = save_eval_tuptable; - - /* Coerce source value to match array element type. */ - coerced_value = exec_cast_value(estate, - value, - &isNull, - valtype, - valtypmod, - arrayelem->elemtypoid, - arrayelem->arraytypmod); - - /* - * If the original array is null, cons up an empty array so - * that the assignment can proceed; we'll end with a - * one-element array containing just the assigned-to - * subscript. This only works for varlena arrays, though; for - * fixed-length array types we skip the assignment. We can't - * support assignment of a null entry into a fixed-length - * array, either, so that's a no-op too. This is all ugly but - * corresponds to the current behavior of execExpr*.c. - */ - if (arrayelem->arraytyplen > 0 && /* fixed-length array? */ - (oldarrayisnull || isNull)) - return; - - /* empty array, if any, and newarraydatum are short-lived */ - oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate)); - - if (oldarrayisnull) - oldarraydatum = PointerGetDatum(construct_empty_array(arrayelem->elemtypoid)); - - /* - * Build the modified array value. - */ - newarraydatum = array_set_element(oldarraydatum, - nsubscripts, - subscriptvals, - coerced_value, - isNull, - arrayelem->arraytyplen, - arrayelem->elemtyplen, - arrayelem->elemtypbyval, - arrayelem->elemtypalign); - - MemoryContextSwitchTo(oldcontext); - - /* - * Assign the new array to the base variable. It's never NULL - * at this point. Note that if the target is a domain, - * coercing the base array type back up to the domain will - * happen within exec_assign_value. - */ - exec_assign_value(estate, target, - newarraydatum, - false, - arrayelem->arraytypoid, - arrayelem->arraytypmod); - break; - } - default: elog(ERROR, "unrecognized dtype: %d", target->dtype); } @@ -5492,8 +5296,8 @@ exec_assign_value(PLpgSQL_execstate *estate, * * The type oid, typmod, value in Datum format, and null flag are returned. * - * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums; - * that's not needed because we never pass references to such datums to SPI. + * At present this doesn't handle PLpgSQL_expr datums; that's not needed + * because we never pass references to such datums to SPI. * * NOTE: the returned Datum points right at the stored value in the case of * pass-by-reference datatypes. Generally callers should take care not to diff --git a/src/pl/plpgsql/src/pl_funcs.c b/src/pl/plpgsql/src/pl_funcs.c index ee60ced583..17895872c0 100644 --- a/src/pl/plpgsql/src/pl_funcs.c +++ b/src/pl/plpgsql/src/pl_funcs.c @@ -768,9 +768,6 @@ plpgsql_free_function_memory(PLpgSQL_function *func) break; case PLPGSQL_DTYPE_RECFIELD: break; - case PLPGSQL_DTYPE_ARRAYELEM: - free_expr(((PLpgSQL_arrayelem *) d)->subscript); - break; default: elog(ERROR, "unrecognized data type: %d", d->dtype); } @@ -1704,12 +1701,6 @@ plpgsql_dumptree(PLpgSQL_function *func) ((PLpgSQL_recfield *) d)->fieldname, ((PLpgSQL_recfield *) d)->recparentno); break; - case PLPGSQL_DTYPE_ARRAYELEM: - printf("ARRAYELEM of VAR %d subscript ", - ((PLpgSQL_arrayelem *) d)->arrayparentno); - dump_expr(((PLpgSQL_arrayelem *) d)->subscript); - printf("\n"); - break; default: printf("??? unknown data type %d\n", d->dtype); } diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index dece013e23..4e7df1c2bb 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -177,11 +177,10 @@ static void check_raise_parameters(PLpgSQL_stmt_raise *stmt); %type <list> decl_cursor_arglist %type <nsitem> decl_aliasitem -%type <expr> expr_until_semi expr_until_rightbracket +%type <expr> expr_until_semi %type <expr> expr_until_then expr_until_loop opt_expr_until_when %type <expr> opt_exitcond -%type <datum> assign_var %type <var> cursor_variable %type <datum> decl_cursor_arg %type <forvariable> for_variable @@ -1155,16 +1154,23 @@ getdiag_item : } ; -getdiag_target : assign_var +getdiag_target : T_DATUM { - if ($1->dtype == PLPGSQL_DTYPE_ROW || - $1->dtype == PLPGSQL_DTYPE_REC) + /* + * In principle we should support a getdiag_target + * that is an array element, but for now we don't, so + * just throw an error if next token is '['. + */ + if ($1.datum->dtype == PLPGSQL_DTYPE_ROW || + $1.datum->dtype == PLPGSQL_DTYPE_REC || + plpgsql_peek() == '[') ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), errmsg("\"%s\" is not a scalar variable", - ((PLpgSQL_variable *) $1)->refname), + NameOfDatum(&($1))), parser_errposition(@1))); - $$ = $1; + check_assignable($1.datum, @1); + $$ = $1.datum; } | T_WORD { @@ -1178,29 +1184,6 @@ getdiag_target : assign_var } ; - -assign_var : T_DATUM - { - check_assignable($1.datum, @1); - $$ = $1.datum; - } - | assign_var '[' expr_until_rightbracket - { - PLpgSQL_arrayelem *new; - - new = palloc0(sizeof(PLpgSQL_arrayelem)); - new->dtype = PLPGSQL_DTYPE_ARRAYELEM; - new->subscript = $3; - new->arrayparentno = $1->dno; - /* initialize cached type data to "not valid" */ - new->parenttypoid = InvalidOid; - - plpgsql_adddatum((PLpgSQL_datum *) new); - - $$ = (PLpgSQL_datum *) new; - } - ; - stmt_if : K_IF expr_until_then proc_sect stmt_elsifs stmt_else K_END K_IF ';' { PLpgSQL_stmt_if *new; @@ -2471,10 +2454,6 @@ expr_until_semi : { $$ = read_sql_expression(';', ";"); } ; -expr_until_rightbracket : - { $$ = read_sql_expression(']', "]"); } - ; - expr_until_then : { $$ = read_sql_expression(K_THEN, "THEN"); } ; @@ -3493,11 +3472,6 @@ check_assignable(PLpgSQL_datum *datum, int location) check_assignable(plpgsql_Datums[((PLpgSQL_recfield *) datum)->recparentno], location); break; - case PLPGSQL_DTYPE_ARRAYELEM: - /* assignable if parent array is */ - check_assignable(plpgsql_Datums[((PLpgSQL_arrayelem *) datum)->arrayparentno], - location); - break; default: elog(ERROR, "unrecognized dtype: %d", datum->dtype); break; diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index d152a4354b..a7791dc490 100644 --- a/src/pl/plpgsql/src/plpgsql.h +++ b/src/pl/plpgsql/src/plpgsql.h @@ -64,7 +64,6 @@ typedef enum PLpgSQL_datum_type PLPGSQL_DTYPE_ROW, PLPGSQL_DTYPE_REC, PLPGSQL_DTYPE_RECFIELD, - PLPGSQL_DTYPE_ARRAYELEM, PLPGSQL_DTYPE_PROMISE } PLpgSQL_datum_type; @@ -261,7 +260,7 @@ typedef struct PLpgSQL_expr * Generic datum array item * * PLpgSQL_datum is the common supertype for PLpgSQL_var, PLpgSQL_row, - * PLpgSQL_rec, PLpgSQL_recfield, and PLpgSQL_arrayelem. + * PLpgSQL_rec, and PLpgSQL_recfield. */ typedef struct PLpgSQL_datum { @@ -422,30 +421,6 @@ typedef struct PLpgSQL_recfield /* if rectupledescid == INVALID_TUPLEDESC_IDENTIFIER, finfo isn't valid */ } PLpgSQL_recfield; -/* - * Element of array variable - */ -typedef struct PLpgSQL_arrayelem -{ - PLpgSQL_datum_type dtype; - int dno; - /* end of PLpgSQL_datum fields */ - - PLpgSQL_expr *subscript; - int arrayparentno; /* dno of parent array variable */ - - /* Remaining fields are cached info about the array variable's type */ - Oid parenttypoid; /* type of array variable; 0 if not yet set */ - int32 parenttypmod; /* typmod of array variable */ - Oid arraytypoid; /* OID of actual array type */ - int32 arraytypmod; /* typmod of array (and its elements too) */ - int16 arraytyplen; /* typlen of array type */ - Oid elemtypoid; /* OID of array element type */ - int16 elemtyplen; /* typlen of element type */ - bool elemtypbyval; /* element type is pass-by-value? */ - char elemtypalign; /* typalign of element type */ -} PLpgSQL_arrayelem; - /* * Item in the compilers namespace tree */ diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out index 6ae1b6e3d4..5a0bd1610c 100644 --- a/src/test/regress/expected/plpgsql.out +++ b/src/test/regress/expected/plpgsql.out @@ -4230,7 +4230,6 @@ drop function tftest(int); create or replace function rttest() returns setof int as $$ declare rc int; - rca int[]; begin return query values(10),(20); get diagnostics rc = row_count; @@ -4239,12 +4238,11 @@ begin get diagnostics rc = row_count; raise notice '% %', found, rc; return query execute 'values(10),(20)'; - -- just for fun, let's use array elements as targets - get diagnostics rca[1] = row_count; - raise notice '% %', found, rca[1]; + get diagnostics rc = row_count; + raise notice '% %', found, rc; return query execute 'select * from (values(10),(20)) f(a) where false'; - get diagnostics rca[2] = row_count; - raise notice '% %', found, rca[2]; + get diagnostics rc = row_count; + raise notice '% %', found, rc; end; $$ language plpgsql; select * from rttest(); diff --git a/src/test/regress/sql/plpgsql.sql b/src/test/regress/sql/plpgsql.sql index 07c60c80e4..781666a83a 100644 --- a/src/test/regress/sql/plpgsql.sql +++ b/src/test/regress/sql/plpgsql.sql @@ -3497,7 +3497,6 @@ drop function tftest(int); create or replace function rttest() returns setof int as $$ declare rc int; - rca int[]; begin return query values(10),(20); get diagnostics rc = row_count; @@ -3506,12 +3505,11 @@ begin get diagnostics rc = row_count; raise notice '% %', found, rc; return query execute 'values(10),(20)'; - -- just for fun, let's use array elements as targets - get diagnostics rca[1] = row_count; - raise notice '% %', found, rca[1]; + get diagnostics rc = row_count; + raise notice '% %', found, rc; return query execute 'select * from (values(10),(20)) f(a) where false'; - get diagnostics rca[2] = row_count; - raise notice '% %', found, rca[2]; + get diagnostics rc = row_count; + raise notice '% %', found, rc; end; $$ language plpgsql;
ne 13. 12. 2020 v 22:41 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I wrote:
> So my idea here is to add a parsing-mode option to raw_parser(),
> which would be an enum with values like "normal SQL statement",
> "expression only", "type name", "plpgsql assignment statement".
Here's a fleshed-out patch series that attacks things that way.
I'm a lot better pleased with this than with my original approach.
0001 creates the basic infrastructure for "raw parse modes", and as
proof of concept simplifies typeStringToTypeName(). There's a minor
functional improvement there, which is that we can now use the core
parser's error cursor position, so instead of
regression=# do $$ declare x int[23/] ; begin end $$;
ERROR: syntax error at or near "/"
LINE 1: do $$ declare x int[23/] ; begin end $$;
^
CONTEXT: invalid type name "int[23/] "
you get
regression=# do $$ declare x int[23/] ; begin end $$;
ERROR: syntax error at or near "/"
LINE 1: do $$ declare x int[23/] ; begin end $$;
^
CONTEXT: invalid type name "int[23/] "
It's possible we could dispense with the error context callback
in typeStringToTypeName altogether, but I've not experimented much.
0002 tackles the next problem, which is to make this feature accessible
through SPI. There are a couple of possibly-controversial choices here.
Following the principle that we should avoid changing documented SPI
interfaces, we need a new version of SPI_prepare to pass RawParseMode
through. This'll be the fourth one :-(, so I decided it was time to
try to make a definition that can stay API-compatible through future
changes. So it takes a struct of options, and I added a promise that
zeroing the struct is enough to guarantee forward compatibility
through future additions.
This leaves both of the previous iterations, SPI_prepare_cursor
and SPI_prepare_params, unused anywhere in the core code.
I suppose we can't kill them (codesearch.debian.net knows of some
external uses) but I propose to mark them deprecated, with an eye
to at least removing their documentation someday.
I did not want to add a RawParseMode parameter to pg_parse_query(),
because that would have affected a larger number of unrelated modules,
and it would not have been great from a header-inclusion footprint
standpoint either. So I chose to pass down the mode from SPI by
having it just call raw_parser() directly instead of going through
pg_parse_query(). Perhaps this is a modularity violation, or perhaps
there's somebody who really wants the extra tracing overhead in
pg_parse_query() to apply to SPI queries. I'm open to discussing
whether this should be done differently.
(However, having made these two patches, I'm now wondering whether
there is any rhyme or reason to the existing state of affairs
with some callers going through pg_parse_query() while others use
raw_parser() directly. It's hard to knock making a different
choice in spi.c unless we have a coherent policy about which to
use where.)
Next, 0003 invents a raw parse mode for plpgsql expressions (which,
in some contexts, can be pretty nearly whole SELECT statements),
and uses that to get plpgsql out of the business of prefixing
"SELECT " to user-written text. I would not have bothered with this
as a standalone fix, but I think it does make for less-confusing
error messages --- we've definitely had novices ask "where'd this
SELECT come from?" in the past. (I cheated a bit on PERFORM, though.
Unlike other places, it needs to allow UNION, so it can't use the
same restricted syntax.)
0004 then reimplements plpgsql assignment. This is essentially the same
patch I submitted before, but redesigned to work with the infrastructure
from 0001-0003.
0005 adds documentation and test cases. It also fixes a couple
of pre-existing problems that the plpgsql parser had with assigning
to sub-fields of record fields, which I discovered while making the
tests.
Finally, 0006 removes plpgsql's ARRAYELEM datum type, on the grounds
that we don't need it anymore. This might be a little controversial
too, because there was still one way to reach the code: GET DIAGNOSTICS
with an array element as target would do so. However, that seems like
a pretty weird corner case. Reviewing the git history, I find that
I added support for that in commit 55caaaeba; but a check of the
associated discussion shows that there was no actual user request for
that, I'd just done it because it was easy and seemed more symmetric.
The amount of code involved here seems way more than is justified by
that one case, so I think we should just take it out and lose the
"feature". (I did think about whether GET DIAGNOSTICS could be
reimplemented on top of the new infrastructure, but it wouldn't be
easy because we don't have a SQL-expression representation of the
GET DIAGNOSTICS values. Moreover, going in that direction would add
an expression evaluation, making GET DIAGNOSTICS slower. So I think
we should just drop it.)
It is a really great patch. I did fast check and I didn't find any functionality issue
--
-- Name: footype; Type: TYPE; Schema: public; Owner: pavel
--
CREATE TYPE public.footype AS (
a integer,
b integer
);
ALTER TYPE public.footype OWNER TO pavel;
--
-- Name: bootype; Type: TYPE; Schema: public; Owner: pavel
--
CREATE TYPE public.bootype AS (
a integer,
f public.footype
);
ALTER TYPE public.bootype OWNER TO pavel;
--
-- Name: cootype; Type: TYPE; Schema: public; Owner: pavel
--
CREATE TYPE public.cootype AS (
a integer,
b integer[]
);
ALTER TYPE public.cootype OWNER TO pavel;
--
-- Name: dootype; Type: TYPE; Schema: public; Owner: pavel
--
CREATE TYPE public.dootype AS (
a integer,
b public.footype,
c public.footype[]
);
ALTER TYPE public.dootype OWNER TO pavel;
--
-- PostgreSQL database dump complete
--
-- Name: footype; Type: TYPE; Schema: public; Owner: pavel
--
CREATE TYPE public.footype AS (
a integer,
b integer
);
ALTER TYPE public.footype OWNER TO pavel;
--
-- Name: bootype; Type: TYPE; Schema: public; Owner: pavel
--
CREATE TYPE public.bootype AS (
a integer,
f public.footype
);
ALTER TYPE public.bootype OWNER TO pavel;
--
-- Name: cootype; Type: TYPE; Schema: public; Owner: pavel
--
CREATE TYPE public.cootype AS (
a integer,
b integer[]
);
ALTER TYPE public.cootype OWNER TO pavel;
--
-- Name: dootype; Type: TYPE; Schema: public; Owner: pavel
--
CREATE TYPE public.dootype AS (
a integer,
b public.footype,
c public.footype[]
);
ALTER TYPE public.dootype OWNER TO pavel;
--
-- PostgreSQL database dump complete
--
postgres=# do $$
<<lab>>
declare
a footype[];
b bootype;
ba bootype[];
c cootype[];
d dootype[];
x int default 1;
begin
a[10] := row(10,20);
a[11] := (30,40);
a[3] := (0,0);
a[3].a := 100;
raise notice '%', a;
b.a := 100;
b.f.a := 1000;
raise notice '%', b;
ba[0] := b;
ba[0].a = 33; ba[0].f := row(33,33);
lab.ba[0].f.a := 1000000;
raise notice '%', ba;
c[0].a := 10000;
c[0].b := ARRAY[1,2,4];
lab.c[0].b[1] := 10000;
raise notice '% %', c, c[0].b[x];
d[0].a := 100;
d[0].b.a := 101;
d[0].c[x+1].a := 102;
raise notice '%', d;
end;
$$;
NOTICE: [3:11]={"(100,0)",NULL,NULL,NULL,NULL,NULL,NULL,"(10,20)","(30,40)"}
NOTICE: (100,"(1000,)")
NOTICE: [0:0]={"(33,\"(1000000,33)\")"}
NOTICE: [0:0]={"(10000,\"{10000,2,4}\")"} 10000
NOTICE: [0:0]={"(100,\"(101,)\",\"[2:2]={\"\"(102,)\"\"}\")"}
DO
<<lab>>
declare
a footype[];
b bootype;
ba bootype[];
c cootype[];
d dootype[];
x int default 1;
begin
a[10] := row(10,20);
a[11] := (30,40);
a[3] := (0,0);
a[3].a := 100;
raise notice '%', a;
b.a := 100;
b.f.a := 1000;
raise notice '%', b;
ba[0] := b;
ba[0].a = 33; ba[0].f := row(33,33);
lab.ba[0].f.a := 1000000;
raise notice '%', ba;
c[0].a := 10000;
c[0].b := ARRAY[1,2,4];
lab.c[0].b[1] := 10000;
raise notice '% %', c, c[0].b[x];
d[0].a := 100;
d[0].b.a := 101;
d[0].c[x+1].a := 102;
raise notice '%', d;
end;
$$;
NOTICE: [3:11]={"(100,0)",NULL,NULL,NULL,NULL,NULL,NULL,"(10,20)","(30,40)"}
NOTICE: (100,"(1000,)")
NOTICE: [0:0]={"(33,\"(1000000,33)\")"}
NOTICE: [0:0]={"(10000,\"{10000,2,4}\")"} 10000
NOTICE: [0:0]={"(100,\"(101,)\",\"[2:2]={\"\"(102,)\"\"}\")"}
DO
Regards
Pavel
regards, tom lane
Hi
I checked a performance and it looks so access to record's field is faster, but an access to arrays field is significantly slower
do $$
declare
a int[];
aux int;
rep boolean default true;
begin
for i in 1..5000
loop
a[i]:= 5000 - i;
end loop;
raise notice '%', a[1:10];
while rep
loop
rep := false;
for i in 1..5000
loop
if a[i] > a[i+1] then
aux := a[i];
a[i] := a[i+1]; a[i+1] := aux;
rep := true;
end if;
end loop;
end loop;
raise notice '%', a[1:10];
end;
$$;
declare
a int[];
aux int;
rep boolean default true;
begin
for i in 1..5000
loop
a[i]:= 5000 - i;
end loop;
raise notice '%', a[1:10];
while rep
loop
rep := false;
for i in 1..5000
loop
if a[i] > a[i+1] then
aux := a[i];
a[i] := a[i+1]; a[i+1] := aux;
rep := true;
end if;
end loop;
end loop;
raise notice '%', a[1:10];
end;
$$;
This code is about 3x slower than master (40 sec x 12 sec). I believe so this is a worst case scenario
I tested pi calculation
CREATE OR REPLACE FUNCTION pi_est_1(n int)
RETURNS numeric AS $$
DECLARE
accum double precision DEFAULT 1.0;
c1 double precision DEFAULT 2.0;
c2 double precision DEFAULT 1.0;
BEGIN
FOR i IN 1..n
LOOP
accum := accum * ((c1 * c1) / (c2 * (c2 + 2.0)));
c1 := c1 + 2.0;
c2 := c2 + 2.0;
END LOOP;
RETURN accum * 2.0;
END;
$$ LANGUAGE plpgsql;
RETURNS numeric AS $$
DECLARE
accum double precision DEFAULT 1.0;
c1 double precision DEFAULT 2.0;
c2 double precision DEFAULT 1.0;
BEGIN
FOR i IN 1..n
LOOP
accum := accum * ((c1 * c1) / (c2 * (c2 + 2.0)));
c1 := c1 + 2.0;
c2 := c2 + 2.0;
END LOOP;
RETURN accum * 2.0;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION pi_est_2(n int)
RETURNS numeric AS $$
DECLARE
accum double precision DEFAULT 1.0;
c1 double precision DEFAULT 2.0;
c2 double precision DEFAULT 1.0;
BEGIN
FOR i IN 1..n
LOOP
accum := accum * ((c1 * c1) / (c2 * (c2 + double precision '2.0')));
c1 := c1 + double precision '2.0';
c2 := c2 + double precision '2.0';
END LOOP;
RETURN accum * double precision '2.0';
END;
$$ LANGUAGE plpgsql;
RETURNS numeric AS $$
DECLARE
accum double precision DEFAULT 1.0;
c1 double precision DEFAULT 2.0;
c2 double precision DEFAULT 1.0;
BEGIN
FOR i IN 1..n
LOOP
accum := accum * ((c1 * c1) / (c2 * (c2 + double precision '2.0')));
c1 := c1 + double precision '2.0';
c2 := c2 + double precision '2.0';
END LOOP;
RETURN accum * double precision '2.0';
END;
$$ LANGUAGE plpgsql;
And the performance is 10% slower than on master
Interesting point - the master is about 5% faster than pg13
Pavel Stehule <pavel.stehule@gmail.com> writes: > I checked a performance and it looks so access to record's field is faster, > but an access to arrays field is significantly slower Hmm, I'd drawn the opposite conclusion in my own testing ... > for i in 1..5000 > loop > if a[i] > a[i+1] then > aux := a[i]; > a[i] := a[i+1]; a[i+1] := aux; > rep := true; > end if; > end loop; ... but I now see that I'd not checked cases like "a[i] := a[j]". exec_check_rw_parameter() is being too conservative about whether it can optimize a case like that. The attached incremental patch fixes it. > I tested pi calculation > ... > And the performance is 10% slower than on master Can't reproduce that here. For the record, I get the following timings (medians of three runs) for your test cases: HEAD: sort: Time: 13974.709 ms (00:13.975) pi_est_1(10000000): Time: 3537.482 ms (00:03.537) pi_est_2(10000000): Time: 3546.557 ms (00:03.547) Patch v1: sort: Time: 47053.892 ms (00:47.054) pi_est_1(10000000): Time: 3456.078 ms (00:03.456) pi_est_2(10000000): Time: 3451.347 ms (00:03.451) + exec_check_rw_parameter fix: sort: Time: 12199.724 ms (00:12.200) pi_est_1(10000000): Time: 3357.955 ms (00:03.358) pi_est_2(10000000): Time: 3367.526 ms (00:03.368) I'm inclined to think that the differences in the pi calculation timings are mostly chance effects; there's certainly no reason why exec_check_rw_parameter should affect that test case at all. regards, tom lane diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c index 4c8a739bc4..15cb3b312f 100644 --- a/src/backend/utils/adt/arrayfuncs.c +++ b/src/backend/utils/adt/arrayfuncs.c @@ -2583,7 +2583,9 @@ array_set_element_expanded(Datum arraydatum, /* * Copy new element into array's context, if needed (we assume it's * already detoasted, so no junk should be created). If we fail further - * down, this memory is leaked, but that's reasonably harmless. + * down, this memory is leaked, but that's reasonably harmless. Note in + * particular that doing this early ensures sanity in case the source + * Datum is a pointer to a pass-by-ref element of this same array. */ if (!eah->typbyval && !isNull) { diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index 5c1db1dcfb..1378f40d4d 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -8181,14 +8181,7 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) contains_target_param((Node *) sbsref->refexpr, &target_dno)) return; - /* the other subexpressions must not contain target */ - if (contains_target_param((Node *) sbsref->refupperindexpr, - &target_dno) || - contains_target_param((Node *) sbsref->reflowerindexpr, - &target_dno) || - contains_target_param((Node *) sbsref->refassgnexpr, - &target_dno)) - return; + /* we do *not* need to restrict the subscripts or source expression */ /* OK, we can pass target as a read-write parameter */ expr->rwparam = target_dno;
po 14. 12. 2020 v 17:25 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I checked a performance and it looks so access to record's field is faster,
> but an access to arrays field is significantly slower
Hmm, I'd drawn the opposite conclusion in my own testing ...
> for i in 1..5000
> loop
> if a[i] > a[i+1] then
> aux := a[i];
> a[i] := a[i+1]; a[i+1] := aux;
> rep := true;
> end if;
> end loop;
... but I now see that I'd not checked cases like "a[i] := a[j]".
exec_check_rw_parameter() is being too conservative about whether
it can optimize a case like that. The attached incremental patch
fixes it.
> I tested pi calculation
> ...
> And the performance is 10% slower than on master
Can't reproduce that here. For the record, I get the following
timings (medians of three runs) for your test cases:
HEAD:
sort: Time: 13974.709 ms (00:13.975)
pi_est_1(10000000): Time: 3537.482 ms (00:03.537)
pi_est_2(10000000): Time: 3546.557 ms (00:03.547)
Patch v1:
sort: Time: 47053.892 ms (00:47.054)
pi_est_1(10000000): Time: 3456.078 ms (00:03.456)
pi_est_2(10000000): Time: 3451.347 ms (00:03.451)
+ exec_check_rw_parameter fix:
sort: Time: 12199.724 ms (00:12.200)
pi_est_1(10000000): Time: 3357.955 ms (00:03.358)
pi_est_2(10000000): Time: 3367.526 ms (00:03.368)
I'm inclined to think that the differences in the pi calculation
timings are mostly chance effects; there's certainly no reason
why exec_check_rw_parameter should affect that test case at all.
performance patch helps lot of for sort - with patch it is faster 5-10% than master 10864 x 12122 ms
I used
CFLAGS="-fno-omit-frame-pointer -Wall -Wmissing-prototypes -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -Werror=switch"
With these options the pi test was slower. When I used default, then there is no difference.
So it can be very good feature, new code has same speed or it is faster
Regards
Pavel
regards, tom lane
I realized that the speedup patch I posted yesterday is flawed: it's too aggressive about applying the R/W param mechanism, instead of not aggressive enough. To review, the point of that logic is that if we have an assignment like arrayvar := array_append(arrayvar, some-scalar-expression); a naive implementation would have array_append construct an entire new array, which we'd then have to copy into plpgsql's variable storage. Instead, if the array variable is in expanded-array format (which plpgsql encourages it to be) then we can pass the array parameter as a "read/write expanded datum", which array_append recognizes as license to scribble right on its input and return the modified input; that takes only O(1) time not O(N). Then plpgsql's assignment code notices that the expression result datum is the same pointer already stored in the variable, so it does nothing. With the patch at hand, a subscripted assignment a[i] := x becomes, essentially, a := subscriptingref(a, i, x); and we need to make the same sort of transformation to allow array_set_element to scribble right on the original value of "a" instead of making a copy. However, we can't simply not consider the source expression "x", as I proposed yesterday. For example, if we have a := subscriptingref(a, i, f(array_append(a, x))); it's not okay for array_append() to scribble on "a". The R/W param mechanism normally disallows any additional references to the target variable, which would prevent this error, but I broke that safety check with the 0007 patch. After thinking about this awhile, I decided that plpgsql's R/W param mechanism is really misdesigned. Instead of requiring the assignment source expression to be such that *all* its references to the target variable could be passed as R/W, we really want to identify *one* reference to the target variable to be passed as R/W, allowing any other ones to be passed read/only as they would be by default. As long as the R/W reference is a direct argument to the top-level (hence last to be executed) function in the expression, there is no harm in R/O references being passed to other lower parts of the expression. Nor is there any use-case for more than one argument of the top-level function being R/W. So the attached rewrite of the 0007 patch reimplements that logic to identify one single Param that references the target variable, and make only that Param pass a read/write reference, not any other Params referencing the target variable. This is a good change even without considering the assignment-reimplementation proposal, because even before this patchset we could have cases like arrayvar := array_append(arrayvar, arrayvar[i]); The existing code would be afraid to optimize this, but it's in fact safe. I also re-attach the 0001-0006 patches, which have not changed, just to keep the cfbot happy. regards, tom lane diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 1fa9f19f08..9e7f1590b5 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -12095,7 +12095,7 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd, * parse_analyze() or the rewriter, but instead we need to pass them * through parse_utilcmd.c to make them ready for execution. */ - raw_parsetree_list = raw_parser(cmd); + raw_parsetree_list = raw_parser(cmd, RAW_PARSE_DEFAULT); querytree_list = NIL; foreach(list_item, raw_parsetree_list) { diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 8f341ac006..88c76dd985 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -723,6 +723,15 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); */ %token NOT_LA NULLS_LA WITH_LA +/* + * The grammar likewise thinks these tokens are keywords, but they are never + * generated by the scanner. Rather, they can be injected by parser.c as + * the initial token of the string (using the lookahead-token mechanism + * implemented there). This provides a way to tell the grammar to parse + * something other than the usual list of SQL commands. + */ +%token MODE_TYPE_NAME + /* Precedence: lowest to highest */ %nonassoc SET /* see relation_expr_opt_alias */ @@ -787,11 +796,18 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); /* * The target production for the whole parse. + * + * Ordinarily we parse a list of statements, but if we see one of the + * special MODE_XXX symbols as first token, we parse something else. */ stmtblock: stmtmulti { pg_yyget_extra(yyscanner)->parsetree = $1; } + | MODE_TYPE_NAME Typename + { + pg_yyget_extra(yyscanner)->parsetree = list_make1($2); + } ; /* diff --git a/src/backend/parser/parse_type.c b/src/backend/parser/parse_type.c index 2709f6f9c7..633774b733 100644 --- a/src/backend/parser/parse_type.c +++ b/src/backend/parser/parse_type.c @@ -719,13 +719,6 @@ pts_error_callback(void *arg) const char *str = (const char *) arg; errcontext("invalid type name \"%s\"", str); - - /* - * Currently we just suppress any syntax error position report, rather - * than transforming to an "internal query" error. It's unlikely that a - * type name is complex enough to need positioning. - */ - errposition(0); } /* @@ -737,11 +730,7 @@ pts_error_callback(void *arg) TypeName * typeStringToTypeName(const char *str) { - StringInfoData buf; List *raw_parsetree_list; - SelectStmt *stmt; - ResTarget *restarget; - TypeCast *typecast; TypeName *typeName; ErrorContextCallback ptserrcontext; @@ -749,9 +738,6 @@ typeStringToTypeName(const char *str) if (strspn(str, " \t\n\r\f") == strlen(str)) goto fail; - initStringInfo(&buf); - appendStringInfo(&buf, "SELECT NULL::%s", str); - /* * Setup error traceback support in case of ereport() during parse */ @@ -760,58 +746,18 @@ typeStringToTypeName(const char *str) ptserrcontext.previous = error_context_stack; error_context_stack = &ptserrcontext; - raw_parsetree_list = raw_parser(buf.data); + raw_parsetree_list = raw_parser(str, RAW_PARSE_TYPE_NAME); error_context_stack = ptserrcontext.previous; - /* - * Make sure we got back exactly what we expected and no more; paranoia is - * justified since the string might contain anything. - */ - if (list_length(raw_parsetree_list) != 1) - goto fail; - stmt = (SelectStmt *) linitial_node(RawStmt, raw_parsetree_list)->stmt; - if (stmt == NULL || - !IsA(stmt, SelectStmt) || - stmt->distinctClause != NIL || - stmt->intoClause != NULL || - stmt->fromClause != NIL || - stmt->whereClause != NULL || - stmt->groupClause != NIL || - stmt->havingClause != NULL || - stmt->windowClause != NIL || - stmt->valuesLists != NIL || - stmt->sortClause != NIL || - stmt->limitOffset != NULL || - stmt->limitCount != NULL || - stmt->lockingClause != NIL || - stmt->withClause != NULL || - stmt->op != SETOP_NONE) - goto fail; - if (list_length(stmt->targetList) != 1) - goto fail; - restarget = (ResTarget *) linitial(stmt->targetList); - if (restarget == NULL || - !IsA(restarget, ResTarget) || - restarget->name != NULL || - restarget->indirection != NIL) - goto fail; - typecast = (TypeCast *) restarget->val; - if (typecast == NULL || - !IsA(typecast, TypeCast) || - typecast->arg == NULL || - !IsA(typecast->arg, A_Const)) - goto fail; + /* We should get back exactly one TypeName node. */ + Assert(list_length(raw_parsetree_list) == 1); + typeName = linitial_node(TypeName, raw_parsetree_list); - typeName = typecast->typeName; - if (typeName == NULL || - !IsA(typeName, TypeName)) - goto fail; + /* The grammar allows SETOF in TypeName, but we don't want that here. */ if (typeName->setof) goto fail; - pfree(buf.data); - return typeName; fail: diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c index be86eb37fe..b04be3bca5 100644 --- a/src/backend/parser/parser.c +++ b/src/backend/parser/parser.c @@ -35,11 +35,11 @@ static char *str_udeescape(const char *str, char escape, * raw_parser * Given a query in string form, do lexical and grammatical analysis. * - * Returns a list of raw (un-analyzed) parse trees. The immediate elements - * of the list are always RawStmt nodes. + * Returns a list of raw (un-analyzed) parse trees. The contents of the + * list have the form required by the specified RawParseMode. */ List * -raw_parser(const char *str) +raw_parser(const char *str, RawParseMode mode) { core_yyscan_t yyscanner; base_yy_extra_type yyextra; @@ -49,8 +49,22 @@ raw_parser(const char *str) yyscanner = scanner_init(str, &yyextra.core_yy_extra, &ScanKeywords, ScanKeywordTokens); - /* base_yylex() only needs this much initialization */ - yyextra.have_lookahead = false; + /* base_yylex() only needs us to initialize the lookahead token, if any */ + if (mode == RAW_PARSE_DEFAULT) + yyextra.have_lookahead = false; + else + { + /* this array is indexed by RawParseMode enum */ + static const int mode_token[] = { + 0, /* RAW_PARSE_DEFAULT */ + MODE_TYPE_NAME /* RAW_PARSE_TYPE_NAME */ + }; + + yyextra.have_lookahead = true; + yyextra.lookahead_token = mode_token[mode]; + yyextra.lookahead_yylloc = 0; + yyextra.lookahead_end = NULL; + } /* initialize the bison parser */ parser_init(&yyextra); @@ -104,7 +118,8 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner) cur_token = yyextra->lookahead_token; lvalp->core_yystype = yyextra->lookahead_yylval; *llocp = yyextra->lookahead_yylloc; - *(yyextra->lookahead_end) = yyextra->lookahead_hold_char; + if (yyextra->lookahead_end) + *(yyextra->lookahead_end) = yyextra->lookahead_hold_char; yyextra->have_lookahead = false; } else diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 3679799e50..64c06f615b 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -634,7 +634,7 @@ pg_parse_query(const char *query_string) if (log_parser_stats) ResetUsage(); - raw_parsetree_list = raw_parser(query_string); + raw_parsetree_list = raw_parser(query_string, RAW_PARSE_DEFAULT); if (log_parser_stats) ShowUsage("PARSER STATISTICS"); diff --git a/src/include/parser/parser.h b/src/include/parser/parser.h index 3bdeeb8b0b..ebbcb788cc 100644 --- a/src/include/parser/parser.h +++ b/src/include/parser/parser.h @@ -18,6 +18,24 @@ #include "nodes/parsenodes.h" +/* + * RawParseMode determines the form of the string that raw_parser() accepts: + * + * RAW_PARSE_DEFAULT: parse a semicolon-separated list of SQL commands, + * and return a List of RawStmt nodes. + * + * RAW_PARSE_TYPE_NAME: parse a type name, and return a one-element List + * containing a TypeName node. + * + * ... more to come ... + */ +typedef enum +{ + RAW_PARSE_DEFAULT = 0, + RAW_PARSE_TYPE_NAME +} RawParseMode; + +/* Values for the backslash_quote GUC */ typedef enum { BACKSLASH_QUOTE_OFF, @@ -32,7 +50,7 @@ extern PGDLLIMPORT bool standard_conforming_strings; /* Primary entry point for the raw parsing functions */ -extern List *raw_parser(const char *str); +extern List *raw_parser(const char *str, RawParseMode mode); /* Utility functions exported by gram.y (perhaps these should be elsewhere) */ extern List *SystemFuncName(char *name); diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index 8227bf0449..26732d875e 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -3661,7 +3661,7 @@ check_sql_expr(const char *stmt, int location, int leaderlen) error_context_stack = &syntax_errcontext; oldCxt = MemoryContextSwitchTo(plpgsql_compile_tmp_cxt); - (void) raw_parser(stmt); + (void) raw_parser(stmt, RAW_PARSE_DEFAULT); MemoryContextSwitchTo(oldCxt); /* Restore former ereport callback */ diff --git a/doc/src/sgml/spi.sgml b/doc/src/sgml/spi.sgml index 7752de0a4d..f0dcba946e 100644 --- a/doc/src/sgml/spi.sgml +++ b/doc/src/sgml/spi.sgml @@ -1105,6 +1105,11 @@ SPIPlanPtr SPI_prepare_cursor(const char * <parameter>command</parameter>, int < for the <structfield>options</structfield> field of <structname>DeclareCursorStmt</structname>. <function>SPI_prepare</function> always takes the cursor options as zero. </para> + + <para> + This function is now deprecated in favor + of <function>SPI_prepare_extended</function>. + </para> </refsect1> <refsect1> @@ -1176,6 +1181,122 @@ SPIPlanPtr SPI_prepare_cursor(const char * <parameter>command</parameter>, int < <!-- *********************************************** --> +<refentry id="spi-spi-prepare-extended"> + <indexterm><primary>SPI_prepare_extended</primary></indexterm> + + <refmeta> + <refentrytitle>SPI_prepare_extended</refentrytitle> + <manvolnum>3</manvolnum> + </refmeta> + + <refnamediv> + <refname>SPI_prepare_extended</refname> + <refpurpose>prepare a statement, without executing it yet</refpurpose> + </refnamediv> + + <refsynopsisdiv> +<synopsis> +SPIPlanPtr SPI_prepare_extended(const char * <parameter>command</parameter>, + const SPIPrepareOptions * <parameter>options</parameter>) +</synopsis> + </refsynopsisdiv> + + <refsect1> + <title>Description</title> + + <para> + <function>SPI_prepare_extended</function> creates and returns a prepared + statement for the specified command, but doesn't execute the command. + This function is equivalent to <function>SPI_prepare</function>, + with the addition that the caller can specify options to control + the parsing of external parameter references, as well as other facets + of query parsing and planning. + </para> + </refsect1> + + <refsect1> + <title>Arguments</title> + + <variablelist> + <varlistentry> + <term><literal>const char * <parameter>command</parameter></literal></term> + <listitem> + <para> + command string + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>const SPIPrepareOptions * <parameter>options</parameter></literal></term> + <listitem> + <para> + struct containing optional arguments + </para> + </listitem> + </varlistentry> + </variablelist> + + <para> + Callers should always zero out the entire <parameter>options</parameter> + struct, then fill whichever fields they want to set. This ensures forward + compatibility of code, since any fields that are added to the struct in + future will be defined to behave backwards-compatibly if they are zero. + The currently available <parameter>options</parameter> fields are: + </para> + + <variablelist> + <varlistentry> + <term><literal>ParserSetupHook <parameter>parserSetup</parameter></literal></term> + <listitem> + <para> + Parser hook setup function + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>void * <parameter>parserSetupArg</parameter></literal></term> + <listitem> + <para> + pass-through argument for <parameter>parserSetup</parameter> + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>RawParseMode <parameter>parseMode</parameter></literal></term> + <listitem> + <para> + mode for raw parsing; <literal>RAW_PARSE_DEFAULT</literal> (zero) + produces default behavior + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>int <parameter>cursorOptions</parameter></literal></term> + <listitem> + <para> + integer bit mask of cursor options; zero produces default behavior + </para> + </listitem> + </varlistentry> + </variablelist> + </refsect1> + + <refsect1> + <title>Return Value</title> + + <para> + <function>SPI_prepare_extended</function> has the same return conventions as + <function>SPI_prepare</function>. + </para> + </refsect1> +</refentry> + +<!-- *********************************************** --> + <refentry id="spi-spi-prepare-params"> <indexterm><primary>SPI_prepare_params</primary></indexterm> @@ -1208,6 +1329,11 @@ SPIPlanPtr SPI_prepare_params(const char * <parameter>command</parameter>, with the addition that the caller can specify parser hook functions to control the parsing of external parameter references. </para> + + <para> + This function is now deprecated in favor + of <function>SPI_prepare_extended</function>. + </para> </refsect1> <refsect1> diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 055ebb77ae..3acf88e67c 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -508,6 +508,7 @@ SPI_execute(const char *src, bool read_only, long tcount) memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = CURSOR_OPT_PARALLEL_OK; _SPI_prepare_oneshot_plan(src, &plan); @@ -681,6 +682,7 @@ SPI_execute_with_args(const char *src, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = CURSOR_OPT_PARALLEL_OK; plan.nargs = nargs; plan.argtypes = argtypes; @@ -726,6 +728,7 @@ SPI_execute_with_receiver(const char *src, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = CURSOR_OPT_PARALLEL_OK; if (params) { @@ -768,6 +771,7 @@ SPI_prepare_cursor(const char *src, int nargs, Oid *argtypes, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = cursorOptions; plan.nargs = nargs; plan.argtypes = argtypes; @@ -784,6 +788,42 @@ SPI_prepare_cursor(const char *src, int nargs, Oid *argtypes, return result; } +SPIPlanPtr +SPI_prepare_extended(const char *src, + const SPIPrepareOptions *options) +{ + _SPI_plan plan; + SPIPlanPtr result; + + if (src == NULL || options == NULL) + { + SPI_result = SPI_ERROR_ARGUMENT; + return NULL; + } + + SPI_result = _SPI_begin_call(true); + if (SPI_result < 0) + return NULL; + + memset(&plan, 0, sizeof(_SPI_plan)); + plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = options->parseMode; + plan.cursor_options = options->cursorOptions; + plan.nargs = 0; + plan.argtypes = NULL; + plan.parserSetup = options->parserSetup; + plan.parserSetupArg = options->parserSetupArg; + + _SPI_prepare_plan(src, &plan); + + /* copy plan to procedure context */ + result = _SPI_make_plan_non_temp(&plan); + + _SPI_end_call(true); + + return result; +} + SPIPlanPtr SPI_prepare_params(const char *src, ParserSetupHook parserSetup, @@ -805,6 +845,7 @@ SPI_prepare_params(const char *src, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = cursorOptions; plan.nargs = 0; plan.argtypes = NULL; @@ -1340,6 +1381,7 @@ SPI_cursor_open_with_args(const char *name, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = cursorOptions; plan.nargs = nargs; plan.argtypes = argtypes; @@ -1400,6 +1442,7 @@ SPI_cursor_parse_open_with_paramlist(const char *name, memset(&plan, 0, sizeof(_SPI_plan)); plan.magic = _SPI_PLAN_MAGIC; + plan.parse_mode = RAW_PARSE_DEFAULT; plan.cursor_options = cursorOptions; if (params) { @@ -2036,7 +2079,8 @@ spi_printtup(TupleTableSlot *slot, DestReceiver *self) * Parse and analyze a querystring. * * At entry, plan->argtypes and plan->nargs (or alternatively plan->parserSetup - * and plan->parserSetupArg) must be valid, as must plan->cursor_options. + * and plan->parserSetupArg) must be valid, as must plan->parse_mode and + * plan->cursor_options. * * Results are stored into *plan (specifically, plan->plancache_list). * Note that the result data is all in CurrentMemoryContext or child contexts @@ -2063,7 +2107,7 @@ _SPI_prepare_plan(const char *src, SPIPlanPtr plan) /* * Parse the request string into a list of raw parse trees. */ - raw_parsetree_list = pg_parse_query(src); + raw_parsetree_list = raw_parser(src, plan->parse_mode); /* * Do parse analysis and rule rewrite for each raw parsetree, storing the @@ -2168,7 +2212,7 @@ _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan) /* * Parse the request string into a list of raw parse trees. */ - raw_parsetree_list = pg_parse_query(src); + raw_parsetree_list = raw_parser(src, plan->parse_mode); /* * Construct plancache entries, but don't do parse analysis yet. @@ -2866,6 +2910,7 @@ _SPI_make_plan_non_temp(SPIPlanPtr plan) newplan = (SPIPlanPtr) palloc0(sizeof(_SPI_plan)); newplan->magic = _SPI_PLAN_MAGIC; newplan->plancxt = plancxt; + newplan->parse_mode = plan->parse_mode; newplan->cursor_options = plan->cursor_options; newplan->nargs = plan->nargs; if (plan->nargs > 0) @@ -2930,6 +2975,7 @@ _SPI_save_plan(SPIPlanPtr plan) newplan = (SPIPlanPtr) palloc0(sizeof(_SPI_plan)); newplan->magic = _SPI_PLAN_MAGIC; newplan->plancxt = plancxt; + newplan->parse_mode = plan->parse_mode; newplan->cursor_options = plan->cursor_options; newplan->nargs = plan->nargs; if (plan->nargs > 0) diff --git a/src/include/executor/spi.h b/src/include/executor/spi.h index 896ec0a2ad..8ff4236c8f 100644 --- a/src/include/executor/spi.h +++ b/src/include/executor/spi.h @@ -15,7 +15,7 @@ #include "commands/trigger.h" #include "lib/ilist.h" -#include "nodes/parsenodes.h" +#include "parser/parser.h" #include "utils/portal.h" @@ -33,6 +33,15 @@ typedef struct SPITupleTable SubTransactionId subid; /* subxact in which tuptable was created */ } SPITupleTable; +/* Optional arguments for SPI_prepare_extended */ +typedef struct SPIPrepareOptions +{ + ParserSetupHook parserSetup; + void *parserSetupArg; + RawParseMode parseMode; + int cursorOptions; +} SPIPrepareOptions; + /* Plans are opaque structs for standard users of SPI */ typedef struct _SPI_plan *SPIPlanPtr; @@ -113,6 +122,8 @@ extern int SPI_execute_with_receiver(const char *src, extern SPIPlanPtr SPI_prepare(const char *src, int nargs, Oid *argtypes); extern SPIPlanPtr SPI_prepare_cursor(const char *src, int nargs, Oid *argtypes, int cursorOptions); +extern SPIPlanPtr SPI_prepare_extended(const char *src, + const SPIPrepareOptions *options); extern SPIPlanPtr SPI_prepare_params(const char *src, ParserSetupHook parserSetup, void *parserSetupArg, diff --git a/src/include/executor/spi_priv.h b/src/include/executor/spi_priv.h index 6220928bd3..9c99aeb84e 100644 --- a/src/include/executor/spi_priv.h +++ b/src/include/executor/spi_priv.h @@ -95,6 +95,7 @@ typedef struct _SPI_plan bool no_snapshots; /* let the caller handle the snapshots */ List *plancache_list; /* one CachedPlanSource per parsetree */ MemoryContext plancxt; /* Context containing _SPI_plan and data */ + RawParseMode parse_mode; /* raw_parser() mode */ int cursor_options; /* Cursor options used for planning */ int nargs; /* number of plan arguments */ Oid *argtypes; /* Argument types (NULL if nargs is 0) */ diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index ccbc50fc45..86f5e9fd24 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -4170,6 +4170,7 @@ exec_prepare_plan(PLpgSQL_execstate *estate, bool keepplan) { SPIPlanPtr plan; + SPIPrepareOptions options; /* * The grammar can't conveniently set expr->func while building the parse @@ -4180,12 +4181,14 @@ exec_prepare_plan(PLpgSQL_execstate *estate, /* * Generate and save the plan */ - plan = SPI_prepare_params(expr->query, - (ParserSetupHook) plpgsql_parser_setup, - (void *) expr, - cursorOptions); + memset(&options, 0, sizeof(options)); + options.parserSetup = (ParserSetupHook) plpgsql_parser_setup; + options.parserSetupArg = (void *) expr; + options.parseMode = RAW_PARSE_DEFAULT; + options.cursorOptions = cursorOptions; + plan = SPI_prepare_extended(expr->query, &options); if (plan == NULL) - elog(ERROR, "SPI_prepare_params failed for \"%s\": %s", + elog(ERROR, "SPI_prepare_extended failed for \"%s\": %s", expr->query, SPI_result_code_string(SPI_result)); if (keepplan) SPI_keepplan(plan); diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 3acf88e67c..f529707458 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -51,6 +51,12 @@ static _SPI_connection *_SPI_current = NULL; static int _SPI_stack_depth = 0; /* allocated size of _SPI_stack */ static int _SPI_connected = -1; /* current stack index */ +typedef struct SPICallbackArg +{ + const char *query; + RawParseMode mode; +} SPICallbackArg; + static Portal SPI_cursor_open_internal(const char *name, SPIPlanPtr plan, ParamListInfo paramLI, bool read_only); @@ -1479,6 +1485,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan, Snapshot snapshot; MemoryContext oldcontext; Portal portal; + SPICallbackArg spicallbackarg; ErrorContextCallback spierrcontext; /* @@ -1533,8 +1540,10 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan, * Setup error traceback support for ereport(), in case GetCachedPlan * throws an error. */ + spicallbackarg.query = plansource->query_string; + spicallbackarg.mode = plan->parse_mode; spierrcontext.callback = _SPI_error_callback; - spierrcontext.arg = unconstify(char *, plansource->query_string); + spierrcontext.arg = &spicallbackarg; spierrcontext.previous = error_context_stack; error_context_stack = &spierrcontext; @@ -1952,6 +1961,7 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan) { CachedPlanSource *plansource; CachedPlan *cplan; + SPICallbackArg spicallbackarg; ErrorContextCallback spierrcontext; Assert(plan->magic == _SPI_PLAN_MAGIC); @@ -1966,8 +1976,10 @@ SPI_plan_get_cached_plan(SPIPlanPtr plan) plansource = (CachedPlanSource *) linitial(plan->plancache_list); /* Setup error traceback support for ereport() */ + spicallbackarg.query = plansource->query_string; + spicallbackarg.mode = plan->parse_mode; spierrcontext.callback = _SPI_error_callback; - spierrcontext.arg = unconstify(char *, plansource->query_string); + spierrcontext.arg = &spicallbackarg; spierrcontext.previous = error_context_stack; error_context_stack = &spierrcontext; @@ -2094,13 +2106,16 @@ _SPI_prepare_plan(const char *src, SPIPlanPtr plan) List *raw_parsetree_list; List *plancache_list; ListCell *list_item; + SPICallbackArg spicallbackarg; ErrorContextCallback spierrcontext; /* * Setup error traceback support for ereport() */ + spicallbackarg.query = src; + spicallbackarg.mode = plan->parse_mode; spierrcontext.callback = _SPI_error_callback; - spierrcontext.arg = unconstify(char *, src); + spierrcontext.arg = &spicallbackarg; spierrcontext.previous = error_context_stack; error_context_stack = &spierrcontext; @@ -2199,13 +2214,16 @@ _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan) List *raw_parsetree_list; List *plancache_list; ListCell *list_item; + SPICallbackArg spicallbackarg; ErrorContextCallback spierrcontext; /* * Setup error traceback support for ereport() */ + spicallbackarg.query = src; + spicallbackarg.mode = plan->parse_mode; spierrcontext.callback = _SPI_error_callback; - spierrcontext.arg = unconstify(char *, src); + spierrcontext.arg = &spicallbackarg; spierrcontext.previous = error_context_stack; error_context_stack = &spierrcontext; @@ -2263,6 +2281,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI, SPITupleTable *my_tuptable = NULL; int res = 0; bool pushed_active_snap = false; + SPICallbackArg spicallbackarg; ErrorContextCallback spierrcontext; CachedPlan *cplan = NULL; ListCell *lc1; @@ -2270,8 +2289,10 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI, /* * Setup error traceback support for ereport() */ + spicallbackarg.query = NULL; /* we'll fill this below */ + spicallbackarg.mode = plan->parse_mode; spierrcontext.callback = _SPI_error_callback; - spierrcontext.arg = NULL; /* we'll fill this below */ + spierrcontext.arg = &spicallbackarg; spierrcontext.previous = error_context_stack; error_context_stack = &spierrcontext; @@ -2318,7 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI, List *stmt_list; ListCell *lc2; - spierrcontext.arg = unconstify(char *, plansource->query_string); + spicallbackarg.query = plansource->query_string; /* * If this is a one-shot plan, we still need to do parse analysis. @@ -2722,7 +2743,8 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount) static void _SPI_error_callback(void *arg) { - const char *query = (const char *) arg; + SPICallbackArg *carg = (SPICallbackArg *) arg; + const char *query = carg->query; int syntaxerrposition; if (query == NULL) /* in case arg wasn't set yet */ @@ -2740,7 +2762,18 @@ _SPI_error_callback(void *arg) internalerrquery(query); } else - errcontext("SQL statement \"%s\"", query); + { + /* Use the parse mode to decide how to describe the query */ + switch (carg->mode) + { + case RAW_PARSE_PLPGSQL_EXPR: + errcontext("SQL expression \"%s\"", query); + break; + default: + errcontext("SQL statement \"%s\"", query); + break; + } + } } /* diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 88c76dd985..4c58b46651 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -294,6 +294,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type <node> select_no_parens select_with_parens select_clause simple_select values_clause + PLpgSQL_Expr %type <node> alter_column_default opclass_item opclass_drop alter_using %type <ival> add_drop opt_asc_desc opt_nulls_order @@ -731,6 +732,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); * something other than the usual list of SQL commands. */ %token MODE_TYPE_NAME +%token MODE_PLPGSQL_EXPR /* Precedence: lowest to highest */ @@ -808,6 +810,11 @@ stmtblock: stmtmulti { pg_yyget_extra(yyscanner)->parsetree = list_make1($2); } + | MODE_PLPGSQL_EXPR PLpgSQL_Expr + { + pg_yyget_extra(yyscanner)->parsetree = + list_make1(makeRawStmt($2, 0)); + } ; /* @@ -15022,6 +15029,47 @@ role_list: RoleSpec { $$ = lappend($1, $3); } ; + +/***************************************************************************** + * + * PL/pgSQL extensions + * + * You'd think a PL/pgSQL "expression" should be just an a_expr, but + * historically it can include just about anything that can follow SELECT. + * Therefore the returned struct is a SelectStmt. + *****************************************************************************/ + +PLpgSQL_Expr: opt_target_list + from_clause where_clause + group_clause having_clause window_clause + opt_sort_clause opt_select_limit opt_for_locking_clause + { + SelectStmt *n = makeNode(SelectStmt); + + n->targetList = $1; + n->fromClause = $2; + n->whereClause = $3; + n->groupClause = $4; + n->havingClause = $5; + n->windowClause = $6; + n->sortClause = $7; + if ($8) + { + n->limitOffset = $8->limitOffset; + n->limitCount = $8->limitCount; + if (!n->sortClause && + $8->limitOption == LIMIT_OPTION_WITH_TIES) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("WITH TIES cannot be specified without ORDER BY clause"))); + n->limitOption = $8->limitOption; + } + n->lockingClause = $9; + $$ = (Node *) n; + } + ; + + /* * Name classification hierarchy. * diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c index b04be3bca5..71df8ef022 100644 --- a/src/backend/parser/parser.c +++ b/src/backend/parser/parser.c @@ -57,7 +57,8 @@ raw_parser(const char *str, RawParseMode mode) /* this array is indexed by RawParseMode enum */ static const int mode_token[] = { 0, /* RAW_PARSE_DEFAULT */ - MODE_TYPE_NAME /* RAW_PARSE_TYPE_NAME */ + MODE_TYPE_NAME, /* RAW_PARSE_TYPE_NAME */ + MODE_PLPGSQL_EXPR /* RAW_PARSE_PLPGSQL_EXPR */ }; yyextra.have_lookahead = true; diff --git a/src/include/parser/parser.h b/src/include/parser/parser.h index ebbcb788cc..3b7dab17ef 100644 --- a/src/include/parser/parser.h +++ b/src/include/parser/parser.h @@ -27,12 +27,14 @@ * RAW_PARSE_TYPE_NAME: parse a type name, and return a one-element List * containing a TypeName node. * - * ... more to come ... + * RAW_PARSE_PLPGSQL_EXPR: parse a PL/pgSQL expression, and return + * a one-element List containing a RawStmt node. */ typedef enum { RAW_PARSE_DEFAULT = 0, - RAW_PARSE_TYPE_NAME + RAW_PARSE_TYPE_NAME, + RAW_PARSE_PLPGSQL_EXPR } RawParseMode; /* Values for the backslash_quote GUC */ diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl index 52ba7dfa0c..2b254c2b77 100644 --- a/src/interfaces/ecpg/preproc/parse.pl +++ b/src/interfaces/ecpg/preproc/parse.pl @@ -70,7 +70,8 @@ my %replace_types = ( 'ColId' => 'ignore', 'type_function_name' => 'ignore', 'ColLabel' => 'ignore', - 'Sconst' => 'ignore',); + 'Sconst' => 'ignore', + 'PLpgSQL_Expr' => 'ignore',); # these replace_line commands excise certain keywords from the core keyword # lists. Be sure to account for these in ColLabel and related productions. diff --git a/src/pl/plpgsql/src/expected/plpgsql_record.out b/src/pl/plpgsql/src/expected/plpgsql_record.out index cf6089cbb2..52207e9b10 100644 --- a/src/pl/plpgsql/src/expected/plpgsql_record.out +++ b/src/pl/plpgsql/src/expected/plpgsql_record.out @@ -188,7 +188,7 @@ NOTICE: r1.q1 = <NULL> NOTICE: r1.q2 = <NULL> NOTICE: r1 = <NULL> ERROR: record "r1" has no field "nosuchfield" -CONTEXT: SQL statement "SELECT r1.nosuchfield" +CONTEXT: SQL expression "r1.nosuchfield" PL/pgSQL function inline_code_block line 7 at RAISE -- records, not so much do $$ @@ -202,7 +202,7 @@ end$$; NOTICE: r1 = <NULL> ERROR: record "r1" is not assigned yet DETAIL: The tuple structure of a not-yet-assigned record is indeterminate. -CONTEXT: SQL statement "SELECT r1.f1" +CONTEXT: SQL expression "r1.f1" PL/pgSQL function inline_code_block line 5 at RAISE -- but OK if you assign first do $$ @@ -220,7 +220,7 @@ NOTICE: r1.f1 = 1 NOTICE: r1.f2 = 2 NOTICE: r1 = (1,2) ERROR: record "r1" has no field "nosuchfield" -CONTEXT: SQL statement "SELECT r1.nosuchfield" +CONTEXT: SQL expression "r1.nosuchfield" PL/pgSQL function inline_code_block line 9 at RAISE -- check repeated assignments to composite fields create table some_table (id int, data text); @@ -431,7 +431,7 @@ create function getf3(x mutable) returns int language plpgsql as $$ begin return x.f3; end $$; select getf3(null::mutable); -- doesn't work yet ERROR: record "x" has no field "f3" -CONTEXT: SQL statement "SELECT x.f3" +CONTEXT: SQL expression "x.f3" PL/pgSQL function getf3(mutable) line 1 at RETURN alter table mutable add column f3 int; select getf3(null::mutable); -- now it works diff --git a/src/pl/plpgsql/src/expected/plpgsql_varprops.out b/src/pl/plpgsql/src/expected/plpgsql_varprops.out index 18f03d75b4..3801dccc95 100644 --- a/src/pl/plpgsql/src/expected/plpgsql_varprops.out +++ b/src/pl/plpgsql/src/expected/plpgsql_varprops.out @@ -76,7 +76,7 @@ begin raise notice 'x = %', x; end$$; ERROR: division by zero -CONTEXT: SQL statement "SELECT 1/0" +CONTEXT: SQL expression "1/0" PL/pgSQL function inline_code_block line 3 during statement block local variable initialization do $$ declare x bigint[] := array[1,3,5]; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index 86f5e9fd24..f56dcd0e79 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -4184,7 +4184,7 @@ exec_prepare_plan(PLpgSQL_execstate *estate, memset(&options, 0, sizeof(options)); options.parserSetup = (ParserSetupHook) plpgsql_parser_setup; options.parserSetupArg = (void *) expr; - options.parseMode = RAW_PARSE_DEFAULT; + options.parseMode = expr->parseMode; options.cursorOptions = cursorOptions; plan = SPI_prepare_extended(expr->query, &options); if (plan == NULL) diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index 26732d875e..3b36220d73 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -51,7 +51,6 @@ typedef struct { int location; - int leaderlen; } sql_error_callback_arg; #define parser_errposition(pos) plpgsql_scanner_errposition(pos) @@ -67,7 +66,7 @@ static PLpgSQL_expr *read_sql_construct(int until, int until2, int until3, const char *expected, - const char *sqlstart, + RawParseMode parsemode, bool isexpression, bool valid_sql, bool trim, @@ -78,7 +77,7 @@ static PLpgSQL_expr *read_sql_expression(int until, static PLpgSQL_expr *read_sql_expression2(int until, int until2, const char *expected, int *endtoken); -static PLpgSQL_expr *read_sql_stmt(const char *sqlstart); +static PLpgSQL_expr *read_sql_stmt(void); static PLpgSQL_type *read_datatype(int tok); static PLpgSQL_stmt *make_execsql_stmt(int firsttoken, int location); static PLpgSQL_stmt_fetch *read_fetch_direction(void); @@ -99,8 +98,8 @@ static PLpgSQL_row *read_into_scalar_list(char *initial_name, static PLpgSQL_row *make_scalar_list1(char *initial_name, PLpgSQL_datum *initial_datum, int lineno, int location); -static void check_sql_expr(const char *stmt, int location, - int leaderlen); +static void check_sql_expr(const char *stmt, + RawParseMode parseMode, int location); static void plpgsql_sql_error_callback(void *arg); static PLpgSQL_type *parse_datatype(const char *string, int location); static void check_labels(const char *start_label, @@ -540,7 +539,7 @@ decl_statement : decl_varname decl_const decl_datatype decl_collate decl_notnull { PLpgSQL_var *new; PLpgSQL_expr *curname_def; - char buf[1024]; + char buf[NAMEDATALEN * 2 + 64]; char *cp1; char *cp2; @@ -557,9 +556,9 @@ decl_statement : decl_varname decl_const decl_datatype decl_collate decl_notnull curname_def = palloc0(sizeof(PLpgSQL_expr)); - strcpy(buf, "SELECT "); + /* Note: refname has been truncated to NAMEDATALEN */ cp1 = new->refname; - cp2 = buf + strlen(buf); + cp2 = buf; /* * Don't trust standard_conforming_strings here; * it might change before we use the string. @@ -575,6 +574,7 @@ decl_statement : decl_varname decl_const decl_datatype decl_collate decl_notnull } strcpy(cp2, "'::pg_catalog.refcursor"); curname_def->query = pstrdup(buf); + curname_def->parseMode = RAW_PARSE_PLPGSQL_EXPR; new->default_val = curname_def; new->cursor_explicit_expr = $7; @@ -602,7 +602,7 @@ opt_scrollable : decl_cursor_query : { - $$ = read_sql_stmt(""); + $$ = read_sql_stmt(); } ; @@ -904,15 +904,37 @@ proc_stmt : pl_block ';' { $$ = $1; } ; -stmt_perform : K_PERFORM expr_until_semi +stmt_perform : K_PERFORM { PLpgSQL_stmt_perform *new; + int startloc; new = palloc0(sizeof(PLpgSQL_stmt_perform)); new->cmd_type = PLPGSQL_STMT_PERFORM; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - new->expr = $2; + plpgsql_push_back_token(K_PERFORM); + + /* + * Since PERFORM isn't legal SQL, we have to cheat to + * the extent of substituting "SELECT" for "PERFORM" + * in the parsed text. It does not seem worth + * inventing a separate parse mode for this one case. + * We can't do syntax-checking until after we make the + * substitution. + */ + new->expr = read_sql_construct(';', 0, 0, ";", + RAW_PARSE_DEFAULT, + false, false, true, + &startloc, NULL); + /* overwrite "perform" ... */ + memcpy(new->expr->query, " SELECT", 7); + /* left-justify to get rid of the leading space */ + memmove(new->expr->query, new->expr->query + 1, + strlen(new->expr->query)); + /* offset syntax error position to account for that */ + check_sql_expr(new->expr->query, new->expr->parseMode, + startloc + 1); $$ = (PLpgSQL_stmt *)new; } @@ -926,7 +948,8 @@ stmt_call : K_CALL new->cmd_type = PLPGSQL_STMT_CALL; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - new->expr = read_sql_stmt("CALL "); + plpgsql_push_back_token(K_CALL); + new->expr = read_sql_stmt(); new->is_call = true; $$ = (PLpgSQL_stmt *)new; @@ -941,7 +964,8 @@ stmt_call : K_CALL new->cmd_type = PLPGSQL_STMT_CALL; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - new->expr = read_sql_stmt("DO "); + plpgsql_push_back_token(K_DO); + new->expr = read_sql_stmt(); new->is_call = false; $$ = (PLpgSQL_stmt *)new; @@ -1452,16 +1476,16 @@ for_control : for_variable K_IN /* * Read tokens until we see either a ".." - * or a LOOP. The text we read may not - * necessarily be a well-formed SQL - * statement, so we need to invoke - * read_sql_construct directly. + * or a LOOP. The text we read may be either + * an expression or a whole SQL statement, so + * we need to invoke read_sql_construct directly, + * and tell it not to check syntax yet. */ expr1 = read_sql_construct(DOT_DOT, K_LOOP, 0, "LOOP", - "SELECT ", + RAW_PARSE_DEFAULT, true, false, true, @@ -1476,8 +1500,13 @@ for_control : for_variable K_IN PLpgSQL_var *fvar; PLpgSQL_stmt_fori *new; - /* Check first expression is well-formed */ - check_sql_expr(expr1->query, expr1loc, 7); + /* + * Relabel first expression as an expression; + * then we can check its syntax. + */ + expr1->parseMode = RAW_PARSE_PLPGSQL_EXPR; + check_sql_expr(expr1->query, expr1->parseMode, + expr1loc); /* Read and check the second one */ expr2 = read_sql_expression2(K_LOOP, K_BY, @@ -1522,12 +1551,8 @@ for_control : for_variable K_IN else { /* - * No "..", so it must be a query loop. We've - * prefixed an extra SELECT to the query text, - * so we need to remove that before performing - * syntax checking. + * No "..", so it must be a query loop. */ - char *tmp_query; PLpgSQL_stmt_fors *new; if (reverse) @@ -1536,12 +1561,9 @@ for_control : for_variable K_IN errmsg("cannot specify REVERSE in query FOR loop"), parser_errposition(tokloc))); - Assert(strncmp(expr1->query, "SELECT ", 7) == 0); - tmp_query = pstrdup(expr1->query + 7); - pfree(expr1->query); - expr1->query = tmp_query; - - check_sql_expr(expr1->query, expr1loc, 0); + /* Check syntax as a regular query */ + check_sql_expr(expr1->query, expr1->parseMode, + expr1loc); new = palloc0(sizeof(PLpgSQL_stmt_fors)); new->cmd_type = PLPGSQL_STMT_FORS; @@ -1870,7 +1892,7 @@ stmt_raise : K_RAISE expr = read_sql_construct(',', ';', K_USING, ", or ; or USING", - "SELECT ", + RAW_PARSE_PLPGSQL_EXPR, true, true, true, NULL, &tok); new->params = lappend(new->params, expr); @@ -2001,7 +2023,7 @@ stmt_dynexecute : K_EXECUTE expr = read_sql_construct(K_INTO, K_USING, ';', "INTO or USING or ;", - "SELECT ", + RAW_PARSE_PLPGSQL_EXPR, true, true, true, NULL, &endtoken); @@ -2040,7 +2062,7 @@ stmt_dynexecute : K_EXECUTE { expr = read_sql_construct(',', ';', K_INTO, ", or ; or INTO", - "SELECT ", + RAW_PARSE_PLPGSQL_EXPR, true, true, true, NULL, &endtoken); new->params = lappend(new->params, expr); @@ -2122,7 +2144,7 @@ stmt_open : K_OPEN cursor_variable else { plpgsql_push_back_token(tok); - new->query = read_sql_stmt(""); + new->query = read_sql_stmt(); } } else @@ -2246,8 +2268,8 @@ stmt_set : K_SET new->cmd_type = PLPGSQL_STMT_SET; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - - new->expr = read_sql_stmt("SET "); + plpgsql_push_back_token(K_SET); + new->expr = read_sql_stmt(); $$ = (PLpgSQL_stmt *)new; } @@ -2259,7 +2281,8 @@ stmt_set : K_SET new->cmd_type = PLPGSQL_STMT_SET; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - new->expr = read_sql_stmt("RESET "); + plpgsql_push_back_token(K_RESET); + new->expr = read_sql_stmt(); $$ = (PLpgSQL_stmt *)new; } @@ -2656,7 +2679,8 @@ static PLpgSQL_expr * read_sql_expression(int until, const char *expected) { return read_sql_construct(until, 0, 0, expected, - "SELECT ", true, true, true, NULL, NULL); + RAW_PARSE_PLPGSQL_EXPR, + true, true, true, NULL, NULL); } /* Convenience routine to read an expression with two possible terminators */ @@ -2665,15 +2689,17 @@ read_sql_expression2(int until, int until2, const char *expected, int *endtoken) { return read_sql_construct(until, until2, 0, expected, - "SELECT ", true, true, true, NULL, endtoken); + RAW_PARSE_PLPGSQL_EXPR, + true, true, true, NULL, endtoken); } /* Convenience routine to read a SQL statement that must end with ';' */ static PLpgSQL_expr * -read_sql_stmt(const char *sqlstart) +read_sql_stmt(void) { return read_sql_construct(';', 0, 0, ";", - sqlstart, false, true, true, NULL, NULL); + RAW_PARSE_DEFAULT, + false, true, true, NULL, NULL); } /* @@ -2683,9 +2709,9 @@ read_sql_stmt(const char *sqlstart) * until2: token code for alternate terminator (pass 0 if none) * until3: token code for another alternate terminator (pass 0 if none) * expected: text to use in complaining that terminator was not found - * sqlstart: text to prefix to the accumulated SQL text + * parsemode: raw_parser() mode to use * isexpression: whether to say we're reading an "expression" or a "statement" - * valid_sql: whether to check the syntax of the expr (prefixed with sqlstart) + * valid_sql: whether to check the syntax of the expr * trim: trim trailing whitespace * startloc: if not NULL, location of first token is stored at *startloc * endtoken: if not NULL, ending token is stored at *endtoken @@ -2696,7 +2722,7 @@ read_sql_construct(int until, int until2, int until3, const char *expected, - const char *sqlstart, + RawParseMode parsemode, bool isexpression, bool valid_sql, bool trim, @@ -2711,7 +2737,6 @@ read_sql_construct(int until, PLpgSQL_expr *expr; initStringInfo(&ds); - appendStringInfoString(&ds, sqlstart); /* special lookup mode for identifiers within the SQL text */ save_IdentifierLookup = plpgsql_IdentifierLookup; @@ -2787,6 +2812,7 @@ read_sql_construct(int until, expr = palloc0(sizeof(PLpgSQL_expr)); expr->query = pstrdup(ds.data); + expr->parseMode = parsemode; expr->plan = NULL; expr->paramnos = NULL; expr->rwparam = -1; @@ -2794,7 +2820,7 @@ read_sql_construct(int until, pfree(ds.data); if (valid_sql) - check_sql_expr(expr->query, startlocation, strlen(sqlstart)); + check_sql_expr(expr->query, expr->parseMode, startlocation); return expr; } @@ -3033,13 +3059,14 @@ make_execsql_stmt(int firsttoken, int location) expr = palloc0(sizeof(PLpgSQL_expr)); expr->query = pstrdup(ds.data); + expr->parseMode = RAW_PARSE_DEFAULT; expr->plan = NULL; expr->paramnos = NULL; expr->rwparam = -1; expr->ns = plpgsql_ns_top(); pfree(ds.data); - check_sql_expr(expr->query, location, 0); + check_sql_expr(expr->query, expr->parseMode, location); execsql = palloc(sizeof(PLpgSQL_stmt_execsql)); execsql->cmd_type = PLPGSQL_STMT_EXECSQL; @@ -3382,7 +3409,7 @@ make_return_query_stmt(int location) { /* ordinary static query */ plpgsql_push_back_token(tok); - new->query = read_sql_stmt(""); + new->query = read_sql_stmt(); } else { @@ -3637,13 +3664,12 @@ make_scalar_list1(char *initial_name, * borders. So it is best to bail out as early as we can. * * It is assumed that "stmt" represents a copy of the function source text - * beginning at offset "location", with leader text of length "leaderlen" - * (typically "SELECT ") prefixed to the source text. We use this assumption - * to transpose any error cursor position back to the function source text. + * beginning at offset "location". We use this assumption to transpose + * any error cursor position back to the function source text. * If no error cursor is provided, we'll just point at "location". */ static void -check_sql_expr(const char *stmt, int location, int leaderlen) +check_sql_expr(const char *stmt, RawParseMode parseMode, int location) { sql_error_callback_arg cbarg; ErrorContextCallback syntax_errcontext; @@ -3653,7 +3679,6 @@ check_sql_expr(const char *stmt, int location, int leaderlen) return; cbarg.location = location; - cbarg.leaderlen = leaderlen; syntax_errcontext.callback = plpgsql_sql_error_callback; syntax_errcontext.arg = &cbarg; @@ -3661,7 +3686,7 @@ check_sql_expr(const char *stmt, int location, int leaderlen) error_context_stack = &syntax_errcontext; oldCxt = MemoryContextSwitchTo(plpgsql_compile_tmp_cxt); - (void) raw_parser(stmt, RAW_PARSE_DEFAULT); + (void) raw_parser(stmt, parseMode); MemoryContextSwitchTo(oldCxt); /* Restore former ereport callback */ @@ -3686,12 +3711,12 @@ plpgsql_sql_error_callback(void *arg) * Note we are dealing with 1-based character numbers at this point. */ errpos = geterrposition(); - if (errpos > cbarg->leaderlen) + if (errpos > 0) { int myerrpos = getinternalerrposition(); if (myerrpos > 0) /* safety check */ - internalerrposition(myerrpos + errpos - cbarg->leaderlen - 1); + internalerrposition(myerrpos + errpos - 1); } /* In any case, flush errposition --- we want internalerrposition only */ @@ -3717,7 +3742,6 @@ parse_datatype(const char *string, int location) ErrorContextCallback syntax_errcontext; cbarg.location = location; - cbarg.leaderlen = 0; syntax_errcontext.callback = plpgsql_sql_error_callback; syntax_errcontext.arg = &cbarg; @@ -3780,7 +3804,6 @@ read_cursor_args(PLpgSQL_var *cursor, int until) int argc; char **argv; StringInfoData ds; - char *sqlstart = "SELECT "; bool any_named = false; tok = yylex(); @@ -3881,12 +3904,12 @@ read_cursor_args(PLpgSQL_var *cursor, int until) */ item = read_sql_construct(',', ')', 0, ",\" or \")", - sqlstart, + RAW_PARSE_PLPGSQL_EXPR, true, true, false, /* do not trim */ NULL, &endtoken); - argv[argpos] = item->query + strlen(sqlstart); + argv[argpos] = item->query; if (endtoken == ')' && !(argc == row->nfields - 1)) ereport(ERROR, @@ -3905,7 +3928,6 @@ read_cursor_args(PLpgSQL_var *cursor, int until) /* Make positional argument list */ initStringInfo(&ds); - appendStringInfoString(&ds, sqlstart); for (argc = 0; argc < row->nfields; argc++) { Assert(argv[argc] != NULL); @@ -3921,10 +3943,10 @@ read_cursor_args(PLpgSQL_var *cursor, int until) if (argc < row->nfields - 1) appendStringInfoString(&ds, ", "); } - appendStringInfoChar(&ds, ';'); expr = palloc0(sizeof(PLpgSQL_expr)); expr->query = pstrdup(ds.data); + expr->parseMode = RAW_PARSE_PLPGSQL_EXPR; expr->plan = NULL; expr->paramnos = NULL; expr->rwparam = -1; @@ -4097,14 +4119,14 @@ make_case(int location, PLpgSQL_expr *t_expr, PLpgSQL_expr *expr = cwt->expr; StringInfoData ds; - /* copy expression query without SELECT keyword (expr->query + 7) */ - Assert(strncmp(expr->query, "SELECT ", 7) == 0); + /* We expect to have expressions not statements */ + Assert(expr->parseMode == RAW_PARSE_PLPGSQL_EXPR); - /* And do the string hacking */ + /* Do the string hacking */ initStringInfo(&ds); - appendStringInfo(&ds, "SELECT \"%s\" IN (%s)", - varname, expr->query + 7); + appendStringInfo(&ds, "\"%s\" IN (%s)", + varname, expr->query); pfree(expr->query); expr->query = pstrdup(ds.data); diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index 0c3d30fb13..d152a4354b 100644 --- a/src/pl/plpgsql/src/plpgsql.h +++ b/src/pl/plpgsql/src/plpgsql.h @@ -218,8 +218,9 @@ typedef struct PLpgSQL_type */ typedef struct PLpgSQL_expr { - char *query; - SPIPlanPtr plan; + char *query; /* query string, verbatim from function body */ + RawParseMode parseMode; /* raw_parser() mode to use */ + SPIPlanPtr plan; /* plan, or NULL if not made yet */ Bitmapset *paramnos; /* all dnos referenced by this query */ int rwparam; /* dno of read/write param, or -1 if none */ diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out index d0a6b630b8..6ae1b6e3d4 100644 --- a/src/test/regress/expected/plpgsql.out +++ b/src/test/regress/expected/plpgsql.out @@ -1761,10 +1761,10 @@ select f1(42) as int, f1(4.5) as num; select f1(point(3,4)); -- fail for lack of + operator ERROR: operator does not exist: point + integer -LINE 1: SELECT x + 1 - ^ +LINE 1: x + 1 + ^ HINT: No operator matches the given name and argument types. You might need to add explicit type casts. -QUERY: SELECT x + 1 +QUERY: x + 1 CONTEXT: PL/pgSQL function f1(anyelement) line 3 at RETURN drop function f1(x anyelement); create function f1(x anyelement) returns anyarray as $$ @@ -2361,7 +2361,7 @@ begin end $$ language plpgsql; select namedparmcursor_test7(); ERROR: division by zero -CONTEXT: SQL statement "SELECT 42/0 AS p1, 77 AS p2;" +CONTEXT: SQL expression "42/0 AS p1, 77 AS p2" PL/pgSQL function namedparmcursor_test7() line 6 at OPEN -- check that line comments work correctly within the argument list (there -- is some special handling of this case in the code: the newline after the @@ -2574,9 +2574,9 @@ end; $$ language plpgsql; -- blocks select excpt_test1(); ERROR: column "sqlstate" does not exist -LINE 1: SELECT sqlstate - ^ -QUERY: SELECT sqlstate +LINE 1: sqlstate + ^ +QUERY: sqlstate CONTEXT: PL/pgSQL function excpt_test1() line 3 at RAISE create function excpt_test2() returns void as $$ begin @@ -2589,9 +2589,9 @@ end; $$ language plpgsql; -- should fail select excpt_test2(); ERROR: column "sqlstate" does not exist -LINE 1: SELECT sqlstate - ^ -QUERY: SELECT sqlstate +LINE 1: sqlstate + ^ +QUERY: sqlstate CONTEXT: PL/pgSQL function excpt_test2() line 5 at RAISE create function excpt_test3() returns void as $$ begin @@ -4467,11 +4467,11 @@ end $$; select fail(); ERROR: division by zero -CONTEXT: SQL statement "SELECT 1/0" +CONTEXT: SQL expression "1/0" PL/pgSQL function fail() line 3 at RETURN select fail(); ERROR: division by zero -CONTEXT: SQL statement "SELECT 1/0" +CONTEXT: SQL expression "1/0" PL/pgSQL function fail() line 3 at RETURN drop function fail(); -- Test handling of string literals. @@ -4497,10 +4497,10 @@ HINT: Use the escape string syntax for backslashes, e.g., E'\\'. select strtest(); NOTICE: foo\bar!baz WARNING: nonstandard use of \\ in a string literal -LINE 1: SELECT 'foo\\bar\041baz' - ^ +LINE 1: 'foo\\bar\041baz' + ^ HINT: Use the escape string syntax for backslashes, e.g., E'\\'. -QUERY: SELECT 'foo\\bar\041baz' +QUERY: 'foo\\bar\041baz' strtest ------------- foo\bar!baz @@ -5621,9 +5621,9 @@ ALTER TABLE alter_table_under_transition_tables UPDATE alter_table_under_transition_tables SET id = id; ERROR: column "name" does not exist -LINE 1: SELECT (SELECT string_agg(id || '=' || name, ',') FROM d) - ^ -QUERY: SELECT (SELECT string_agg(id || '=' || name, ',') FROM d) +LINE 1: (SELECT string_agg(id || '=' || name, ',') FROM d) + ^ +QUERY: (SELECT string_agg(id || '=' || name, ',') FROM d) CONTEXT: PL/pgSQL function alter_table_under_transition_tables_upd_func() line 3 at RAISE -- -- Test multiple reference to a transition table diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c index c3ce480c8f..edea27697e 100644 --- a/src/backend/commands/functioncmds.c +++ b/src/backend/commands/functioncmds.c @@ -1628,6 +1628,7 @@ CreateCast(CreateCastStmt *stmt) case COERCION_ASSIGNMENT: castcontext = COERCION_CODE_ASSIGNMENT; break; + /* COERCION_PLPGSQL is intentionally not covered here */ case COERCION_EXPLICIT: castcontext = COERCION_CODE_EXPLICIT; break; diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index f529707458..eb0d9f51fb 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -2769,6 +2769,11 @@ _SPI_error_callback(void *arg) case RAW_PARSE_PLPGSQL_EXPR: errcontext("SQL expression \"%s\"", query); break; + case RAW_PARSE_PLPGSQL_ASSIGN1: + case RAW_PARSE_PLPGSQL_ASSIGN2: + case RAW_PARSE_PLPGSQL_ASSIGN3: + errcontext("PL/pgSQL assignment \"%s\"", query); + break; default: errcontext("SQL statement \"%s\"", query); break; diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 70f8b718e0..990daf1c98 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -3199,6 +3199,20 @@ _copySetOperationStmt(const SetOperationStmt *from) return newnode; } +static PLAssignStmt * +_copyPLAssignStmt(const PLAssignStmt *from) +{ + PLAssignStmt *newnode = makeNode(PLAssignStmt); + + COPY_STRING_FIELD(name); + COPY_NODE_FIELD(indirection); + COPY_SCALAR_FIELD(nnames); + COPY_NODE_FIELD(val); + COPY_LOCATION_FIELD(location); + + return newnode; +} + static AlterTableStmt * _copyAlterTableStmt(const AlterTableStmt *from) { @@ -5220,6 +5234,9 @@ copyObjectImpl(const void *from) case T_SetOperationStmt: retval = _copySetOperationStmt(from); break; + case T_PLAssignStmt: + retval = _copyPLAssignStmt(from); + break; case T_AlterTableStmt: retval = _copyAlterTableStmt(from); break; diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index 541e0e6b48..7cbd31273d 100644 --- a/src/backend/nodes/equalfuncs.c +++ b/src/backend/nodes/equalfuncs.c @@ -1085,6 +1085,18 @@ _equalSetOperationStmt(const SetOperationStmt *a, const SetOperationStmt *b) return true; } +static bool +_equalPLAssignStmt(const PLAssignStmt *a, const PLAssignStmt *b) +{ + COMPARE_STRING_FIELD(name); + COMPARE_NODE_FIELD(indirection); + COMPARE_SCALAR_FIELD(nnames); + COMPARE_NODE_FIELD(val); + COMPARE_LOCATION_FIELD(location); + + return true; +} + static bool _equalAlterTableStmt(const AlterTableStmt *a, const AlterTableStmt *b) { @@ -3275,6 +3287,9 @@ equal(const void *a, const void *b) case T_SetOperationStmt: retval = _equalSetOperationStmt(a, b); break; + case T_PLAssignStmt: + retval = _equalPLAssignStmt(a, b); + break; case T_AlterTableStmt: retval = _equalAlterTableStmt(a, b); break; diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 963f71e99d..f66d0e5fd5 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -3669,6 +3669,16 @@ raw_expression_tree_walker(Node *node, return true; } break; + case T_PLAssignStmt: + { + PLAssignStmt *stmt = (PLAssignStmt *) node; + + if (walker(stmt->indirection, context)) + return true; + if (walker(stmt->val, context)) + return true; + } + break; case T_A_Expr: { A_Expr *expr = (A_Expr *) node; diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index d78b16ed1d..4871702f02 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -2775,6 +2775,18 @@ _outSelectStmt(StringInfo str, const SelectStmt *node) WRITE_NODE_FIELD(rarg); } +static void +_outPLAssignStmt(StringInfo str, const PLAssignStmt *node) +{ + WRITE_NODE_TYPE("PLASSIGN"); + + WRITE_STRING_FIELD(name); + WRITE_NODE_FIELD(indirection); + WRITE_INT_FIELD(nnames); + WRITE_NODE_FIELD(val); + WRITE_LOCATION_FIELD(location); +} + static void _outFuncCall(StringInfo str, const FuncCall *node) { @@ -4211,6 +4223,9 @@ outNode(StringInfo str, const void *obj) case T_SelectStmt: _outSelectStmt(str, obj); break; + case T_PLAssignStmt: + _outPLAssignStmt(str, obj); + break; case T_ColumnDef: _outColumnDef(str, obj); break; diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index 084e00f73d..bf2824c00f 100644 --- a/src/backend/parser/analyze.c +++ b/src/backend/parser/analyze.c @@ -42,8 +42,10 @@ #include "parser/parse_param.h" #include "parser/parse_relation.h" #include "parser/parse_target.h" +#include "parser/parse_type.h" #include "parser/parsetree.h" #include "rewrite/rewriteManip.h" +#include "utils/builtins.h" #include "utils/rel.h" @@ -70,6 +72,8 @@ static Query *transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt); static List *transformReturningList(ParseState *pstate, List *returningList); static List *transformUpdateTargetList(ParseState *pstate, List *targetList); +static Query *transformPLAssignStmt(ParseState *pstate, + PLAssignStmt *stmt); static Query *transformDeclareCursorStmt(ParseState *pstate, DeclareCursorStmt *stmt); static Query *transformExplainStmt(ParseState *pstate, @@ -304,6 +308,11 @@ transformStmt(ParseState *pstate, Node *parseTree) } break; + case T_PLAssignStmt: + result = transformPLAssignStmt(pstate, + (PLAssignStmt *) parseTree); + break; + /* * Special cases */ @@ -367,6 +376,7 @@ analyze_requires_snapshot(RawStmt *parseTree) case T_DeleteStmt: case T_UpdateStmt: case T_SelectStmt: + case T_PLAssignStmt: result = true; break; @@ -2393,6 +2403,236 @@ transformReturningList(ParseState *pstate, List *returningList) } +/* + * transformPLAssignStmt - + * transform a PL/pgSQL assignment statement + * + * If there is no opt_indirection, the transformed statement looks like + * "SELECT a_expr ...", except the expression has been cast to the type of + * the target. With indirection, it's still a SELECT, but the expression will + * incorporate FieldStore and/or assignment SubscriptingRef nodes to compute a + * new value for a container-type variable represented by the target. The + * expression references the target as the container source. + */ +static Query * +transformPLAssignStmt(ParseState *pstate, PLAssignStmt *stmt) +{ + Query *qry = makeNode(Query); + ColumnRef *cref = makeNode(ColumnRef); + List *indirection = stmt->indirection; + int nnames = stmt->nnames; + SelectStmt *sstmt = stmt->val; + Node *target; + Oid targettype; + int32 targettypmod; + Oid targetcollation; + List *tlist; + TargetEntry *tle; + Oid type_id; + Node *qual; + ListCell *l; + + /* + * First, construct a ColumnRef for the target variable. If the target + * has more than one dotted name, we have to pull the extra names out of + * the indirection list. + */ + cref->fields = list_make1(makeString(stmt->name)); + cref->location = stmt->location; + if (nnames > 1) + { + /* avoid munging the raw parsetree */ + indirection = list_copy(indirection); + while (--nnames > 0 && indirection != NIL) + { + Node *ind = (Node *) linitial(indirection); + + if (!IsA(ind, String)) + elog(ERROR, "invalid name count in PLAssignStmt"); + cref->fields = lappend(cref->fields, ind); + indirection = list_delete_first(indirection); + } + } + + /* + * Transform the target reference. Typically we will get back a Param + * node, but there's no reason to be too picky about its type. + */ + target = transformExpr(pstate, (Node *) cref, + EXPR_KIND_UPDATE_TARGET); + targettype = exprType(target); + targettypmod = exprTypmod(target); + targetcollation = exprCollation(target); + + /* + * The rest mostly matches transformSelectStmt, except that we needn't + * consider WITH or DISTINCT, and we build a targetlist our own way. + */ + qry->commandType = CMD_SELECT; + pstate->p_is_insert = false; + + /* make FOR UPDATE/FOR SHARE info available to addRangeTableEntry */ + pstate->p_locking_clause = sstmt->lockingClause; + + /* make WINDOW info available for window functions, too */ + pstate->p_windowdefs = sstmt->windowClause; + + /* process the FROM clause */ + transformFromClause(pstate, sstmt->fromClause); + + /* initially transform the targetlist as if in SELECT */ + tlist = transformTargetList(pstate, sstmt->targetList, + EXPR_KIND_SELECT_TARGET); + + /* we should have exactly one targetlist item */ + if (list_length(tlist) != 1) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg_plural("assignment source returned %d column", + "assignment source returned %d columns", + list_length(tlist), + list_length(tlist)))); + + tle = linitial_node(TargetEntry, tlist); + + /* + * This next bit is similar to transformAssignedExpr; the key difference + * is we use COERCION_PLPGSQL not COERCION_ASSIGNMENT. + */ + type_id = exprType((Node *) tle->expr); + + pstate->p_expr_kind = EXPR_KIND_UPDATE_TARGET; + + if (indirection) + { + tle->expr = (Expr *) + transformAssignmentIndirection(pstate, + target, + stmt->name, + false, + targettype, + targettypmod, + targetcollation, + indirection, + list_head(indirection), + (Node *) tle->expr, + COERCION_PLPGSQL, + exprLocation(target)); + } + else if (targettype != type_id && + (targettype == RECORDOID || ISCOMPLEX(targettype)) && + (type_id == RECORDOID || ISCOMPLEX(type_id))) + { + /* + * Hack: do not let coerce_to_target_type() deal with inconsistent + * composite types. Just pass the expression result through as-is, + * and let the PL/pgSQL executor do the conversion its way. This is + * rather bogus, but it's needed for backwards compatibility. + */ + } + else + { + /* + * For normal non-qualified target column, do type checking and + * coercion. + */ + Node *orig_expr = (Node *) tle->expr; + + tle->expr = (Expr *) + coerce_to_target_type(pstate, + orig_expr, type_id, + targettype, targettypmod, + COERCION_PLPGSQL, + COERCE_IMPLICIT_CAST, + -1); + /* With COERCION_PLPGSQL, this error is probably unreachable */ + if (tle->expr == NULL) + ereport(ERROR, + (errcode(ERRCODE_DATATYPE_MISMATCH), + errmsg("variable \"%s\" is of type %s" + " but expression is of type %s", + stmt->name, + format_type_be(targettype), + format_type_be(type_id)), + errhint("You will need to rewrite or cast the expression."), + parser_errposition(pstate, exprLocation(orig_expr)))); + } + + pstate->p_expr_kind = EXPR_KIND_NONE; + + qry->targetList = list_make1(tle); + + /* transform WHERE */ + qual = transformWhereClause(pstate, sstmt->whereClause, + EXPR_KIND_WHERE, "WHERE"); + + /* initial processing of HAVING clause is much like WHERE clause */ + qry->havingQual = transformWhereClause(pstate, sstmt->havingClause, + EXPR_KIND_HAVING, "HAVING"); + + /* + * Transform sorting/grouping stuff. Do ORDER BY first because both + * transformGroupClause and transformDistinctClause need the results. Note + * that these functions can also change the targetList, so it's passed to + * them by reference. + */ + qry->sortClause = transformSortClause(pstate, + sstmt->sortClause, + &qry->targetList, + EXPR_KIND_ORDER_BY, + false /* allow SQL92 rules */ ); + + qry->groupClause = transformGroupClause(pstate, + sstmt->groupClause, + &qry->groupingSets, + &qry->targetList, + qry->sortClause, + EXPR_KIND_GROUP_BY, + false /* allow SQL92 rules */ ); + + /* No DISTINCT clause */ + Assert(!sstmt->distinctClause); + qry->distinctClause = NIL; + qry->hasDistinctOn = false; + + /* transform LIMIT */ + qry->limitOffset = transformLimitClause(pstate, sstmt->limitOffset, + EXPR_KIND_OFFSET, "OFFSET", + sstmt->limitOption); + qry->limitCount = transformLimitClause(pstate, sstmt->limitCount, + EXPR_KIND_LIMIT, "LIMIT", + sstmt->limitOption); + qry->limitOption = sstmt->limitOption; + + /* transform window clauses after we have seen all window functions */ + qry->windowClause = transformWindowDefinitions(pstate, + pstate->p_windowdefs, + &qry->targetList); + + qry->rtable = pstate->p_rtable; + qry->jointree = makeFromExpr(pstate->p_joinlist, qual); + + qry->hasSubLinks = pstate->p_hasSubLinks; + qry->hasWindowFuncs = pstate->p_hasWindowFuncs; + qry->hasTargetSRFs = pstate->p_hasTargetSRFs; + qry->hasAggs = pstate->p_hasAggs; + + foreach(l, sstmt->lockingClause) + { + transformLockingClause(pstate, qry, + (LockingClause *) lfirst(l), false); + } + + assign_query_collations(pstate, qry); + + /* this must be done after collations, for reliable comparison of exprs */ + if (pstate->p_hasAggs || qry->groupClause || qry->groupingSets || qry->havingQual) + parseCheckAggregates(pstate, qry); + + return qry; +} + + /* * transformDeclareCursorStmt - * transform a DECLARE CURSOR Statement diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 4c58b46651..9101435f7b 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -294,7 +294,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type <node> select_no_parens select_with_parens select_clause simple_select values_clause - PLpgSQL_Expr + PLpgSQL_Expr PLAssignStmt %type <node> alter_column_default opclass_item opclass_drop alter_using %type <ival> add_drop opt_asc_desc opt_nulls_order @@ -536,7 +536,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type <str> ColId ColLabel BareColLabel %type <str> NonReservedWord NonReservedWord_or_Sconst %type <str> var_name type_function_name param_name -%type <str> createdb_opt_name +%type <str> createdb_opt_name plassign_target %type <node> var_value zone_value %type <rolespec> auth_ident RoleSpec opt_granted_by @@ -733,6 +733,9 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); */ %token MODE_TYPE_NAME %token MODE_PLPGSQL_EXPR +%token MODE_PLPGSQL_ASSIGN1 +%token MODE_PLPGSQL_ASSIGN2 +%token MODE_PLPGSQL_ASSIGN3 /* Precedence: lowest to highest */ @@ -815,6 +818,27 @@ stmtblock: stmtmulti pg_yyget_extra(yyscanner)->parsetree = list_make1(makeRawStmt($2, 0)); } + | MODE_PLPGSQL_ASSIGN1 PLAssignStmt + { + PLAssignStmt *n = (PLAssignStmt *) $2; + n->nnames = 1; + pg_yyget_extra(yyscanner)->parsetree = + list_make1(makeRawStmt((Node *) n, 0)); + } + | MODE_PLPGSQL_ASSIGN2 PLAssignStmt + { + PLAssignStmt *n = (PLAssignStmt *) $2; + n->nnames = 2; + pg_yyget_extra(yyscanner)->parsetree = + list_make1(makeRawStmt((Node *) n, 0)); + } + | MODE_PLPGSQL_ASSIGN3 PLAssignStmt + { + PLAssignStmt *n = (PLAssignStmt *) $2; + n->nnames = 3; + pg_yyget_extra(yyscanner)->parsetree = + list_make1(makeRawStmt((Node *) n, 0)); + } ; /* @@ -15069,6 +15093,31 @@ PLpgSQL_Expr: opt_target_list } ; +/* + * PL/pgSQL Assignment statement: name opt_indirection := PLpgSQL_Expr + */ + +PLAssignStmt: plassign_target opt_indirection plassign_equals PLpgSQL_Expr + { + PLAssignStmt *n = makeNode(PLAssignStmt); + + n->name = $1; + n->indirection = check_indirection($2, yyscanner); + /* nnames will be filled by calling production */ + n->val = (SelectStmt *) $4; + n->location = @1; + $$ = (Node *) n; + } + ; + +plassign_target: ColId { $$ = $1; } + | PARAM { $$ = psprintf("$%d", $1); } + ; + +plassign_equals: COLON_EQUALS + | '=' + ; + /* * Name classification hierarchy. diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c index da6c3ae4b5..462e2af74e 100644 --- a/src/backend/parser/parse_coerce.c +++ b/src/backend/parser/parse_coerce.c @@ -2815,6 +2815,14 @@ find_coercion_pathway(Oid targetTypeId, Oid sourceTypeId, } } + /* + * When parsing PL/pgSQL assignments, allow an I/O cast to be used + * whenever no normal coercion is available. + */ + if (result == COERCION_PATH_NONE && + ccontext == COERCION_PLPGSQL) + result = COERCION_PATH_COERCEVIAIO; + return result; } diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c index 3dda8e2847..8f2b0f18e0 100644 --- a/src/backend/parser/parse_target.c +++ b/src/backend/parser/parse_target.c @@ -34,17 +34,6 @@ static void markTargetListOrigin(ParseState *pstate, TargetEntry *tle, Var *var, int levelsup); -static Node *transformAssignmentIndirection(ParseState *pstate, - Node *basenode, - const char *targetName, - bool targetIsSubscripting, - Oid targetTypeId, - int32 targetTypMod, - Oid targetCollation, - List *indirection, - ListCell *indirection_cell, - Node *rhs, - int location); static Node *transformAssignmentSubscripts(ParseState *pstate, Node *basenode, const char *targetName, @@ -56,6 +45,7 @@ static Node *transformAssignmentSubscripts(ParseState *pstate, List *indirection, ListCell *next_indirection, Node *rhs, + CoercionContext ccontext, int location); static List *ExpandColumnRefStar(ParseState *pstate, ColumnRef *cref, bool make_target_entry); @@ -561,6 +551,7 @@ transformAssignedExpr(ParseState *pstate, indirection, list_head(indirection), (Node *) expr, + COERCION_ASSIGNMENT, location); } else @@ -642,15 +633,15 @@ updateTargetListEntry(ParseState *pstate, /* * Process indirection (field selection or subscripting) of the target - * column in INSERT/UPDATE. This routine recurses for multiple levels - * of indirection --- but note that several adjacent A_Indices nodes in - * the indirection list are treated as a single multidimensional subscript + * column in INSERT/UPDATE/assignment. This routine recurses for multiple + * levels of indirection --- but note that several adjacent A_Indices nodes + * in the indirection list are treated as a single multidimensional subscript * operation. * * In the initial call, basenode is a Var for the target column in UPDATE, - * or a null Const of the target's type in INSERT. In recursive calls, - * basenode is NULL, indicating that a substitute node should be consed up if - * needed. + * or a null Const of the target's type in INSERT, or a Param for the target + * variable in PL/pgSQL assignment. In recursive calls, basenode is NULL, + * indicating that a substitute node should be consed up if needed. * * targetName is the name of the field or subfield we're assigning to, and * targetIsSubscripting is true if we're subscripting it. These are just for @@ -667,12 +658,16 @@ updateTargetListEntry(ParseState *pstate, * rhs is the already-transformed value to be assigned; note it has not been * coerced to any particular type. * + * ccontext is the coercion level to use while coercing the rhs. For + * normal statements it'll be COERCION_ASSIGNMENT, but PL/pgSQL uses + * a special value. + * * location is the cursor error position for any errors. (Note: this points * to the head of the target clause, eg "foo" in "foo.bar[baz]". Later we * might want to decorate indirection cells with their own location info, * in which case the location argument could probably be dropped.) */ -static Node * +Node * transformAssignmentIndirection(ParseState *pstate, Node *basenode, const char *targetName, @@ -683,6 +678,7 @@ transformAssignmentIndirection(ParseState *pstate, List *indirection, ListCell *indirection_cell, Node *rhs, + CoercionContext ccontext, int location) { Node *result; @@ -757,6 +753,7 @@ transformAssignmentIndirection(ParseState *pstate, indirection, i, rhs, + ccontext, location); } @@ -807,6 +804,7 @@ transformAssignmentIndirection(ParseState *pstate, indirection, lnext(indirection, i), rhs, + ccontext, location); /* and build a FieldStore node */ @@ -845,6 +843,7 @@ transformAssignmentIndirection(ParseState *pstate, indirection, NULL, rhs, + ccontext, location); } @@ -853,7 +852,7 @@ transformAssignmentIndirection(ParseState *pstate, result = coerce_to_target_type(pstate, rhs, exprType(rhs), targetTypeId, targetTypMod, - COERCION_ASSIGNMENT, + ccontext, COERCE_IMPLICIT_CAST, -1); if (result == NULL) @@ -898,6 +897,7 @@ transformAssignmentSubscripts(ParseState *pstate, List *indirection, ListCell *next_indirection, Node *rhs, + CoercionContext ccontext, int location) { Node *result; @@ -949,6 +949,7 @@ transformAssignmentSubscripts(ParseState *pstate, indirection, next_indirection, rhs, + ccontext, location); /* @@ -969,7 +970,7 @@ transformAssignmentSubscripts(ParseState *pstate, result = coerce_to_target_type(pstate, result, resulttype, targetTypeId, targetTypMod, - COERCION_ASSIGNMENT, + ccontext, COERCE_IMPLICIT_CAST, -1); /* can fail if we had int2vector/oidvector, but not for true domains */ diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c index 71df8ef022..291706c593 100644 --- a/src/backend/parser/parser.c +++ b/src/backend/parser/parser.c @@ -58,7 +58,10 @@ raw_parser(const char *str, RawParseMode mode) static const int mode_token[] = { 0, /* RAW_PARSE_DEFAULT */ MODE_TYPE_NAME, /* RAW_PARSE_TYPE_NAME */ - MODE_PLPGSQL_EXPR /* RAW_PARSE_PLPGSQL_EXPR */ + MODE_PLPGSQL_EXPR, /* RAW_PARSE_PLPGSQL_EXPR */ + MODE_PLPGSQL_ASSIGN1, /* RAW_PARSE_PLPGSQL_ASSIGN1 */ + MODE_PLPGSQL_ASSIGN2, /* RAW_PARSE_PLPGSQL_ASSIGN2 */ + MODE_PLPGSQL_ASSIGN3 /* RAW_PARSE_PLPGSQL_ASSIGN3 */ }; yyextra.have_lookahead = true; diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c index a42ead7d69..fdcabe6a48 100644 --- a/src/backend/tcop/utility.c +++ b/src/backend/tcop/utility.c @@ -2313,6 +2313,10 @@ CreateCommandTag(Node *parsetree) tag = CMDTAG_SELECT; break; + case T_PLAssignStmt: + tag = CMDTAG_SELECT; + break; + /* utility statements --- same whether raw or cooked */ case T_TransactionStmt: { @@ -3181,6 +3185,10 @@ GetCommandLogLevel(Node *parsetree) lev = LOGSTMT_ALL; break; + case T_PLAssignStmt: + lev = LOGSTMT_ALL; + break; + /* utility statements --- same whether raw or cooked */ case T_TransactionStmt: lev = LOGSTMT_ALL; diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index 3684f87a88..2d445d03db 100644 --- a/src/include/nodes/nodes.h +++ b/src/include/nodes/nodes.h @@ -314,6 +314,7 @@ typedef enum NodeTag T_DeleteStmt, T_UpdateStmt, T_SelectStmt, + T_PLAssignStmt, T_AlterTableStmt, T_AlterTableCmd, T_AlterDomainStmt, diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 48a79a7657..8e09a457ab 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -1675,6 +1675,25 @@ typedef struct SetOperationStmt } SetOperationStmt; +/* ---------------------- + * PL/pgSQL Assignment Statement + * + * Like SelectStmt, this is transformed into a SELECT Query. + * However, the targetlist of the result looks more like an UPDATE. + * ---------------------- + */ +typedef struct PLAssignStmt +{ + NodeTag type; + + char *name; /* initial column name */ + List *indirection; /* subscripts and field names, if any */ + int nnames; /* number of names to use in ColumnRef */ + SelectStmt *val; /* the PL/pgSQL expression to assign */ + int location; /* name's token location, or -1 if unknown */ +} PLAssignStmt; + + /***************************************************************************** * Other Statements (no optimizations required) * diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h index dd85908fe2..71a58804dc 100644 --- a/src/include/nodes/primnodes.h +++ b/src/include/nodes/primnodes.h @@ -457,6 +457,7 @@ typedef enum CoercionContext { COERCION_IMPLICIT, /* coercion in context of expression */ COERCION_ASSIGNMENT, /* coercion in context of assignment */ + COERCION_PLPGSQL, /* if no assignment cast, use CoerceViaIO */ COERCION_EXPLICIT /* explicit cast operation */ } CoercionContext; diff --git a/src/include/parser/parse_target.h b/src/include/parser/parse_target.h index 7039df29cb..5ce8c3666e 100644 --- a/src/include/parser/parse_target.h +++ b/src/include/parser/parse_target.h @@ -36,6 +36,18 @@ extern void updateTargetListEntry(ParseState *pstate, TargetEntry *tle, char *colname, int attrno, List *indirection, int location); +extern Node *transformAssignmentIndirection(ParseState *pstate, + Node *basenode, + const char *targetName, + bool targetIsSubscripting, + Oid targetTypeId, + int32 targetTypMod, + Oid targetCollation, + List *indirection, + ListCell *indirection_cell, + Node *rhs, + CoercionContext ccontext, + int location); extern List *checkInsertTargets(ParseState *pstate, List *cols, List **attrnos); extern TupleDesc expandRecordVariable(ParseState *pstate, Var *var, diff --git a/src/include/parser/parser.h b/src/include/parser/parser.h index 3b7dab17ef..ac89f2b93f 100644 --- a/src/include/parser/parser.h +++ b/src/include/parser/parser.h @@ -29,12 +29,19 @@ * * RAW_PARSE_PLPGSQL_EXPR: parse a PL/pgSQL expression, and return * a one-element List containing a RawStmt node. + * + * RAW_PARSE_PLPGSQL_ASSIGNn: parse a PL/pgSQL assignment statement, + * and return a one-element List containing a RawStmt node. "n" + * gives the number of dotted names comprising the target ColumnRef. */ typedef enum { RAW_PARSE_DEFAULT = 0, RAW_PARSE_TYPE_NAME, - RAW_PARSE_PLPGSQL_EXPR + RAW_PARSE_PLPGSQL_EXPR, + RAW_PARSE_PLPGSQL_ASSIGN1, + RAW_PARSE_PLPGSQL_ASSIGN2, + RAW_PARSE_PLPGSQL_ASSIGN3 } RawParseMode; /* Values for the backslash_quote GUC */ diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl index 2b254c2b77..cea6dd6517 100644 --- a/src/interfaces/ecpg/preproc/parse.pl +++ b/src/interfaces/ecpg/preproc/parse.pl @@ -71,7 +71,10 @@ my %replace_types = ( 'type_function_name' => 'ignore', 'ColLabel' => 'ignore', 'Sconst' => 'ignore', - 'PLpgSQL_Expr' => 'ignore',); + 'PLpgSQL_Expr' => 'ignore', + 'PLAssignStmt' => 'ignore', + 'plassign_target' => 'ignore', + 'plassign_equals' => 'ignore',); # these replace_line commands excise certain keywords from the core keyword # lists. Be sure to account for these in ColLabel and related productions. diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index f56dcd0e79..cb5c7f9fea 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -8008,10 +8008,14 @@ get_cast_hashentry(PLpgSQL_execstate *estate, placeholder->collation = get_typcollation(srctype); /* - * Apply coercion. We use ASSIGNMENT coercion because that's the - * closest match to plpgsql's historical behavior; in particular, - * EXPLICIT coercion would allow silent truncation to a destination - * varchar/bpchar's length, which we do not want. + * Apply coercion. We use the special coercion context + * COERCION_PLPGSQL to match plpgsql's historical behavior, namely + * that any cast not available at ASSIGNMENT level will be implemented + * as an I/O coercion. (It's somewhat dubious that we prefer I/O + * coercion over cast pathways that exist at EXPLICIT level. Changing + * that would cause assorted minor behavioral differences though, and + * a user who wants the explicit-cast behavior can always write an + * explicit cast.) * * If source type is UNKNOWN, coerce_to_target_type will fail (it only * expects to see that for Const input nodes), so don't call it; we'll @@ -8024,7 +8028,7 @@ get_cast_hashentry(PLpgSQL_execstate *estate, cast_expr = coerce_to_target_type(NULL, (Node *) placeholder, srctype, dsttype, dsttypmod, - COERCION_ASSIGNMENT, + COERCION_PLPGSQL, COERCE_IMPLICIT_CAST, -1); @@ -8032,7 +8036,8 @@ get_cast_hashentry(PLpgSQL_execstate *estate, * If there's no cast path according to the parser, fall back to using * an I/O coercion; this is semantically dubious but matches plpgsql's * historical behavior. We would need something of the sort for - * UNKNOWN literals in any case. + * UNKNOWN literals in any case. (This is probably now only reachable + * in the case where srctype is UNKNOWN/RECORD.) */ if (cast_expr == NULL) { @@ -8341,7 +8346,8 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) return; /* - * Top level of expression must be a simple FuncExpr or OpExpr. + * Top level of expression must be a simple FuncExpr, OpExpr, or + * SubscriptingRef. */ if (IsA(expr->expr_simple_expr, FuncExpr)) { @@ -8357,6 +8363,33 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) funcid = opexpr->opfuncid; fargs = opexpr->args; } + else if (IsA(expr->expr_simple_expr, SubscriptingRef)) + { + SubscriptingRef *sbsref = (SubscriptingRef *) expr->expr_simple_expr; + + /* We only trust standard varlena arrays to be safe */ + if (get_typsubscript(sbsref->refcontainertype, NULL) != + F_ARRAY_SUBSCRIPT_HANDLER) + return; + + /* refexpr can be a simple Param, otherwise must not contain target */ + if (!(sbsref->refexpr && IsA(sbsref->refexpr, Param)) && + contains_target_param((Node *) sbsref->refexpr, &target_dno)) + return; + + /* the other subexpressions must not contain target */ + if (contains_target_param((Node *) sbsref->refupperindexpr, + &target_dno) || + contains_target_param((Node *) sbsref->reflowerindexpr, + &target_dno) || + contains_target_param((Node *) sbsref->refassgnexpr, + &target_dno)) + return; + + /* OK, we can pass target as a read-write parameter */ + expr->rwparam = target_dno; + return; + } else return; diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index 3b36220d73..051544a3b4 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -973,16 +973,40 @@ stmt_call : K_CALL } ; -stmt_assign : assign_var assign_operator expr_until_semi +stmt_assign : T_DATUM { PLpgSQL_stmt_assign *new; + RawParseMode pmode; + /* see how many names identify the datum */ + switch ($1.ident ? 1 : list_length($1.idents)) + { + case 1: + pmode = RAW_PARSE_PLPGSQL_ASSIGN1; + break; + case 2: + pmode = RAW_PARSE_PLPGSQL_ASSIGN2; + break; + case 3: + pmode = RAW_PARSE_PLPGSQL_ASSIGN3; + break; + default: + elog(ERROR, "unexpected number of names"); + pmode = 0; /* keep compiler quiet */ + } + + check_assignable($1.datum, @1); new = palloc0(sizeof(PLpgSQL_stmt_assign)); new->cmd_type = PLPGSQL_STMT_ASSIGN; new->lineno = plpgsql_location_to_lineno(@1); new->stmtid = ++plpgsql_curr_compile->nstatements; - new->varno = $1->dno; - new->expr = $3; + new->varno = $1.datum->dno; + /* Push back the head name to include it in the stmt */ + plpgsql_push_back_token(T_DATUM); + new->expr = read_sql_construct(';', 0, 0, ";", + pmode, + false, true, true, + NULL, NULL); $$ = (PLpgSQL_stmt *)new; } diff --git a/contrib/hstore/expected/hstore.out b/contrib/hstore/expected/hstore.out index fdcc3920ce..64a3272b9c 100644 --- a/contrib/hstore/expected/hstore.out +++ b/contrib/hstore/expected/hstore.out @@ -1583,6 +1583,10 @@ select f2 from test_json_agg; "d"=>NULL, "x"=>"xyzzy" (3 rows) +-- Test subscripting in plpgsql +do $$ declare h hstore; +begin h['a'] := 'b'; raise notice 'h = %, h[a] = %', h, h['a']; end $$; +NOTICE: h = "a"=>"b", h[a] = b -- Check the hstore_hash() and hstore_hash_extended() function explicitly. SELECT v as value, hstore_hash(v)::bit(32) as standard, hstore_hash_extended(v, 0)::bit(32) as extended0, diff --git a/contrib/hstore/sql/hstore.sql b/contrib/hstore/sql/hstore.sql index 8d96e30403..a59db66b0a 100644 --- a/contrib/hstore/sql/hstore.sql +++ b/contrib/hstore/sql/hstore.sql @@ -372,6 +372,10 @@ select f2['d':'e'] from test_json_agg; -- error update test_json_agg set f2['d'] = f2['e'], f2['x'] = 'xyzzy'; select f2 from test_json_agg; +-- Test subscripting in plpgsql +do $$ declare h hstore; +begin h['a'] := 'b'; raise notice 'h = %, h[a] = %', h, h['a']; end $$; + -- Check the hstore_hash() and hstore_hash_extended() function explicitly. SELECT v as value, hstore_hash(v)::bit(32) as standard, hstore_hash_extended(v, 0)::bit(32) as extended0, diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml index 9ec168b0c4..32c466eaa5 100644 --- a/doc/src/sgml/plpgsql.sgml +++ b/doc/src/sgml/plpgsql.sgml @@ -946,8 +946,8 @@ PREPARE <replaceable>statement_name</replaceable>(integer, integer) AS SELECT $1 database engine. The expression must yield a single value (possibly a row value, if the variable is a row or record variable). The target variable can be a simple variable (optionally qualified with a block - name), a field of a row or record variable, or an element of an array - that is a simple variable or field. Equal (<literal>=</literal>) can be + name), a field of a row or record target, or an element or slice of + an array target. Equal (<literal>=</literal>) can be used instead of PL/SQL-compliant <literal>:=</literal>. </para> @@ -968,8 +968,25 @@ PREPARE <replaceable>statement_name</replaceable>(integer, integer) AS SELECT $1 <programlisting> tax := subtotal * 0.06; my_record.user_id := 20; +my_array[j] := 20; +my_array[1:3] := array[1,2,3]; +complex_array[n].realpart = 12.3; </programlisting> </para> + + <para> + It's useful to know that what follows the assignment operator is + essentially treated as a <literal>SELECT</literal> command; as long + as it returns a single row and column, it will work. Thus for example + one can write something like +<programlisting> +total_sales := sum(quantity) from sales; +</programlisting> + This provides an effect similar to the single-row <literal>SELECT + ... INTO</literal> syntax described in + <xref linkend="plpgsql-statements-sql-onerow"/>. However, that syntax + is more portable. + </para> </sect2> <sect2 id="plpgsql-statements-sql-noresult"> diff --git a/src/pl/plpgsql/src/Makefile b/src/pl/plpgsql/src/Makefile index 193df8a010..9946abbc1d 100644 --- a/src/pl/plpgsql/src/Makefile +++ b/src/pl/plpgsql/src/Makefile @@ -32,7 +32,7 @@ DATA = plpgsql.control plpgsql--1.0.sql REGRESS_OPTS = --dbname=$(PL_TESTDB) -REGRESS = plpgsql_call plpgsql_control plpgsql_copy plpgsql_domain \ +REGRESS = plpgsql_array plpgsql_call plpgsql_control plpgsql_copy plpgsql_domain \ plpgsql_record plpgsql_cache plpgsql_simple plpgsql_transaction \ plpgsql_trap plpgsql_trigger plpgsql_varprops diff --git a/src/pl/plpgsql/src/expected/plpgsql_array.out b/src/pl/plpgsql/src/expected/plpgsql_array.out new file mode 100644 index 0000000000..5f28b4f685 --- /dev/null +++ b/src/pl/plpgsql/src/expected/plpgsql_array.out @@ -0,0 +1,94 @@ +-- +-- Tests for PL/pgSQL handling of array variables +-- +-- We also check arrays of composites here, so this has some overlap +-- with the plpgsql_record tests. +-- +create type complex as (r float8, i float8); +create type quadarray as (c1 complex[], c2 complex); +do $$ declare a int[]; +begin a := array[1,2]; a[3] := 4; raise notice 'a = %', a; end$$; +NOTICE: a = {1,2,4} +do $$ declare a int[]; +begin a[3] := 4; raise notice 'a = %', a; end$$; +NOTICE: a = [3:3]={4} +do $$ declare a int[]; +begin a[1][4] := 4; raise notice 'a = %', a; end$$; +NOTICE: a = [1:1][4:4]={{4}} +do $$ declare a int[]; +begin a[1] := 23::text; raise notice 'a = %', a; end$$; -- lax typing +NOTICE: a = {23} +do $$ declare a int[]; +begin a := array[1,2]; a[2:3] := array[3,4]; raise notice 'a = %', a; end$$; +NOTICE: a = {1,3,4} +do $$ declare a int[]; +begin a := array[1,2]; a[2] := a[2] + 1; raise notice 'a = %', a; end$$; +NOTICE: a = {1,3} +do $$ declare a int[]; +begin a[1:2] := array[3,4]; raise notice 'a = %', a; end$$; +NOTICE: a = {3,4} +do $$ declare a int[]; +begin a[1:2] := 4; raise notice 'a = %', a; end$$; -- error +ERROR: malformed array literal: "4" +DETAIL: Array value must start with "{" or dimension information. +CONTEXT: PL/pgSQL function inline_code_block line 2 at assignment +do $$ declare a complex[]; +begin a[1] := (1,2); a[1].i := 11; raise notice 'a = %', a; end$$; +NOTICE: a = {"(1,11)"} +do $$ declare a complex[]; +begin a[1].i := 11; raise notice 'a = %, a[1].i = %', a, a[1].i; end$$; +NOTICE: a = {"(,11)"}, a[1].i = 11 +-- perhaps this ought to work, but for now it doesn't: +do $$ declare a complex[]; +begin a[1:2].i := array[11,12]; raise notice 'a = %', a; end$$; +ERROR: cannot assign to field "i" of column "a" because its type complex[] is not a composite type +LINE 1: a[1:2].i := array[11,12] + ^ +QUERY: a[1:2].i := array[11,12] +CONTEXT: PL/pgSQL function inline_code_block line 2 at assignment +do $$ declare a quadarray; +begin a.c1[1].i := 11; raise notice 'a = %, a.c1[1].i = %', a, a.c1[1].i; end$$; +NOTICE: a = ("{""(,11)""}",), a.c1[1].i = 11 +do $$ declare a int[]; +begin a := array_agg(x) from (values(1),(2),(3)) v(x); raise notice 'a = %', a; end$$; +NOTICE: a = {1,2,3} +create temp table onecol as select array[1,2] as f1; +do $$ declare a int[]; +begin a := f1 from onecol; raise notice 'a = %', a; end$$; +NOTICE: a = {1,2} +do $$ declare a int[]; +begin a := * from onecol for update; raise notice 'a = %', a; end$$; +NOTICE: a = {1,2} +-- error cases: +do $$ declare a int[]; +begin a := from onecol; raise notice 'a = %', a; end$$; +ERROR: assignment source returned 0 columns +CONTEXT: PL/pgSQL assignment "a := from onecol" +PL/pgSQL function inline_code_block line 2 at assignment +do $$ declare a int[]; +begin a := f1, f1 from onecol; raise notice 'a = %', a; end$$; +ERROR: assignment source returned 2 columns +CONTEXT: PL/pgSQL assignment "a := f1, f1 from onecol" +PL/pgSQL function inline_code_block line 2 at assignment +insert into onecol values(array[11]); +do $$ declare a int[]; +begin a := f1 from onecol; raise notice 'a = %', a; end$$; +ERROR: query "a := f1 from onecol" returned more than one row +CONTEXT: PL/pgSQL function inline_code_block line 2 at assignment +do $$ declare a int[]; +begin a := f1 from onecol limit 1; raise notice 'a = %', a; end$$; +NOTICE: a = {1,2} +do $$ declare a real; +begin a[1] := 2; raise notice 'a = %', a; end$$; +ERROR: cannot subscript type real because it does not support subscripting +LINE 1: a[1] := 2 + ^ +QUERY: a[1] := 2 +CONTEXT: PL/pgSQL function inline_code_block line 2 at assignment +do $$ declare a complex; +begin a.r[1] := 2; raise notice 'a = %', a; end$$; +ERROR: cannot subscript type double precision because it does not support subscripting +LINE 1: a.r[1] := 2 + ^ +QUERY: a.r[1] := 2 +CONTEXT: PL/pgSQL function inline_code_block line 2 at assignment diff --git a/src/pl/plpgsql/src/expected/plpgsql_record.out b/src/pl/plpgsql/src/expected/plpgsql_record.out index 52207e9b10..6e835c0751 100644 --- a/src/pl/plpgsql/src/expected/plpgsql_record.out +++ b/src/pl/plpgsql/src/expected/plpgsql_record.out @@ -3,6 +3,7 @@ -- create type two_int4s as (f1 int4, f2 int4); create type two_int8s as (q1 int8, q2 int8); +create type nested_int8s as (c1 two_int8s, c2 two_int8s); -- base-case return of a composite type create function retc(int) returns two_int8s language plpgsql as $$ begin return row($1,1)::two_int8s; end $$; @@ -82,6 +83,88 @@ begin end$$; NOTICE: c4 = (1,2) NOTICE: c8 = (1,2) +do $$ declare c two_int8s; d nested_int8s; +begin + c := row(1,2); + d := row(c, row(c.q1, c.q2+1)); + raise notice 'c = %, d = %', c, d; + c.q1 := 10; + d.c1 := row(11,12); + d.c2.q2 := 42; + raise notice 'c = %, d = %', c, d; + raise notice 'c.q1 = %, d.c2 = %', c.q1, d.c2; + raise notice '(d).c2.q2 = %', (d).c2.q2; -- doesn't work without parens + raise notice '(d.c2).q2 = %', (d.c2).q2; -- doesn't work without parens +end$$; +NOTICE: c = (1,2), d = ("(1,2)","(1,3)") +NOTICE: c = (10,2), d = ("(11,12)","(1,42)") +NOTICE: c.q1 = 10, d.c2 = (1,42) +NOTICE: (d).c2.q2 = 42 +NOTICE: (d.c2).q2 = 42 +-- block-qualified naming +do $$ <<b>> declare c two_int8s; d nested_int8s; +begin + b.c := row(1,2); + b.d := row(b.c, row(b.c.q1, b.c.q2+1)); + raise notice 'b.c = %, b.d = %', b.c, b.d; + b.c.q1 := 10; + b.d.c1 := row(11,12); + b.d.c2.q2 := 42; + raise notice 'b.c = %, b.d = %', b.c, b.d; + raise notice 'b.c.q1 = %, b.d.c2 = %', b.c.q1, b.d.c2; + raise notice '(b.d).c2.q2 = %', (b.d).c2.q2; -- doesn't work without parens + raise notice '(b.d.c2).q2 = %', (b.d.c2).q2; -- doesn't work without parens +end$$; +NOTICE: b.c = (1,2), b.d = ("(1,2)","(1,3)") +NOTICE: b.c = (10,2), b.d = ("(11,12)","(1,42)") +NOTICE: b.c.q1 = 10, b.d.c2 = (1,42) +NOTICE: (b.d).c2.q2 = 42 +NOTICE: (b.d.c2).q2 = 42 +-- error cases +do $$ declare c two_int8s; begin c.x = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "c.x = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ declare c nested_int8s; begin c.x = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "c.x = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ declare c nested_int8s; begin c.x.q1 = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "c.x.q1 = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ declare c nested_int8s; begin c.c2.x = 1; end $$; +ERROR: cannot assign to field "x" of column "c" because there is no such column in data type two_int8s +LINE 1: c.c2.x = 1 + ^ +QUERY: c.c2.x = 1 +CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment +do $$ declare c nested_int8s; begin d.c2.x = 1; end $$; +ERROR: "d.c2.x" is not a known variable +LINE 1: do $$ declare c nested_int8s; begin d.c2.x = 1; end $$; + ^ +do $$ <<b>> declare c two_int8s; begin b.c.x = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "b.c.x = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ <<b>> declare c nested_int8s; begin b.c.x = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "b.c.x = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ <<b>> declare c nested_int8s; begin b.c.x.q1 = 1; end $$; +ERROR: record "c" has no field "x" +CONTEXT: PL/pgSQL assignment "b.c.x.q1 = 1" +PL/pgSQL function inline_code_block line 1 at assignment +do $$ <<b>> declare c nested_int8s; begin b.c.c2.x = 1; end $$; +ERROR: cannot assign to field "x" of column "b" because there is no such column in data type two_int8s +LINE 1: b.c.c2.x = 1 + ^ +QUERY: b.c.c2.x = 1 +CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment +do $$ <<b>> declare c nested_int8s; begin b.d.c2.x = 1; end $$; +ERROR: "b.d.c2" is not a known variable +LINE 1: do $$ <<b>> declare c nested_int8s; begin b.d.c2.x = 1; end ... + ^ -- check passing composite result to another function create function getq1(two_int8s) returns int8 language plpgsql as $$ declare r two_int8s; begin r := $1; return r.q1; end $$; diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c index b610b28d70..0225f5911d 100644 --- a/src/pl/plpgsql/src/pl_comp.c +++ b/src/pl/plpgsql/src/pl_comp.c @@ -1456,7 +1456,8 @@ plpgsql_parse_dblword(char *word1, char *word2, /* * We should do nothing in DECLARE sections. In SQL expressions, we * really only need to make sure that RECFIELD datums are created when - * needed. + * needed. In all the cases handled by this function, returning a T_DATUM + * with a two-word idents string is the right thing. */ if (plpgsql_IdentifierLookup != IDENTIFIER_LOOKUP_DECLARE) { @@ -1530,40 +1531,53 @@ plpgsql_parse_tripword(char *word1, char *word2, char *word3, List *idents; int nnames; - idents = list_make3(makeString(word1), - makeString(word2), - makeString(word3)); - /* - * We should do nothing in DECLARE sections. In SQL expressions, we - * really only need to make sure that RECFIELD datums are created when - * needed. + * We should do nothing in DECLARE sections. In SQL expressions, we need + * to make sure that RECFIELD datums are created when needed, and we need + * to be careful about how many names are reported as belonging to the + * T_DATUM: the third word could be a sub-field reference, which we don't + * care about here. */ if (plpgsql_IdentifierLookup != IDENTIFIER_LOOKUP_DECLARE) { /* - * Do a lookup in the current namespace stack. Must find a qualified + * Do a lookup in the current namespace stack. Must find a record * reference, else ignore. */ ns = plpgsql_ns_lookup(plpgsql_ns_top(), false, word1, word2, word3, &nnames); - if (ns != NULL && nnames == 2) + if (ns != NULL) { switch (ns->itemtype) { case PLPGSQL_NSTYPE_REC: { - /* - * words 1/2 are a record name, so third word could be - * a field in this record. - */ PLpgSQL_rec *rec; PLpgSQL_recfield *new; rec = (PLpgSQL_rec *) (plpgsql_Datums[ns->itemno]); - new = plpgsql_build_recfield(rec, word3); - + if (nnames == 1) + { + /* + * First word is a record name, so second word + * could be a field in this record (and the third, + * a sub-field). We build a RECFIELD datum + * whether it is or not --- any error will be + * detected later. + */ + new = plpgsql_build_recfield(rec, word2); + idents = list_make2(makeString(word1), + makeString(word2)); + } + else + { + /* Block-qualified reference to record variable. */ + new = plpgsql_build_recfield(rec, word3); + idents = list_make3(makeString(word1), + makeString(word2), + makeString(word3)); + } wdatum->datum = (PLpgSQL_datum *) new; wdatum->ident = NULL; wdatum->quoted = false; /* not used */ @@ -1578,6 +1592,9 @@ plpgsql_parse_tripword(char *word1, char *word2, char *word3, } /* Nothing found */ + idents = list_make3(makeString(word1), + makeString(word2), + makeString(word3)); cword->idents = idents; return false; } diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index 051544a3b4..dece013e23 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -2004,7 +2004,7 @@ loop_body : proc_sect K_END K_LOOP opt_label ';' * variable. (The composite case is probably a syntax error, but we'll let * the core parser decide that.) Normally, we should assume that such a * word is a SQL statement keyword that isn't also a plpgsql keyword. - * However, if the next token is assignment or '[', it can't be a valid + * However, if the next token is assignment or '[' or '.', it can't be a valid * SQL statement, and what we're probably looking at is an intended variable * assignment. Give an appropriate complaint for that, instead of letting * the core parser throw an unhelpful "syntax error". @@ -2023,7 +2023,8 @@ stmt_execsql : K_IMPORT tok = yylex(); plpgsql_push_back_token(tok); - if (tok == '=' || tok == COLON_EQUALS || tok == '[') + if (tok == '=' || tok == COLON_EQUALS || + tok == '[' || tok == '.') word_is_not_variable(&($1), @1); $$ = make_execsql_stmt(T_WORD, @1); } @@ -2033,7 +2034,8 @@ stmt_execsql : K_IMPORT tok = yylex(); plpgsql_push_back_token(tok); - if (tok == '=' || tok == COLON_EQUALS || tok == '[') + if (tok == '=' || tok == COLON_EQUALS || + tok == '[' || tok == '.') cword_is_not_variable(&($1), @1); $$ = make_execsql_stmt(T_CWORD, @1); } diff --git a/src/pl/plpgsql/src/sql/plpgsql_array.sql b/src/pl/plpgsql/src/sql/plpgsql_array.sql new file mode 100644 index 0000000000..4c3f26be10 --- /dev/null +++ b/src/pl/plpgsql/src/sql/plpgsql_array.sql @@ -0,0 +1,79 @@ +-- +-- Tests for PL/pgSQL handling of array variables +-- +-- We also check arrays of composites here, so this has some overlap +-- with the plpgsql_record tests. +-- + +create type complex as (r float8, i float8); +create type quadarray as (c1 complex[], c2 complex); + +do $$ declare a int[]; +begin a := array[1,2]; a[3] := 4; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a[3] := 4; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a[1][4] := 4; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a[1] := 23::text; raise notice 'a = %', a; end$$; -- lax typing + +do $$ declare a int[]; +begin a := array[1,2]; a[2:3] := array[3,4]; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a := array[1,2]; a[2] := a[2] + 1; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a[1:2] := array[3,4]; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a[1:2] := 4; raise notice 'a = %', a; end$$; -- error + +do $$ declare a complex[]; +begin a[1] := (1,2); a[1].i := 11; raise notice 'a = %', a; end$$; + +do $$ declare a complex[]; +begin a[1].i := 11; raise notice 'a = %, a[1].i = %', a, a[1].i; end$$; + +-- perhaps this ought to work, but for now it doesn't: +do $$ declare a complex[]; +begin a[1:2].i := array[11,12]; raise notice 'a = %', a; end$$; + +do $$ declare a quadarray; +begin a.c1[1].i := 11; raise notice 'a = %, a.c1[1].i = %', a, a.c1[1].i; end$$; + +do $$ declare a int[]; +begin a := array_agg(x) from (values(1),(2),(3)) v(x); raise notice 'a = %', a; end$$; + +create temp table onecol as select array[1,2] as f1; + +do $$ declare a int[]; +begin a := f1 from onecol; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a := * from onecol for update; raise notice 'a = %', a; end$$; + +-- error cases: + +do $$ declare a int[]; +begin a := from onecol; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a := f1, f1 from onecol; raise notice 'a = %', a; end$$; + +insert into onecol values(array[11]); + +do $$ declare a int[]; +begin a := f1 from onecol; raise notice 'a = %', a; end$$; + +do $$ declare a int[]; +begin a := f1 from onecol limit 1; raise notice 'a = %', a; end$$; + +do $$ declare a real; +begin a[1] := 2; raise notice 'a = %', a; end$$; + +do $$ declare a complex; +begin a.r[1] := 2; raise notice 'a = %', a; end$$; diff --git a/src/pl/plpgsql/src/sql/plpgsql_record.sql b/src/pl/plpgsql/src/sql/plpgsql_record.sql index 128846e610..be10f00b1e 100644 --- a/src/pl/plpgsql/src/sql/plpgsql_record.sql +++ b/src/pl/plpgsql/src/sql/plpgsql_record.sql @@ -4,6 +4,7 @@ create type two_int4s as (f1 int4, f2 int4); create type two_int8s as (q1 int8, q2 int8); +create type nested_int8s as (c1 two_int8s, c2 two_int8s); -- base-case return of a composite type create function retc(int) returns two_int8s language plpgsql as @@ -59,6 +60,47 @@ begin raise notice 'c8 = %', c8; end$$; +do $$ declare c two_int8s; d nested_int8s; +begin + c := row(1,2); + d := row(c, row(c.q1, c.q2+1)); + raise notice 'c = %, d = %', c, d; + c.q1 := 10; + d.c1 := row(11,12); + d.c2.q2 := 42; + raise notice 'c = %, d = %', c, d; + raise notice 'c.q1 = %, d.c2 = %', c.q1, d.c2; + raise notice '(d).c2.q2 = %', (d).c2.q2; -- doesn't work without parens + raise notice '(d.c2).q2 = %', (d.c2).q2; -- doesn't work without parens +end$$; + +-- block-qualified naming +do $$ <<b>> declare c two_int8s; d nested_int8s; +begin + b.c := row(1,2); + b.d := row(b.c, row(b.c.q1, b.c.q2+1)); + raise notice 'b.c = %, b.d = %', b.c, b.d; + b.c.q1 := 10; + b.d.c1 := row(11,12); + b.d.c2.q2 := 42; + raise notice 'b.c = %, b.d = %', b.c, b.d; + raise notice 'b.c.q1 = %, b.d.c2 = %', b.c.q1, b.d.c2; + raise notice '(b.d).c2.q2 = %', (b.d).c2.q2; -- doesn't work without parens + raise notice '(b.d.c2).q2 = %', (b.d.c2).q2; -- doesn't work without parens +end$$; + +-- error cases +do $$ declare c two_int8s; begin c.x = 1; end $$; +do $$ declare c nested_int8s; begin c.x = 1; end $$; +do $$ declare c nested_int8s; begin c.x.q1 = 1; end $$; +do $$ declare c nested_int8s; begin c.c2.x = 1; end $$; +do $$ declare c nested_int8s; begin d.c2.x = 1; end $$; +do $$ <<b>> declare c two_int8s; begin b.c.x = 1; end $$; +do $$ <<b>> declare c nested_int8s; begin b.c.x = 1; end $$; +do $$ <<b>> declare c nested_int8s; begin b.c.x.q1 = 1; end $$; +do $$ <<b>> declare c nested_int8s; begin b.c.c2.x = 1; end $$; +do $$ <<b>> declare c nested_int8s; begin b.d.c2.x = 1; end $$; + -- check passing composite result to another function create function getq1(two_int8s) returns int8 language plpgsql as $$ declare r two_int8s; begin r := $1; return r.q1; end $$; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index cb5c7f9fea..5c1db1dcfb 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -1311,12 +1311,11 @@ copy_plpgsql_datums(PLpgSQL_execstate *estate, case PLPGSQL_DTYPE_ROW: case PLPGSQL_DTYPE_RECFIELD: - case PLPGSQL_DTYPE_ARRAYELEM: /* * These datum records are read-only at runtime, so no need to - * copy them (well, RECFIELD and ARRAYELEM contain cached - * data, but we'd just as soon centralize the caching anyway). + * copy them (well, RECFIELD contains cached data, but we'd + * just as soon centralize the caching anyway). */ outdatum = indatum; break; @@ -4138,9 +4137,6 @@ plpgsql_estate_setup(PLpgSQL_execstate *estate, * * NB: the result of the evaluation is no longer valid after this is done, * unless it is a pass-by-value datatype. - * - * NB: if you change this code, see also the hacks in exec_assign_value's - * PLPGSQL_DTYPE_ARRAYELEM case for partial cleanup after subscript evals. * ---------- */ static void @@ -5290,198 +5286,6 @@ exec_assign_value(PLpgSQL_execstate *estate, break; } - case PLPGSQL_DTYPE_ARRAYELEM: - { - /* - * Target is an element of an array - */ - PLpgSQL_arrayelem *arrayelem; - int nsubscripts; - int i; - PLpgSQL_expr *subscripts[MAXDIM]; - int subscriptvals[MAXDIM]; - Datum oldarraydatum, - newarraydatum, - coerced_value; - bool oldarrayisnull; - Oid parenttypoid; - int32 parenttypmod; - SPITupleTable *save_eval_tuptable; - MemoryContext oldcontext; - - /* - * We need to do subscript evaluation, which might require - * evaluating general expressions; and the caller might have - * done that too in order to prepare the input Datum. We have - * to save and restore the caller's SPI_execute result, if - * any. - */ - save_eval_tuptable = estate->eval_tuptable; - estate->eval_tuptable = NULL; - - /* - * To handle constructs like x[1][2] := something, we have to - * be prepared to deal with a chain of arrayelem datums. Chase - * back to find the base array datum, and save the subscript - * expressions as we go. (We are scanning right to left here, - * but want to evaluate the subscripts left-to-right to - * minimize surprises.) Note that arrayelem is left pointing - * to the leftmost arrayelem datum, where we will cache the - * array element type data. - */ - nsubscripts = 0; - do - { - arrayelem = (PLpgSQL_arrayelem *) target; - if (nsubscripts >= MAXDIM) - ereport(ERROR, - (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), - errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)", - nsubscripts + 1, MAXDIM))); - subscripts[nsubscripts++] = arrayelem->subscript; - target = estate->datums[arrayelem->arrayparentno]; - } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); - - /* Fetch current value of array datum */ - exec_eval_datum(estate, target, - &parenttypoid, &parenttypmod, - &oldarraydatum, &oldarrayisnull); - - /* Update cached type data if necessary */ - if (arrayelem->parenttypoid != parenttypoid || - arrayelem->parenttypmod != parenttypmod) - { - Oid arraytypoid; - int32 arraytypmod = parenttypmod; - int16 arraytyplen; - Oid elemtypoid; - int16 elemtyplen; - bool elemtypbyval; - char elemtypalign; - - /* If target is domain over array, reduce to base type */ - arraytypoid = getBaseTypeAndTypmod(parenttypoid, - &arraytypmod); - - /* ... and identify the element type */ - elemtypoid = get_element_type(arraytypoid); - if (!OidIsValid(elemtypoid)) - ereport(ERROR, - (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("subscripted object is not an array"))); - - /* Collect needed data about the types */ - arraytyplen = get_typlen(arraytypoid); - - get_typlenbyvalalign(elemtypoid, - &elemtyplen, - &elemtypbyval, - &elemtypalign); - - /* Now safe to update the cached data */ - arrayelem->parenttypoid = parenttypoid; - arrayelem->parenttypmod = parenttypmod; - arrayelem->arraytypoid = arraytypoid; - arrayelem->arraytypmod = arraytypmod; - arrayelem->arraytyplen = arraytyplen; - arrayelem->elemtypoid = elemtypoid; - arrayelem->elemtyplen = elemtyplen; - arrayelem->elemtypbyval = elemtypbyval; - arrayelem->elemtypalign = elemtypalign; - } - - /* - * Evaluate the subscripts, switch into left-to-right order. - * Like the expression built by ExecInitSubscriptingRef(), - * complain if any subscript is null. - */ - for (i = 0; i < nsubscripts; i++) - { - bool subisnull; - - subscriptvals[i] = - exec_eval_integer(estate, - subscripts[nsubscripts - 1 - i], - &subisnull); - if (subisnull) - ereport(ERROR, - (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), - errmsg("array subscript in assignment must not be null"))); - - /* - * Clean up in case the subscript expression wasn't - * simple. We can't do exec_eval_cleanup, but we can do - * this much (which is safe because the integer subscript - * value is surely pass-by-value), and we must do it in - * case the next subscript expression isn't simple either. - */ - if (estate->eval_tuptable != NULL) - SPI_freetuptable(estate->eval_tuptable); - estate->eval_tuptable = NULL; - } - - /* Now we can restore caller's SPI_execute result if any. */ - Assert(estate->eval_tuptable == NULL); - estate->eval_tuptable = save_eval_tuptable; - - /* Coerce source value to match array element type. */ - coerced_value = exec_cast_value(estate, - value, - &isNull, - valtype, - valtypmod, - arrayelem->elemtypoid, - arrayelem->arraytypmod); - - /* - * If the original array is null, cons up an empty array so - * that the assignment can proceed; we'll end with a - * one-element array containing just the assigned-to - * subscript. This only works for varlena arrays, though; for - * fixed-length array types we skip the assignment. We can't - * support assignment of a null entry into a fixed-length - * array, either, so that's a no-op too. This is all ugly but - * corresponds to the current behavior of execExpr*.c. - */ - if (arrayelem->arraytyplen > 0 && /* fixed-length array? */ - (oldarrayisnull || isNull)) - return; - - /* empty array, if any, and newarraydatum are short-lived */ - oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate)); - - if (oldarrayisnull) - oldarraydatum = PointerGetDatum(construct_empty_array(arrayelem->elemtypoid)); - - /* - * Build the modified array value. - */ - newarraydatum = array_set_element(oldarraydatum, - nsubscripts, - subscriptvals, - coerced_value, - isNull, - arrayelem->arraytyplen, - arrayelem->elemtyplen, - arrayelem->elemtypbyval, - arrayelem->elemtypalign); - - MemoryContextSwitchTo(oldcontext); - - /* - * Assign the new array to the base variable. It's never NULL - * at this point. Note that if the target is a domain, - * coercing the base array type back up to the domain will - * happen within exec_assign_value. - */ - exec_assign_value(estate, target, - newarraydatum, - false, - arrayelem->arraytypoid, - arrayelem->arraytypmod); - break; - } - default: elog(ERROR, "unrecognized dtype: %d", target->dtype); } @@ -5492,8 +5296,8 @@ exec_assign_value(PLpgSQL_execstate *estate, * * The type oid, typmod, value in Datum format, and null flag are returned. * - * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums; - * that's not needed because we never pass references to such datums to SPI. + * At present this doesn't handle PLpgSQL_expr datums; that's not needed + * because we never pass references to such datums to SPI. * * NOTE: the returned Datum points right at the stored value in the case of * pass-by-reference datatypes. Generally callers should take care not to diff --git a/src/pl/plpgsql/src/pl_funcs.c b/src/pl/plpgsql/src/pl_funcs.c index ee60ced583..17895872c0 100644 --- a/src/pl/plpgsql/src/pl_funcs.c +++ b/src/pl/plpgsql/src/pl_funcs.c @@ -768,9 +768,6 @@ plpgsql_free_function_memory(PLpgSQL_function *func) break; case PLPGSQL_DTYPE_RECFIELD: break; - case PLPGSQL_DTYPE_ARRAYELEM: - free_expr(((PLpgSQL_arrayelem *) d)->subscript); - break; default: elog(ERROR, "unrecognized data type: %d", d->dtype); } @@ -1704,12 +1701,6 @@ plpgsql_dumptree(PLpgSQL_function *func) ((PLpgSQL_recfield *) d)->fieldname, ((PLpgSQL_recfield *) d)->recparentno); break; - case PLPGSQL_DTYPE_ARRAYELEM: - printf("ARRAYELEM of VAR %d subscript ", - ((PLpgSQL_arrayelem *) d)->arrayparentno); - dump_expr(((PLpgSQL_arrayelem *) d)->subscript); - printf("\n"); - break; default: printf("??? unknown data type %d\n", d->dtype); } diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index dece013e23..4e7df1c2bb 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -177,11 +177,10 @@ static void check_raise_parameters(PLpgSQL_stmt_raise *stmt); %type <list> decl_cursor_arglist %type <nsitem> decl_aliasitem -%type <expr> expr_until_semi expr_until_rightbracket +%type <expr> expr_until_semi %type <expr> expr_until_then expr_until_loop opt_expr_until_when %type <expr> opt_exitcond -%type <datum> assign_var %type <var> cursor_variable %type <datum> decl_cursor_arg %type <forvariable> for_variable @@ -1155,16 +1154,23 @@ getdiag_item : } ; -getdiag_target : assign_var +getdiag_target : T_DATUM { - if ($1->dtype == PLPGSQL_DTYPE_ROW || - $1->dtype == PLPGSQL_DTYPE_REC) + /* + * In principle we should support a getdiag_target + * that is an array element, but for now we don't, so + * just throw an error if next token is '['. + */ + if ($1.datum->dtype == PLPGSQL_DTYPE_ROW || + $1.datum->dtype == PLPGSQL_DTYPE_REC || + plpgsql_peek() == '[') ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), errmsg("\"%s\" is not a scalar variable", - ((PLpgSQL_variable *) $1)->refname), + NameOfDatum(&($1))), parser_errposition(@1))); - $$ = $1; + check_assignable($1.datum, @1); + $$ = $1.datum; } | T_WORD { @@ -1178,29 +1184,6 @@ getdiag_target : assign_var } ; - -assign_var : T_DATUM - { - check_assignable($1.datum, @1); - $$ = $1.datum; - } - | assign_var '[' expr_until_rightbracket - { - PLpgSQL_arrayelem *new; - - new = palloc0(sizeof(PLpgSQL_arrayelem)); - new->dtype = PLPGSQL_DTYPE_ARRAYELEM; - new->subscript = $3; - new->arrayparentno = $1->dno; - /* initialize cached type data to "not valid" */ - new->parenttypoid = InvalidOid; - - plpgsql_adddatum((PLpgSQL_datum *) new); - - $$ = (PLpgSQL_datum *) new; - } - ; - stmt_if : K_IF expr_until_then proc_sect stmt_elsifs stmt_else K_END K_IF ';' { PLpgSQL_stmt_if *new; @@ -2471,10 +2454,6 @@ expr_until_semi : { $$ = read_sql_expression(';', ";"); } ; -expr_until_rightbracket : - { $$ = read_sql_expression(']', "]"); } - ; - expr_until_then : { $$ = read_sql_expression(K_THEN, "THEN"); } ; @@ -3493,11 +3472,6 @@ check_assignable(PLpgSQL_datum *datum, int location) check_assignable(plpgsql_Datums[((PLpgSQL_recfield *) datum)->recparentno], location); break; - case PLPGSQL_DTYPE_ARRAYELEM: - /* assignable if parent array is */ - check_assignable(plpgsql_Datums[((PLpgSQL_arrayelem *) datum)->arrayparentno], - location); - break; default: elog(ERROR, "unrecognized dtype: %d", datum->dtype); break; diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index d152a4354b..a7791dc490 100644 --- a/src/pl/plpgsql/src/plpgsql.h +++ b/src/pl/plpgsql/src/plpgsql.h @@ -64,7 +64,6 @@ typedef enum PLpgSQL_datum_type PLPGSQL_DTYPE_ROW, PLPGSQL_DTYPE_REC, PLPGSQL_DTYPE_RECFIELD, - PLPGSQL_DTYPE_ARRAYELEM, PLPGSQL_DTYPE_PROMISE } PLpgSQL_datum_type; @@ -261,7 +260,7 @@ typedef struct PLpgSQL_expr * Generic datum array item * * PLpgSQL_datum is the common supertype for PLpgSQL_var, PLpgSQL_row, - * PLpgSQL_rec, PLpgSQL_recfield, and PLpgSQL_arrayelem. + * PLpgSQL_rec, and PLpgSQL_recfield. */ typedef struct PLpgSQL_datum { @@ -422,30 +421,6 @@ typedef struct PLpgSQL_recfield /* if rectupledescid == INVALID_TUPLEDESC_IDENTIFIER, finfo isn't valid */ } PLpgSQL_recfield; -/* - * Element of array variable - */ -typedef struct PLpgSQL_arrayelem -{ - PLpgSQL_datum_type dtype; - int dno; - /* end of PLpgSQL_datum fields */ - - PLpgSQL_expr *subscript; - int arrayparentno; /* dno of parent array variable */ - - /* Remaining fields are cached info about the array variable's type */ - Oid parenttypoid; /* type of array variable; 0 if not yet set */ - int32 parenttypmod; /* typmod of array variable */ - Oid arraytypoid; /* OID of actual array type */ - int32 arraytypmod; /* typmod of array (and its elements too) */ - int16 arraytyplen; /* typlen of array type */ - Oid elemtypoid; /* OID of array element type */ - int16 elemtyplen; /* typlen of element type */ - bool elemtypbyval; /* element type is pass-by-value? */ - char elemtypalign; /* typalign of element type */ -} PLpgSQL_arrayelem; - /* * Item in the compilers namespace tree */ diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out index 6ae1b6e3d4..5a0bd1610c 100644 --- a/src/test/regress/expected/plpgsql.out +++ b/src/test/regress/expected/plpgsql.out @@ -4230,7 +4230,6 @@ drop function tftest(int); create or replace function rttest() returns setof int as $$ declare rc int; - rca int[]; begin return query values(10),(20); get diagnostics rc = row_count; @@ -4239,12 +4238,11 @@ begin get diagnostics rc = row_count; raise notice '% %', found, rc; return query execute 'values(10),(20)'; - -- just for fun, let's use array elements as targets - get diagnostics rca[1] = row_count; - raise notice '% %', found, rca[1]; + get diagnostics rc = row_count; + raise notice '% %', found, rc; return query execute 'select * from (values(10),(20)) f(a) where false'; - get diagnostics rca[2] = row_count; - raise notice '% %', found, rca[2]; + get diagnostics rc = row_count; + raise notice '% %', found, rc; end; $$ language plpgsql; select * from rttest(); diff --git a/src/test/regress/sql/plpgsql.sql b/src/test/regress/sql/plpgsql.sql index 07c60c80e4..781666a83a 100644 --- a/src/test/regress/sql/plpgsql.sql +++ b/src/test/regress/sql/plpgsql.sql @@ -3497,7 +3497,6 @@ drop function tftest(int); create or replace function rttest() returns setof int as $$ declare rc int; - rca int[]; begin return query values(10),(20); get diagnostics rc = row_count; @@ -3506,12 +3505,11 @@ begin get diagnostics rc = row_count; raise notice '% %', found, rc; return query execute 'values(10),(20)'; - -- just for fun, let's use array elements as targets - get diagnostics rca[1] = row_count; - raise notice '% %', found, rca[1]; + get diagnostics rc = row_count; + raise notice '% %', found, rc; return query execute 'select * from (values(10),(20)) f(a) where false'; - get diagnostics rca[2] = row_count; - raise notice '% %', found, rca[2]; + get diagnostics rc = row_count; + raise notice '% %', found, rc; end; $$ language plpgsql; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index 03e0049af8..aa2e52fa40 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -333,8 +333,7 @@ static void exec_prepare_plan(PLpgSQL_execstate *estate, bool keepplan); static void exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr); static void exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan); -static void exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno); -static bool contains_target_param(Node *node, int *target_dno); +static void exec_check_rw_parameter(PLpgSQL_expr *expr); static bool exec_eval_simple_expr(PLpgSQL_execstate *estate, PLpgSQL_expr *expr, Datum *result, @@ -4190,13 +4189,6 @@ exec_prepare_plan(PLpgSQL_execstate *estate, /* Check to see if it's a simple expression */ exec_simple_check_plan(estate, expr); - - /* - * Mark expression as not using a read-write param. exec_assign_value has - * to take steps to override this if appropriate; that seems cleaner than - * adding parameters to all other callers. - */ - expr->rwparam = -1; } @@ -5024,16 +5016,23 @@ exec_assign_expr(PLpgSQL_execstate *estate, PLpgSQL_datum *target, int32 valtypmod; /* - * If first time through, create a plan for this expression, and then see - * if we can pass the target variable as a read-write parameter to the - * expression. (This is a bit messy, but it seems cleaner than modifying - * the API of exec_eval_expr for the purpose.) + * If first time through, create a plan for this expression. */ if (expr->plan == NULL) { - exec_prepare_plan(estate, expr, 0, true); + /* + * Mark the expression as being an assignment source, if target is a + * simple variable. (This is a bit messy, but it seems cleaner than + * modifying the API of exec_prepare_plan for the purpose. We need to + * stash the target dno into the expr anyway, so it is available if we + * have to replan.) + */ if (target->dtype == PLPGSQL_DTYPE_VAR) - exec_check_rw_parameter(expr, target->dno); + expr->target_param = target->dno; + else + expr->target_param = -1; /* should be that already */ + + exec_prepare_plan(estate, expr, 0, true); } value = exec_eval_expr(estate, expr, &isnull, &valtype, &valtypmod); @@ -6098,6 +6097,7 @@ exec_eval_simple_expr(PLpgSQL_execstate *estate, ReleaseCachedPlan(cplan, true); /* Mark expression as non-simple, and fail */ expr->expr_simple_expr = NULL; + expr->expr_rw_param = NULL; return false; } @@ -6109,10 +6109,6 @@ exec_eval_simple_expr(PLpgSQL_execstate *estate, /* Extract desired scalar expression from cached plan */ exec_save_simple_expr(expr, cplan); - - /* better recheck r/w safety, as it could change due to inlining */ - if (expr->rwparam >= 0) - exec_check_rw_parameter(expr, expr->rwparam); } /* @@ -6385,20 +6381,18 @@ plpgsql_param_fetch(ParamListInfo params, prm->pflags = PARAM_FLAG_CONST; /* - * If it's a read/write expanded datum, convert reference to read-only, - * unless it's safe to pass as read-write. + * If it's a read/write expanded datum, convert reference to read-only. + * (There's little point in trying to optimize read/write parameters, + * given the cases in which this function is used.) */ - if (dno != expr->rwparam) - { - if (datum->dtype == PLPGSQL_DTYPE_VAR) - prm->value = MakeExpandedObjectReadOnly(prm->value, - prm->isnull, - ((PLpgSQL_var *) datum)->datatype->typlen); - else if (datum->dtype == PLPGSQL_DTYPE_REC) - prm->value = MakeExpandedObjectReadOnly(prm->value, - prm->isnull, - -1); - } + if (datum->dtype == PLPGSQL_DTYPE_VAR) + prm->value = MakeExpandedObjectReadOnly(prm->value, + prm->isnull, + ((PLpgSQL_var *) datum)->datatype->typlen); + else if (datum->dtype == PLPGSQL_DTYPE_REC) + prm->value = MakeExpandedObjectReadOnly(prm->value, + prm->isnull, + -1); return prm; } @@ -6441,7 +6435,7 @@ plpgsql_param_compile(ParamListInfo params, Param *param, */ if (datum->dtype == PLPGSQL_DTYPE_VAR) { - if (dno != expr->rwparam && + if (param != expr->expr_rw_param && ((PLpgSQL_var *) datum)->datatype->typlen == -1) scratch.d.cparam.paramfunc = plpgsql_param_eval_var_ro; else @@ -6451,14 +6445,14 @@ plpgsql_param_compile(ParamListInfo params, Param *param, scratch.d.cparam.paramfunc = plpgsql_param_eval_recfield; else if (datum->dtype == PLPGSQL_DTYPE_PROMISE) { - if (dno != expr->rwparam && + if (param != expr->expr_rw_param && ((PLpgSQL_var *) datum)->datatype->typlen == -1) scratch.d.cparam.paramfunc = plpgsql_param_eval_generic_ro; else scratch.d.cparam.paramfunc = plpgsql_param_eval_generic; } else if (datum->dtype == PLPGSQL_DTYPE_REC && - dno != expr->rwparam) + param != expr->expr_rw_param) scratch.d.cparam.paramfunc = plpgsql_param_eval_generic_ro; else scratch.d.cparam.paramfunc = plpgsql_param_eval_generic; @@ -7930,6 +7924,7 @@ exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr) * Initialize to "not simple". */ expr->expr_simple_expr = NULL; + expr->expr_rw_param = NULL; /* * Check the analyzed-and-rewritten form of the query to see if we will be @@ -8108,6 +8103,12 @@ exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan) expr->expr_simple_typmod = exprTypmod((Node *) tle_expr); /* We also want to remember if it is immutable or not */ expr->expr_simple_mutable = contain_mutable_functions((Node *) tle_expr); + + /* + * Lastly, check to see if there's a possibility of optimizing a + * read/write parameter. + */ + exec_check_rw_parameter(expr); } /* @@ -8119,25 +8120,36 @@ exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan) * value as a read/write pointer and let the function modify the value * in-place. * - * This function checks for a safe expression, and sets expr->rwparam to the - * dno of the target variable (x) if safe, or -1 if not safe. + * This function checks for a safe expression, and sets expr->expr_rw_param + * to the address of any Param within the expression that can be passed as + * read/write (there can be only one); or to NULL when there is no safe Param. + * + * Note that this mechanism intentionally applies the safety labeling to just + * one Param; the expression could contain other Params referencing the target + * variable, but those must still be treated as read-only. + * + * Also note that we only apply this optimization within simple expressions. + * There's no point in it for non-simple expressions, because the + * exec_run_select code path will flatten any expanded result anyway. + * Also, it's safe to assume that an expr_simple_expr tree won't get copied + * somewhere before it gets compiled, so that looking for pointer equality + * to expr_rw_param will work for matching the target Param. That'd be much + * shakier in the general case. */ static void -exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) +exec_check_rw_parameter(PLpgSQL_expr *expr) { + int target_dno; Oid funcid; List *fargs; ListCell *lc; /* Assume unsafe */ - expr->rwparam = -1; + expr->expr_rw_param = NULL; - /* - * If the expression isn't simple, there's no point in trying to optimize - * (because the exec_run_select code path will flatten any expanded result - * anyway). Even without that, this seems like a good safety restriction. - */ - if (expr->expr_simple_expr == NULL) + /* Done if expression isn't an assignment source */ + target_dno = expr->target_param; + if (target_dno < 0) return; /* @@ -8147,9 +8159,12 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) if (!bms_is_member(target_dno, expr->paramnos)) return; + /* Shouldn't be here for non-simple expression */ + Assert(expr->expr_simple_expr != NULL); + /* * Top level of expression must be a simple FuncExpr, OpExpr, or - * SubscriptingRef. + * SubscriptingRef, else we can't optimize. */ if (IsA(expr->expr_simple_expr, FuncExpr)) { @@ -8174,22 +8189,20 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) F_ARRAY_SUBSCRIPT_HANDLER) return; - /* refexpr can be a simple Param, otherwise must not contain target */ - if (!(sbsref->refexpr && IsA(sbsref->refexpr, Param)) && - contains_target_param((Node *) sbsref->refexpr, &target_dno)) - return; + /* We can optimize the refexpr if it's the target, otherwise not */ + if (sbsref->refexpr && IsA(sbsref->refexpr, Param)) + { + Param *param = (Param *) sbsref->refexpr; - /* the other subexpressions must not contain target */ - if (contains_target_param((Node *) sbsref->refupperindexpr, - &target_dno) || - contains_target_param((Node *) sbsref->reflowerindexpr, - &target_dno) || - contains_target_param((Node *) sbsref->refassgnexpr, - &target_dno)) - return; + if (param->paramkind == PARAM_EXTERN && + param->paramid == target_dno + 1) + { + /* Found the Param we want to pass as read/write */ + expr->expr_rw_param = param; + return; + } + } - /* OK, we can pass target as a read-write parameter */ - expr->rwparam = target_dno; return; } else @@ -8205,44 +8218,28 @@ exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) return; /* - * The target variable (in the form of a Param) must only appear as a - * direct argument of the top-level function. + * The target variable (in the form of a Param) must appear as a direct + * argument of the top-level function. References further down in the + * tree can't be optimized; but on the other hand, they don't invalidate + * optimizing the top-level call, since that will be executed last. */ foreach(lc, fargs) { Node *arg = (Node *) lfirst(lc); - /* A Param is OK, whether it's the target variable or not */ if (arg && IsA(arg, Param)) - continue; - /* Otherwise, argument expression must not reference target */ - if (contains_target_param(arg, &target_dno)) - return; - } - - /* OK, we can pass target as a read-write parameter */ - expr->rwparam = target_dno; -} - -/* - * Recursively check for a Param referencing the target variable - */ -static bool -contains_target_param(Node *node, int *target_dno) -{ - if (node == NULL) - return false; - if (IsA(node, Param)) - { - Param *param = (Param *) node; + { + Param *param = (Param *) arg; - if (param->paramkind == PARAM_EXTERN && - param->paramid == *target_dno + 1) - return true; - return false; + if (param->paramkind == PARAM_EXTERN && + param->paramid == target_dno + 1) + { + /* Found the Param we want to pass as read/write */ + expr->expr_rw_param = param; + return; + } + } } - return expression_tree_walker(node, contains_target_param, - (void *) target_dno); } /* ---------- diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index 4e7df1c2bb..8a46ebbda8 100644 --- a/src/pl/plpgsql/src/pl_gram.y +++ b/src/pl/plpgsql/src/pl_gram.y @@ -2820,7 +2820,7 @@ read_sql_construct(int until, expr->parseMode = parsemode; expr->plan = NULL; expr->paramnos = NULL; - expr->rwparam = -1; + expr->target_param = -1; expr->ns = plpgsql_ns_top(); pfree(ds.data); @@ -3067,7 +3067,7 @@ make_execsql_stmt(int firsttoken, int location) expr->parseMode = RAW_PARSE_DEFAULT; expr->plan = NULL; expr->paramnos = NULL; - expr->rwparam = -1; + expr->target_param = -1; expr->ns = plpgsql_ns_top(); pfree(ds.data); @@ -3949,7 +3949,7 @@ read_cursor_args(PLpgSQL_var *cursor, int until) expr->parseMode = RAW_PARSE_PLPGSQL_EXPR; expr->plan = NULL; expr->paramnos = NULL; - expr->rwparam = -1; + expr->target_param = -1; expr->ns = plpgsql_ns_top(); pfree(ds.data); diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index a7791dc490..8e8aa50e3c 100644 --- a/src/pl/plpgsql/src/plpgsql.h +++ b/src/pl/plpgsql/src/plpgsql.h @@ -221,7 +221,6 @@ typedef struct PLpgSQL_expr RawParseMode parseMode; /* raw_parser() mode to use */ SPIPlanPtr plan; /* plan, or NULL if not made yet */ Bitmapset *paramnos; /* all dnos referenced by this query */ - int rwparam; /* dno of read/write param, or -1 if none */ /* function containing this expr (not set until we first parse query) */ struct PLpgSQL_function *func; @@ -235,6 +234,17 @@ typedef struct PLpgSQL_expr int32 expr_simple_typmod; /* result typmod, if simple */ bool expr_simple_mutable; /* true if simple expr is mutable */ + /* + * These fields are used to optimize assignments to expanded-datum + * variables. If this expression is the source of an assignment to a + * simple variable, target_param holds that variable's dno; else it's -1. + * If we match a Param within expr_simple_expr to such a variable, that + * Param's address is stored in expr_rw_param; then expression code + * generation will allow the value for that Param to be passed read/write. + */ + int target_param; /* dno of assign target, or -1 if none */ + Param *expr_rw_param; /* read/write Param within expr, if any */ + /* * If the expression was ever determined to be simple, we remember its * CachedPlanSource and CachedPlan here. If expr_simple_plan_lxid matches
út 15. 12. 2020 v 21:18 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I realized that the speedup patch I posted yesterday is flawed: it's
too aggressive about applying the R/W param mechanism, instead of
not aggressive enough.
To review, the point of that logic is that if we have an assignment
like
arrayvar := array_append(arrayvar, some-scalar-expression);
a naive implementation would have array_append construct an entire
new array, which we'd then have to copy into plpgsql's variable
storage. Instead, if the array variable is in expanded-array
format (which plpgsql encourages it to be) then we can pass the
array parameter as a "read/write expanded datum", which array_append
recognizes as license to scribble right on its input and return the
modified input; that takes only O(1) time not O(N). Then plpgsql's
assignment code notices that the expression result datum is the same
pointer already stored in the variable, so it does nothing.
With the patch at hand, a subscripted assignment a[i] := x becomes,
essentially,
a := subscriptingref(a, i, x);
and we need to make the same sort of transformation to allow
array_set_element to scribble right on the original value of "a"
instead of making a copy.
However, we can't simply not consider the source expression "x",
as I proposed yesterday. For example, if we have
a := subscriptingref(a, i, f(array_append(a, x)));
it's not okay for array_append() to scribble on "a". The R/W
param mechanism normally disallows any additional references to
the target variable, which would prevent this error, but I broke
that safety check with the 0007 patch.
After thinking about this awhile, I decided that plpgsql's R/W param
mechanism is really misdesigned. Instead of requiring the assignment
source expression to be such that *all* its references to the target
variable could be passed as R/W, we really want to identify *one*
reference to the target variable to be passed as R/W, allowing any other
ones to be passed read/only as they would be by default. As long as the
R/W reference is a direct argument to the top-level (hence last to be
executed) function in the expression, there is no harm in R/O references
being passed to other lower parts of the expression. Nor is there any
use-case for more than one argument of the top-level function being R/W.
So the attached rewrite of the 0007 patch reimplements that logic to
identify one single Param that references the target variable, and
make only that Param pass a read/write reference, not any other
Params referencing the target variable. This is a good change even
without considering the assignment-reimplementation proposal, because
even before this patchset we could have cases like
arrayvar := array_append(arrayvar, arrayvar[i]);
The existing code would be afraid to optimize this, but it's in fact
safe.
I also re-attach the 0001-0006 patches, which have not changed, just
to keep the cfbot happy.
I run some performance tests and it looks very well.
regards, tom lane
Hi
I repeated tests. I wrote a set of simple functions. It is a synthetical test, but I think it can identify potential problems well.
I calculated the average of 3 cycles and I checked the performance of each function. I didn't find any problem. The total execution time is well too. Patched code is about 11% faster than master (14sec x 15.8sec). So there is new important functionality with nice performance benefits.
make check-world passed
Regards
Pavel
Attachment
so 26. 12. 2020 v 19:00 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
HiI repeated tests. I wrote a set of simple functions. It is a synthetical test, but I think it can identify potential problems well.I calculated the average of 3 cycles and I checked the performance of each function. I didn't find any problem. The total execution time is well too. Patched code is about 11% faster than master (14sec x 15.8sec). So there is new important functionality with nice performance benefits.make check-world passed
I played with plpgsql_check tests and again I didn't find any significant issue of this patch. I am very satisfied with implementation.
Now, the behavior of SELECT INTO is behind the assign statement and this fact should be documented. Usually we don't need to use array's fields here, but somebody can try it.
Regards
Pavel
RegardsPavel
Pavel Stehule <pavel.stehule@gmail.com> writes: > Now, the behavior of SELECT INTO is behind the assign statement and this > fact should be documented. Usually we don't need to use array's fields > here, but somebody can try it. It's been behind all along --- this patch didn't really change that. But I don't mind documenting it more clearly. regards, tom lane
po 28. 12. 2020 v 0:55 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Now, the behavior of SELECT INTO is behind the assign statement and this
> fact should be documented. Usually we don't need to use array's fields
> here, but somebody can try it.
It's been behind all along --- this patch didn't really change that.
But I don't mind documenting it more clearly.
ok
Pavel
regards, tom lane
Hi
I continue in review.
I found inconsistency in work with slicings (this is not directly related to this patch, but can be interesting, because with new functionality the array slicings can be edited more often).
a = array[1,2,3,4,5];
a[1:5] = 10; -- correctly fails, although for some people can be more natural semantic setting a[1..5] to value 10
a[1:5] = NULL; does nothing - no fail, no value change ??? Is it correct
a[1:5] = ARRAY[1]; -- correctly fails ERROR: source array too small
but
a[1:5] = ARRAY[1,2,3,4,5,6]; -- this statement works, but 6 is ignored. Is it correct? I expected "source array too big"
More, this behave is not documented
anything other looks well, all tests passed, and in my benchmarks I don't see any slowdowns , so I'll mark this patch as ready for committer
Regards
Pavel
Pavel Stehule <pavel.stehule@gmail.com> writes: > I found inconsistency in work with slicings (this is not directly related > to this patch, but can be interesting, because with new functionality the > array slicings can be edited more often). > a = array[1,2,3,4,5]; > a[1:5] = 10; -- correctly fails, although for some people can be more > natural semantic setting a[1..5] to value 10 > a[1:5] = NULL; does nothing - no fail, no value change ??? Is it correct > a[1:5] = ARRAY[1]; -- correctly fails ERROR: source array too small > but > a[1:5] = ARRAY[1,2,3,4,5,6]; -- this statement works, but 6 is ignored. Is > it correct? I expected "source array too big" Hm. All of these behaviors have existed for a long time in the context of UPDATE statements: regression=# create table t1 (a int[]); CREATE TABLE regression=# insert into t1 values(array[1,2,3,4,5]); INSERT 0 1 regression=# table t1; a ------------- {1,2,3,4,5} (1 row) regression=# update t1 set a[1:5] = 10; ERROR: subscripted assignment to "a" requires type integer[] but expression is of type integer regression=# update t1 set a[1:5] = null; UPDATE 1 regression=# table t1; a ------------- {1,2,3,4,5} (1 row) (Note that in this example, the null is implicitly typed as int[]; so it's not like the prior example.) regression=# update t1 set a[1:5] = array[1]; ERROR: source array too small regression=# update t1 set a[1:5] = array[1,2,3,4,6,5]; UPDATE 1 regression=# table t1; a ------------- {1,2,3,4,6} (1 row) I agree this is inconsistent, but given the way this patch works, we'd have to change UPDATE's behavior if we want plpgsql to do something different. Not sure if we can get away with that. > anything other looks well, all tests passed, and in my benchmarks I don't > see any slowdowns , so I'll mark this patch as ready for committer Thanks! regards, tom lane
ne 3. 1. 2021 v 19:07 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I found inconsistency in work with slicings (this is not directly related
> to this patch, but can be interesting, because with new functionality the
> array slicings can be edited more often).
> a = array[1,2,3,4,5];
> a[1:5] = 10; -- correctly fails, although for some people can be more
> natural semantic setting a[1..5] to value 10
> a[1:5] = NULL; does nothing - no fail, no value change ??? Is it correct
> a[1:5] = ARRAY[1]; -- correctly fails ERROR: source array too small
> but
> a[1:5] = ARRAY[1,2,3,4,5,6]; -- this statement works, but 6 is ignored. Is
> it correct? I expected "source array too big"
Hm. All of these behaviors have existed for a long time in the context
of UPDATE statements:
regression=# create table t1 (a int[]);
CREATE TABLE
regression=# insert into t1 values(array[1,2,3,4,5]);
INSERT 0 1
regression=# table t1;
a
-------------
{1,2,3,4,5}
(1 row)
regression=# update t1 set a[1:5] = 10;
ERROR: subscripted assignment to "a" requires type integer[] but expression is of type integer
regression=# update t1 set a[1:5] = null;
UPDATE 1
regression=# table t1;
a
-------------
{1,2,3,4,5}
(1 row)
(Note that in this example, the null is implicitly typed as int[];
so it's not like the prior example.)
I understand
regression=# update t1 set a[1:5] = array[1];
ERROR: source array too small
regression=# update t1 set a[1:5] = array[1,2,3,4,6,5];
UPDATE 1
regression=# table t1;
a
-------------
{1,2,3,4,6}
(1 row)
I agree this is inconsistent, but given the way this patch works,
we'd have to change UPDATE's behavior if we want plpgsql to do
something different. Not sure if we can get away with that.
Yes, the UPDATE should be changed. This is not a pretty important corner case. But any inconsistency can be messy for users.
I don't see any interesting use case for current behavior, but it is a corner case.
> anything other looks well, all tests passed, and in my benchmarks I don't
> see any slowdowns , so I'll mark this patch as ready for committer
Thanks!
with pleasure
Regards
Pavel
regards, tom lane
Hi
Now, I am testing subscribing on the jsonb feature, and I found one issue, that is not supported by parser.
When the target is scalar, then all is ok. But we can have a plpgsql array of jsonb values.
postgres=# do $$
declare j jsonb[];
begin
j[1] = '{"b":"Ahoj"}';
raise notice '%', j;
raise notice '%', (j[1])['b'];
end
$$;
NOTICE: {"{\"b\": \"Ahoj\"}"}
NOTICE: "Ahoj"
DO
declare j jsonb[];
begin
j[1] = '{"b":"Ahoj"}';
raise notice '%', j;
raise notice '%', (j[1])['b'];
end
$$;
NOTICE: {"{\"b\": \"Ahoj\"}"}
NOTICE: "Ahoj"
DO
Parenthesis work well in expressions, but are not supported on the left side of assignment.
postgres=# do $$
declare j jsonb[];
begin
(j[1])['b'] = '"Ahoj"';
raise notice '%', j;
raise notice '%', j[1]['b'];
end
$$;
ERROR: syntax error at or near "("
LINE 4: (j[1])['b'] = '"Ahoj"';
^
declare j jsonb[];
begin
(j[1])['b'] = '"Ahoj"';
raise notice '%', j;
raise notice '%', j[1]['b'];
end
$$;
ERROR: syntax error at or near "("
LINE 4: (j[1])['b'] = '"Ahoj"';
^
Regards
Pavel
út 19. 1. 2021 v 19:21 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
HiNow, I am testing subscribing on the jsonb feature, and I found one issue, that is not supported by parser.When the target is scalar, then all is ok. But we can have a plpgsql array of jsonb values.postgres=# do $$
declare j jsonb[];
begin
j[1] = '{"b":"Ahoj"}';
raise notice '%', j;
raise notice '%', (j[1])['b'];
end
$$;
NOTICE: {"{\"b\": \"Ahoj\"}"}
NOTICE: "Ahoj"
DOParenthesis work well in expressions, but are not supported on the left side of assignment.postgres=# do $$
declare j jsonb[];
begin
(j[1])['b'] = '"Ahoj"';
raise notice '%', j;
raise notice '%', j[1]['b'];
end
$$;
ERROR: syntax error at or near "("
LINE 4: (j[1])['b'] = '"Ahoj"';
^
Assignment for nesting composite types is working better - although there is some inconsistency too:
create type t_inner as (x int, y int);
create type t_outer as (a t_inner, b t_inner);
do $$
declare v t_outer;
begin
v.a.x := 10; -- parenthesis not allowed here, but not required
raise notice '%', v;
raise notice '%', (v).a.x; -- parenthesis are required here
end;
$$;
declare v t_outer;
begin
v.a.x := 10; -- parenthesis not allowed here, but not required
raise notice '%', v;
raise notice '%', (v).a.x; -- parenthesis are required here
end;
$$;
Regards
Pavel
RegardsPavel