Re: SQL-standard function body - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: SQL-standard function body |
Date | |
Msg-id | 2197698.1617984583@sss.pgh.pa.us Whole thread Raw |
In response to | Re: SQL-standard function body (Julien Rouhaud <rjuju123@gmail.com>) |
Responses |
Re: SQL-standard function body
Re: SQL-standard function body |
List | pgsql-hackers |
Julien Rouhaud <rjuju123@gmail.com> writes: > On Thu, Apr 08, 2021 at 04:54:56PM +0900, Michael Paquier wrote: >> Indeed, I agree that enforcing the availability of querystring >> everywhere sounds like a sensible thing to do in terms of consistency, >> and that's my impression when I scanned the parallel execution code, >> and I don't really get why SQL function bodies should not bind by this >> rule. Would people object if I add an open item to track that? > It makes sense, +1 for an open item. So here's what I propose to do about this. 0001 attached reverts the patch's change to remove the NOT NULL constraint on pg_proc.prosrc. I think that was an extremely poor decision; it risks breaking non-core PLs, and for that matter I'm not sure the core PLs wouldn't crash on null prosrc. It is not any harder for the SQL-language-related code to condition its checks on not-null prosqlbody instead of null prosrc. Of course that then requires us to put something into prosrc for these newfangled functions, but in 0001 I just used an empty string. (This patch also adds an Assert to standard_ExecutorStart checking that some source text was provided, responding to Andres' point that we should be checking that upstream of parallel query. We should then revert b3ee4c503, but for simplicity I didn't include that here.) 0002 addresses a different missing-source-text problem, which is that the patch didn't bother to provide source text while running parse analysis on the SQL function body. That means no error cursors for problems; which might seem cosmetic on the toy example I added to the regression tests, but it won't be for people writing functions that are dozens or hundreds of lines long. Finally, 0003 might be a bit controversial: it changes the stored prosrc for new-style SQL functions to be the query text of the CREATE FUNCTION command. The main argument I can see being made against this is that it'll bloat the pg_proc entry. But I think that that's not a terribly reasonable concern, because the source text is going to be a good deal smaller than the nodeToString representation in just about every case. The real value of 0003 of course would be to get an error cursor at runtime, but I failed to create an example where that would happen today. Right now there are only three calls of executor_errposition, and all of them are for cases that are already rejected by the parser, so they're effectively unreachable. A scenario that seems more likely to be reachable is a failure reported during function inlining, but most of the reasons I can think of for that also seem unreachable given the already-parse-analyzed nature of the function body in these cases. Maybe I'm just under-caffeinated today. Another point here is that for any error cursor to appear, we need not only source text at hand but also token locations in the query tree nodes. Right now, since readfuncs.c intentionally discards those locations, we won't have that. There is not-normally-compiled logic to reload those location fields, though, and I think before too long we'll want to enable it in some mainstream cases --- notably parallel query's shipping of querytrees to workers. However, until it gets easier to reach cases where an error-with-location can be thrown from the executor, I don't feel a need to do that. I do have ambitions to make execution-time errors produce cursors in more cases, so I think this will come to fruition before long, but not in v14. One could make an argument, therefore, for holding off 0003 until there's more support for execution-time error cursors. I don't think we should though, for two reasons: 1. It'd be better to keep the pg_proc representation of new-style SQL functions stable across versions. 2. Storing the CREATE text means we'll capture comments associated with the function text, which is something that at least some people will complain about the loss of. Admittedly we have no way to re-integrate the comments into the de-parsed body, but some folks might be satisfied with grabbing the prosrc text. Thoughts? regards, tom lane diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c index cd13e63852..478dbde3fe 100644 --- a/src/backend/catalog/pg_proc.c +++ b/src/backend/catalog/pg_proc.c @@ -121,7 +121,7 @@ ProcedureCreate(const char *procedureName, /* * sanity checks */ - Assert(PointerIsValid(prosrc) || PointerIsValid(prosqlbody)); + Assert(PointerIsValid(prosrc)); parameterCount = parameterTypes->dim1; if (parameterCount < 0 || parameterCount > FUNC_MAX_ARGS) @@ -336,10 +336,7 @@ ProcedureCreate(const char *procedureName, values[Anum_pg_proc_protrftypes - 1] = trftypes; else nulls[Anum_pg_proc_protrftypes - 1] = true; - if (prosrc) - values[Anum_pg_proc_prosrc - 1] = CStringGetTextDatum(prosrc); - else - nulls[Anum_pg_proc_prosrc - 1] = true; + values[Anum_pg_proc_prosrc - 1] = CStringGetTextDatum(prosrc); if (probin) values[Anum_pg_proc_probin - 1] = CStringGetTextDatum(probin); else @@ -874,26 +871,29 @@ fmgr_sql_validator(PG_FUNCTION_ARGS) /* Postpone body checks if !check_function_bodies */ if (check_function_bodies) { + tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosrc, &isnull); + if (isnull) + elog(ERROR, "null prosrc"); + + prosrc = TextDatumGetCString(tmp); + /* * Setup error traceback support for ereport(). */ callback_arg.proname = NameStr(proc->proname); - callback_arg.prosrc = NULL; + callback_arg.prosrc = prosrc; sqlerrcontext.callback = sql_function_parse_error_callback; sqlerrcontext.arg = (void *) &callback_arg; sqlerrcontext.previous = error_context_stack; error_context_stack = &sqlerrcontext; - tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosrc, &isnull); - if (isnull) + /* If we have prosqlbody, pay attention to that not prosrc */ + tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosqlbody, &isnull); + if (!isnull) { Node *n; - tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosqlbody, &isnull); - if (isnull) - elog(ERROR, "null prosrc and prosqlbody"); - n = stringToNode(TextDatumGetCString(tmp)); if (IsA(n, List)) querytree_list = castNode(List, n); @@ -902,10 +902,6 @@ fmgr_sql_validator(PG_FUNCTION_ARGS) } else { - prosrc = TextDatumGetCString(tmp); - - callback_arg.prosrc = prosrc; - /* * We can't do full prechecking of the function definition if there * are any polymorphic input types, because actual datatypes of @@ -1001,9 +997,6 @@ function_parse_error_transpose(const char *prosrc) int newerrposition; const char *queryText; - if (!prosrc) - return false; - /* * Nothing to do unless we are dealing with a syntax error that has a * cursor position. diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c index 199029b7a8..dc317c83af 100644 --- a/src/backend/commands/functioncmds.c +++ b/src/backend/commands/functioncmds.c @@ -958,8 +958,17 @@ interpret_AS_clause(Oid languageOid, const char *languageName, *sql_body_out = (Node *) q; } + /* + * We must put something in prosrc. For the moment, just record an + * empty string. It might be useful to store the original text of the + * CREATE FUNCTION statement --- but to make actual use of that in + * error reports, we'd also have to adjust readfuncs.c to not throw + * away node location fields when reading prosqlbody. + */ + *prosrc_str_p = pstrdup(""); + + /* But we definitely don't need probin. */ *probin_str_p = NULL; - *prosrc_str_p = NULL; } else { diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index b2e2df8773..2cf6dad768 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -195,6 +195,8 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags) palloc0(nParamExec * sizeof(ParamExecData)); } + /* We now require all callers to provide sourceText */ + Assert(queryDesc->sourceText != NULL); estate->es_sourceText = queryDesc->sourceText; /* diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index 642683843e..39580f7d57 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ -667,6 +667,15 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) procedureTuple, Anum_pg_proc_prosrc, &isNull); + if (isNull) + elog(ERROR, "null prosrc for function %u", foid); + fcache->src = TextDatumGetCString(tmp); + + /* If we have prosqlbody, pay attention to that not prosrc. */ + tmp = SysCacheGetAttr(PROCOID, + procedureTuple, + Anum_pg_proc_prosqlbody, + &isNull); /* * Parse and rewrite the queries in the function text. Use sublists to @@ -678,18 +687,11 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) * plancache.c. */ queryTree_list = NIL; - if (isNull) + if (!isNull) { Node *n; List *stored_query_list; - tmp = SysCacheGetAttr(PROCOID, - procedureTuple, - Anum_pg_proc_prosqlbody, - &isNull); - if (isNull) - elog(ERROR, "null prosrc and prosqlbody for function %u", foid); - n = stringToNode(TextDatumGetCString(tmp)); if (IsA(n, List)) stored_query_list = linitial_node(List, castNode(List, n)); @@ -710,8 +712,6 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) { List *raw_parsetree_list; - fcache->src = TextDatumGetCString(tmp); - raw_parsetree_list = pg_parse_query(fcache->src); foreach(lc, raw_parsetree_list) diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index 526997327c..d9ad4efc5e 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -4317,32 +4317,37 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid, ALLOCSET_DEFAULT_SIZES); oldcxt = MemoryContextSwitchTo(mycxt); + /* Fetch the function body */ + tmp = SysCacheGetAttr(PROCOID, + func_tuple, + Anum_pg_proc_prosrc, + &isNull); + if (isNull) + elog(ERROR, "null prosrc for function %u", funcid); + src = TextDatumGetCString(tmp); + /* * Setup error traceback support for ereport(). This is so that we can * finger the function that bad information came from. */ callback_arg.proname = NameStr(funcform->proname); - callback_arg.prosrc = NULL; + callback_arg.prosrc = src; sqlerrcontext.callback = sql_inline_error_callback; sqlerrcontext.arg = (void *) &callback_arg; sqlerrcontext.previous = error_context_stack; error_context_stack = &sqlerrcontext; - /* Fetch the function body */ + /* If we have prosqlbody, pay attention to that not prosrc */ tmp = SysCacheGetAttr(PROCOID, func_tuple, - Anum_pg_proc_prosrc, + Anum_pg_proc_prosqlbody, &isNull); - if (isNull) + if (!isNull) { Node *n; List *querytree_list; - tmp = SysCacheGetAttr(PROCOID, func_tuple, Anum_pg_proc_prosqlbody, &isNull); - if (isNull) - elog(ERROR, "null prosrc and prosqlbody for function %u", funcid); - n = stringToNode(TextDatumGetCString(tmp)); if (IsA(n, List)) querytree_list = linitial_node(List, castNode(List, n)); @@ -4354,10 +4359,6 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid, } else { - src = TextDatumGetCString(tmp); - - callback_arg.prosrc = src; - /* * Set up to handle parameters while parsing the function body. We need a * dummy FuncExpr node containing the already-simplified arguments to pass @@ -4658,15 +4659,12 @@ sql_inline_error_callback(void *arg) int syntaxerrposition; /* If it's a syntax error, convert to internal syntax error report */ - if (callback_arg->prosrc) + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0) { - syntaxerrposition = geterrposition(); - if (syntaxerrposition > 0) - { - errposition(0); - internalerrposition(syntaxerrposition); - internalerrquery(callback_arg->prosrc); - } + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(callback_arg->prosrc); } errcontext("SQL function \"%s\" during inlining", callback_arg->proname); @@ -4778,6 +4776,7 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte) Oid func_oid; HeapTuple func_tuple; Form_pg_proc funcform; + char *src; Datum tmp; bool isNull; MemoryContext oldcxt; @@ -4886,31 +4885,36 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte) ALLOCSET_DEFAULT_SIZES); oldcxt = MemoryContextSwitchTo(mycxt); + /* Fetch the function body */ + tmp = SysCacheGetAttr(PROCOID, + func_tuple, + Anum_pg_proc_prosrc, + &isNull); + if (isNull) + elog(ERROR, "null prosrc for function %u", func_oid); + src = TextDatumGetCString(tmp); + /* * Setup error traceback support for ereport(). This is so that we can * finger the function that bad information came from. */ callback_arg.proname = NameStr(funcform->proname); - callback_arg.prosrc = NULL; + callback_arg.prosrc = src; sqlerrcontext.callback = sql_inline_error_callback; sqlerrcontext.arg = (void *) &callback_arg; sqlerrcontext.previous = error_context_stack; error_context_stack = &sqlerrcontext; - /* Fetch the function body */ + /* If we have prosqlbody, pay attention to that not prosrc */ tmp = SysCacheGetAttr(PROCOID, func_tuple, - Anum_pg_proc_prosrc, + Anum_pg_proc_prosqlbody, &isNull); - if (isNull) + if (!isNull) { Node *n; - tmp = SysCacheGetAttr(PROCOID, func_tuple, Anum_pg_proc_prosqlbody, &isNull); - if (isNull) - elog(ERROR, "null prosrc and prosqlbody for function %u", func_oid); - n = stringToNode(TextDatumGetCString(tmp)); if (IsA(n, List)) querytree_list = linitial_node(List, castNode(List, n)); @@ -4927,12 +4931,6 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte) } else { - char *src; - - src = TextDatumGetCString(tmp); - - callback_arg.prosrc = src; - /* * Set up to handle parameters while parsing the function body. We can * use the FuncExpr just created as the input for diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index d0ea489614..e1c65d8dfc 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -12247,7 +12247,7 @@ dumpFunc(Archive *fout, const FuncInfo *finfo) if (fout->remoteVersion >= 140000) appendPQExpBufferStr(query, - "CASE WHEN prosrc IS NULL AND lanname = 'sql' THEN pg_get_function_sqlbody(p.oid) END AS prosqlbody\n"); + "pg_get_function_sqlbody(p.oid) AS prosqlbody\n"); else appendPQExpBufferStr(query, "NULL AS prosqlbody\n"); diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c index fdc2a89085..400b683859 100644 --- a/src/bin/psql/describe.c +++ b/src/bin/psql/describe.c @@ -512,7 +512,7 @@ describeFunctions(const char *functypes, const char *func_pattern, gettext_noop("Language")); if (pset.sversion >= 140000) appendPQExpBuffer(&buf, - ",\n COALESCE(p.prosrc, pg_catalog.pg_get_function_sqlbody(p.oid)) as \"%s\"", + ",\n COALESCE(pg_catalog.pg_get_function_sqlbody(p.oid), p.prosrc) as \"%s\"", gettext_noop("Source code")); else appendPQExpBuffer(&buf, diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 8d58067d03..448d9898cb 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -112,7 +112,7 @@ CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,Proce Oid protrftypes[1] BKI_DEFAULT(_null_) BKI_LOOKUP(pg_type); /* procedure source text */ - text prosrc; + text prosrc BKI_FORCE_NOT_NULL; /* secondary procedure info (can be NULL) */ text probin BKI_DEFAULT(_null_); diff --git a/src/include/executor/functions.h b/src/include/executor/functions.h index dcb8e18437..b56ce26da0 100644 --- a/src/include/executor/functions.h +++ b/src/include/executor/functions.h @@ -17,9 +17,6 @@ #include "nodes/execnodes.h" #include "tcop/dest.h" -/* This struct is known only within executor/functions.c */ -typedef struct SQLFunctionParseInfo *SQLFunctionParseInfoPtr; - /* * Data structure needed by the parser callback hooks to resolve parameter * references during parsing of a SQL function's body. This is separate from @@ -35,6 +32,8 @@ typedef struct SQLFunctionParseInfo Oid collation; /* function's input collation, if known */ } SQLFunctionParseInfo; +typedef SQLFunctionParseInfo *SQLFunctionParseInfoPtr; + extern Datum fmgr_sql(PG_FUNCTION_ARGS); extern SQLFunctionParseInfoPtr prepare_sql_fn_parse_info(HeapTuple procedureTuple, diff --git a/src/test/regress/expected/create_function_3.out b/src/test/regress/expected/create_function_3.out index 477130e620..9f8733f75c 100644 --- a/src/test/regress/expected/create_function_3.out +++ b/src/test/regress/expected/create_function_3.out @@ -290,6 +290,12 @@ CREATE FUNCTION functest_S_xx(x anyarray) RETURNS anyelement LANGUAGE SQL RETURN x[1]; ERROR: SQL function with unquoted function body cannot have polymorphic arguments +-- check reporting of parse-analysis errors +CREATE FUNCTION functest_S_xx(x date) RETURNS boolean + LANGUAGE SQL + RETURN x > 1; +ERROR: operator does not exist: date > integer +HINT: No operator matches the given name and argument types. You might need to add explicit type casts. -- tricky parsing CREATE FUNCTION functest_S_15(x int) RETURNS boolean LANGUAGE SQL diff --git a/src/test/regress/sql/create_function_3.sql b/src/test/regress/sql/create_function_3.sql index 3575ecc693..9a03301621 100644 --- a/src/test/regress/sql/create_function_3.sql +++ b/src/test/regress/sql/create_function_3.sql @@ -191,6 +191,11 @@ CREATE FUNCTION functest_S_xx(x anyarray) RETURNS anyelement LANGUAGE SQL RETURN x[1]; +-- check reporting of parse-analysis errors +CREATE FUNCTION functest_S_xx(x date) RETURNS boolean + LANGUAGE SQL + RETURN x > 1; + -- tricky parsing CREATE FUNCTION functest_S_15(x int) RETURNS boolean LANGUAGE SQL diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c index dc317c83af..e7cb5c65e9 100644 --- a/src/backend/commands/functioncmds.c +++ b/src/backend/commands/functioncmds.c @@ -852,7 +852,9 @@ static void interpret_AS_clause(Oid languageOid, const char *languageName, char *funcname, List *as, Node *sql_body_in, List *parameterTypes, List *inParameterNames, - char **prosrc_str_p, char **probin_str_p, Node **sql_body_out) + char **prosrc_str_p, char **probin_str_p, + Node **sql_body_out, + const char *queryString) { if (!sql_body_in && !as) ereport(ERROR, @@ -929,6 +931,7 @@ interpret_AS_clause(Oid languageOid, const char *languageName, Query *q; ParseState *pstate = make_parsestate(NULL); + pstate->p_sourcetext = queryString; sql_fn_parser_setup(pstate, pinfo); q = transformStmt(pstate, stmt); if (q->commandType == CMD_UTILITY) @@ -947,6 +950,7 @@ interpret_AS_clause(Oid languageOid, const char *languageName, Query *q; ParseState *pstate = make_parsestate(NULL); + pstate->p_sourcetext = queryString; sql_fn_parser_setup(pstate, pinfo); q = transformStmt(pstate, sql_body_in); if (q->commandType == CMD_UTILITY) @@ -954,6 +958,7 @@ interpret_AS_clause(Oid languageOid, const char *languageName, errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("%s is not yet supported in unquoted SQL function body", GetCommandTagName(CreateCommandTag(q->utilityStmt)))); + free_parsestate(pstate); *sql_body_out = (Node *) q; } @@ -1220,7 +1225,8 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt) interpret_AS_clause(languageOid, language, funcname, as_clause, stmt->sql_body, parameterTypes_list, inParameterNames_list, - &prosrc_str, &probin_str, &prosqlbody); + &prosrc_str, &probin_str, &prosqlbody, + pstate->p_sourcetext); /* * Set default values for COST and ROWS depending on other parameters; diff --git a/src/test/regress/expected/create_function_3.out b/src/test/regress/expected/create_function_3.out index 9f8733f75c..e11a7a2dad 100644 --- a/src/test/regress/expected/create_function_3.out +++ b/src/test/regress/expected/create_function_3.out @@ -295,6 +295,8 @@ CREATE FUNCTION functest_S_xx(x date) RETURNS boolean LANGUAGE SQL RETURN x > 1; ERROR: operator does not exist: date > integer +LINE 3: RETURN x > 1; + ^ HINT: No operator matches the given name and argument types. You might need to add explicit type casts. -- tricky parsing CREATE FUNCTION functest_S_15(x int) RETURNS boolean diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c index e7cb5c65e9..73f9fe1f70 100644 --- a/src/backend/commands/functioncmds.c +++ b/src/backend/commands/functioncmds.c @@ -964,13 +964,13 @@ interpret_AS_clause(Oid languageOid, const char *languageName, } /* - * We must put something in prosrc. For the moment, just record an - * empty string. It might be useful to store the original text of the - * CREATE FUNCTION statement --- but to make actual use of that in - * error reports, we'd also have to adjust readfuncs.c to not throw - * away node location fields when reading prosqlbody. + * Use the text of the CREATE FUNCTION statement as prosrc. Note that + * this might be more text than strictly necessary, if CREATE FUNCTION + * is part of a multi-statement string; but trimming it would require + * adjusting the location fields in prosqlbody, which seems like way + * more trouble than the case is worth. */ - *prosrc_str_p = pstrdup(""); + *prosrc_str_p = pstrdup(queryString); /* But we definitely don't need probin. */ *probin_str_p = NULL;
pgsql-hackers by date: