Re: Schema variables - new implementation for Postgres 15 - Mailing list pgsql-hackers
From | Julien Rouhaud |
---|---|
Subject | Re: Schema variables - new implementation for Postgres 15 |
Date | |
Msg-id | 20221017031743.ayyfgppcipoxf2ia@jrouhaud Whole thread Raw |
In response to | Re: Schema variables - new implementation for Postgres 15 (Pavel Stehule <pavel.stehule@gmail.com>) |
Responses |
Re: Schema variables - new implementation for Postgres 15
Re: Schema variables - new implementation for Postgres 15 |
List | pgsql-hackers |
Hi, On Thu, Oct 13, 2022 at 07:41:32AM +0200, Pavel Stehule wrote: > > > I fixed the commit message of 0001 patch. Fixed shadowed variables too. Thanks! > > > > There is a partially open issue, where I and Julien are not sure about a > > solution, and we would like to ask for the community's opinion. I'll send > > this query in separate mail. > > > > rebased with simplified code related to usage of pfree function If anyone is curious the discussion happend at [1]. I looked at the patchset, this time focusing on the LET command. Here at the comments I have for now: - gram.y @@ -11918,6 +11920,7 @@ ExplainableStmt: | CreateMatViewStmt | RefreshMatViewStmt | ExecuteStmt /* by default all are $$=$1 */ + | LetStmt ; (and other similar places) the comment should be kept to the last statement Also, having LetStmt as an ExplainableStmt means it's allowed in a CTE: cte_list: common_table_expr { $$ = list_make1($1); } | cte_list ',' common_table_expr { $$ = lappend($1, $3); } ; common_table_expr: name opt_name_list AS opt_materialized '(' PreparableStmt ')' opt_search_clause opt_cycle_clause And doing so hits this assert in transformWithClause: if (!IsA(cte->ctequery, SelectStmt)) { /* must be a data-modifying statement */ Assert(IsA(cte->ctequery, InsertStmt) || IsA(cte->ctequery, UpdateStmt) || IsA(cte->ctequery, DeleteStmt)); pstate->p_hasModifyingCTE = true; } and I'm assuming it would also fail on this in transformLetStmt: + /* There can't be any outer WITH to worry about */ + Assert(pstate->p_ctenamespace == NIL); I guess it makes sense to be able to explain a LetStmt (or using it in a prepared statement), so it should be properly handled in transformSelectStmt. Also, I don't see any test for a prepared LET statement, this should also be covered. - transformLetStmt: + varid = IdentifyVariable(names, &attrname, ¬_unique); It would be nice to have a comment saying that the lock is acquired here + /* The grammar should have produced a SELECT */ + if (!IsA(selectQuery, Query) || + selectQuery->commandType != CMD_SELECT) + elog(ERROR, "unexpected non-SELECT command in LET command"); I'm wondering if this should be an Assert instead, as the grammar shouldn't produce anything else no matter what how hard a user try. + /* don't allow multicolumn result */ + if (list_length(exprList) != 1) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("expression is not scalar value"), + parser_errposition(pstate, + exprLocation((Node *) exprList)))); This isn't covered by any regression test and it probably should. It can be reached with something like LET myvar = (null::pg_class).*; The error message could also use a bit of improvement. I see that a_expr allows a select statement in parens, but this leads to a sublink which already has all the required protection to gurantee a single column, and a single row at most during execution. This one returns for non-scalar case: subquery must return only one column Maybe use something similar for it, like "expression must return only one column"? Similarly the error message in svariableStartupReceiver could be made more consistent with the related errors: + if (++outcols > 1) + elog(ERROR, "svariable DestReceiver can take only one attribute"); While on svariableReceiver, I see that the current code assumes that the caller did everything right. That's the case right now, but it should still be made more robust in case future code (or extensions) is added. I'm thinking: - svariableState.rows. Currently not really used, should check that one and only one row is received in svariableReceiveSlot and svariableShutdownReceiver (if no row is received the variable won't be reset which should probably always happen once you setup an svariableReceiver) - svariableState.typid, typmod and typlen should be double checked with the given varid in svariableStartupReceiver. - svariableState.varid should be initialized with InvalidOid to avoid undefined behavior is caller forgets to set it. I'm also wondering if SetVariableDestReceiverParams() should have an assert like LockHeldByMe() for the given varid, and maybe an assert that the varid is a session variable, to avoid running a possibly expensive execution that will fail when receiving the slot. I think the function would be better named SetVariableDestReceiverVarid() or something like that. +void +ExecuteLetStmt(ParseState *pstate, + LetStmt *stmt, + ParamListInfo params, + QueryEnvironment *queryEnv, + QueryCompletion *qc) +{ + [...] + /* run the plan to completion */ + ExecutorRun(queryDesc, ForwardScanDirection, 2L, true); Why 2 rows? I'm assuming it's an attempt to detect queries that returns more than 1 row, but it should be documented. Note that as mentioned above the dest receiver currently doesn't check it, so this definitely needs to be fixed. - IdentifyVariable: *attrname can be set even is no variable is identified. I guess that's ok as it avoids useless code, but it should probably be documented in the function header. Also, the API doesn't look ideal. AFAICS the only reason this function doesn't error out in case of ambiguous name is that transformColumnRef may check if a given name shadows a variable when session_variables_ambiguity_warning is set. But since IdentifyVariable returns InvalidOid if the given list of identifiers is ambiguous, it seems that the shadow detection can fail to detect a shadowed reference if multiple variable would shadow the name: # CREATE TYPE ab AS (a integer, b integer); CREATE TYPE # CREATE VARIABLE v_ab AS ab; CREATE VARIABLE # CREATE TABLE v_ab (a integer, b integer); CREATE TABLE # SET session_variables_ambiguity_warning = 1; SET # sELECT v_ab.a FROM v_ab; WARNING: 42702: session variable "v_ab.a" is shadowed LINE 1: select v_ab.a from v_ab; ^ DETAIL: Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name. a --- (0 rows) # CREATE SCHEMA v_ab; CREATE SCHEMA # CREATE VARIABLE v_ab.a AS integer; CREATE VARIABLE # SELECT v_ab.a FROM v_ab; a --- (0 rows) Note that a bit later in transformColumnRef(), not_unique is checked only if the returned varid is valid, which isn't correct as InvalidOid is currently returned if not_unique is set. I think that the error should be raised in IdentifyVariable rather than having every caller check it. I'm not sure how to perfectly handle the session_variables_ambiguity_warning though. Maybe make not_unique optional, and error out if not_unique is null. If not null, set it as necessary and return one of the oid. The only use would be for shadowing detection, and in that case it won't be possible to check if a warning can be avoided as it would be if no amgibuity is found, but that's probably ok. Or maybe instead LookupVariable should have an extra argument to only match variable with a composite type if caller asks to. This would avoid scenarios like: CREATE VARIABLE myvar AS int; SELECT myvar.blabla; ERROR: 42809: type integer is not composite Is that really ok to match a variable here rather than complaining about a missing FROM-clause? + indirection_start = list_length(names) - (attrname ? 1 : 0); + indirection = list_copy_tail(stmt->target, indirection_start); + [...] + if (indirection != NULL) + { + bool targetIsArray; + char *targetName; + + targetName = get_session_variable_name(varid); + targetIsArray = OidIsValid(get_element_type(typid)); + + pstate->p_hasSessionVariables = true; + + coerced_expr = (Expr *) + transformAssignmentIndirection(pstate, + (Node *) param, + targetName, + targetIsArray, + typid, + typmod, + InvalidOid, + indirection, + list_head(indirection), + (Node *) expr, + COERCION_PLPGSQL, + stmt->location); + } I'm not sure why you use this approach rather than just having something like "ListCell *indirection_head", set it to a non-NULL value when needed, and use that (with names) instead. Note that it's also not correct to compare a List to NULL, use NIL instead. - expr_kind_allows_session_variables Even if that's a bit annoying, I think it's better to explicitly put all values there rather than having a default clause. For instance, EXPR_KIND_CYCLE_MARK is currently allowing session variables, which doesn't look ok. It's probably just an error from when the patchset was rebased, but this probably wouldn't happen if you get an error for an unmatched value if you add a new expr kind (which doesn't happen that often). [1] https://www.postgresql.org/message-id/CAFj8pRB2+pVBFsidS-AzhHdZid40OTUspWfXS0vgahHmaWosZQ@mail.gmail.com
pgsql-hackers by date: