Re: SQL/JSON revisited - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: SQL/JSON revisited |
Date | |
Msg-id | CA+HiwqGemKf_CjExy+sfNYv8eu53h_zX=F_ia63DNOvAJrGiGg@mail.gmail.com Whole thread Raw |
In response to | Re: SQL/JSON revisited (Andres Freund <andres@anarazel.de>) |
Responses |
Re: SQL/JSON revisited
Re: SQL/JSON revisited |
List | pgsql-hackers |
Thanks a lot for taking a look at this and sorry about the delay in response. On Tue, Feb 21, 2023 at 2:25 AM Andres Freund <andres@anarazel.de> wrote: > On 2023-02-20 16:35:52 +0900, Amit Langote wrote: > > Subject: [PATCH v4 03/10] SQL/JSON query functions > > +/* > > + * Evaluate a JSON error/empty behavior result. > > + */ > > +static Datum > > +ExecEvalJsonBehavior(JsonBehavior *behavior, bool *is_null) > > +{ > > + *is_null = false; > > + > > + switch (behavior->btype) > > + { > > + case JSON_BEHAVIOR_EMPTY_ARRAY: > > + return JsonbPGetDatum(JsonbMakeEmptyArray()); > > + > > + case JSON_BEHAVIOR_EMPTY_OBJECT: > > + return JsonbPGetDatum(JsonbMakeEmptyObject()); > > + > > + case JSON_BEHAVIOR_TRUE: > > + return BoolGetDatum(true); > > + > > + case JSON_BEHAVIOR_FALSE: > > + return BoolGetDatum(false); > > + > > + case JSON_BEHAVIOR_NULL: > > + case JSON_BEHAVIOR_UNKNOWN: > > + *is_null = true; > > + return (Datum) 0; > > + > > + case JSON_BEHAVIOR_DEFAULT: > > + /* Always handled in the caller. */ > > + Assert(false); > > + return (Datum) 0; > > + > > + default: > > + elog(ERROR, "unrecognized SQL/JSON behavior %d", behavior->btype); > > + return (Datum) 0; > > + } > > +} > > Does this actually need to be evaluated at expression eavluation time? > Couldn't we just emit the proper constants in execExpr.c? Yes, done that way in the updated patch. > > +/* ---------------------------------------------------------------- > > + * ExecEvalJson > > + * ---------------------------------------------------------------- > > + */ > > +void > > +ExecEvalJson(ExprState *state, ExprEvalStep *op, ExprContext *econtext) > > Pointless comment. Removed this function altogether in favor of merging the body with ExecEvalJsonExpr(), which does have a more sensible comment > > +{ > > + JsonExprState *jsestate = op->d.jsonexpr.jsestate; > > + JsonExprPreEvalState *pre_eval = &jsestate->pre_eval; > > + JsonExprPostEvalState *post_eval = &jsestate->post_eval; > > + JsonExpr *jexpr = jsestate->jsexpr; > > + Datum item; > > + Datum res = (Datum) 0; > > + JsonPath *path; > > + bool throwErrors = jexpr->on_error->btype == JSON_BEHAVIOR_ERROR; > > + > > + *op->resnull = true; /* until we get a result */ > > + *op->resvalue = (Datum) 0; > > + > > + item = pre_eval->formatted_expr.value; > > + path = DatumGetJsonPathP(pre_eval->pathspec.value); > > + > > + /* Reset JsonExprPostEvalState for this evaluation. */ > > + memset(post_eval, 0, sizeof(*post_eval)); > > + > > + res = ExecEvalJsonExpr(op, econtext, path, item, op->resnull, > > + !throwErrors ? &post_eval->error : NULL); > > + > > + *op->resvalue = res; > > +} > > I really don't like having both ExecEvalJson() and ExecEvalJsonExpr(). There's > really no way to know what which version does, just based on the name. Yes, having two functions is no longer necessary. > > --- a/src/backend/parser/gram.y > > +++ b/src/backend/parser/gram.y > > This stuff adds quite a bit of complexity to the parser. Do we realy need like > a dozen new rules here? >> > > +json_behavior_empty_array: > > + EMPTY_P ARRAY { $$ = makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); } > > + /* non-standard, for Oracle compatibility only */ > > + | EMPTY_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); } > > + ; > > Do we really want to add random oracle compat crud here? Hmm, sorry, but I haven't familiarized myself with the grammar side of things as much as I perhaps should have, so I am not sure whether a more simplified grammar would suffice for offering a standard-compliant functionality. Maybe we could take out the oracle-compatibility bit, but I'd appreciate it if someone who has been involved with SQL/JSON from the beginning can comment on the above 2 points. > > +/* > > + * Evaluate a JSON path variable caching computed value. > > + */ > > +int > > +EvalJsonPathVar(void *cxt, char *varName, int varNameLen, > > + JsonbValue *val, JsonbValue *baseObject) > > Missing static? Fixed. > > +{ > > + JsonPathVariableEvalContext *var = NULL; > > + List *vars = cxt; > > + ListCell *lc; > > + int id = 1; > > + > > + if (!varName) > > + return list_length(vars); > > + > > + foreach(lc, vars) > > + { > > + var = lfirst(lc); > > + > > + if (!strncmp(var->name, varName, varNameLen)) > > + break; > > + > > + var = NULL; > > + id++; > > + } > > + > > + if (!var) > > + return -1; > > + > > + /* > > + * When belonging to a JsonExpr, path variables are computed with the > > + * JsonExpr's ExprState (var->estate is NULL), so don't need to be computed > > + * here. In some other cases, such as when the path variables belonging > > + * to a JsonTable instead, those variables must be evaluated on their own, > > + * without the enclosing JsonExpr itself needing to be evaluated, so must > > + * be handled here. > > + */ > > + if (var->estate && !var->evaluated) > > + { > > + Assert(var->econtext != NULL); > > + var->value = ExecEvalExpr(var->estate, var->econtext, &var->isnull); > > + var->evaluated = true; > > Uh, so this continues to do recursive expression evaluation, as > ExecEvalJsonExpr()->JsonPathQuery()->executeJsonPath(EvalJsonPathVar) > > I'm getting grumpy here. This is wrong, has been pointed out many times. The > only thing that changes is that the point of recursion is moved around. Actually, these JSON path vars, along with other sub-expressions of JsonExpr, *are* computed non-recursively as ExprEvalSteps of the JsonExprState, at least in the cases where the vars are to be computed as part of evaluating the JsonExpr itself. So, the code path you've shown above perhaps as a hypothetical doesn't really exist, though there *is* an instance where these path vars are computed *outside* the context of evaluating the parent JsonExpr, such as in JsonTableResetContextItem(). Maybe there's a cleaner way of doing that though... > > + > > +/* > > + * ExecEvalExprSafe > > + * > > + * Like ExecEvalExpr(), though this allows the caller to pass an > > + * ErrorSaveContext to declare its intenion to catch any errors that occur when > > + * executing the expression, such as when calling type input functions that may > > + * be present in it. > > + */ > > +static inline Datum > > +ExecEvalExprSafe(ExprState *state, > > + ExprContext *econtext, > > + bool *isNull, > > + Node *escontext, > > + bool *error) > > Afaict there's no caller of this? Oops, removed. This was used in a previous version of the patch that still had nested ExprStates inside JsonExprState. > > +/* > > + * ExecInitExprWithCaseValue > > + * > > + * This is the same as ExecInitExpr, except the caller passes the Datum and > > + * bool pointers that it would like the ExprState.innermost_caseval > > + * and ExprState.innermost_casenull, respectively, to be set to. That way, > > + * it can pass an input value to evaluate the expression via a CaseTestExpr. > > + */ > > +ExprState * > > +ExecInitExprWithCaseValue(Expr *node, PlanState *parent, > > + Datum *caseval, bool *casenull) > > +{ > > + ExprState *state; > > + ExprEvalStep scratch = {0}; > > + > > + /* Special case: NULL expression produces a NULL ExprState pointer */ > > + if (node == NULL) > > + return NULL; > > + > > + /* Initialize ExprState with empty step list */ > > + state = makeNode(ExprState); > > + state->expr = node; > > + state->parent = parent; > > + state->ext_params = NULL; > > + state->innermost_caseval = caseval; > > + state->innermost_casenull = casenull; > > + > > + /* Insert EEOP_*_FETCHSOME steps as needed */ > > + ExecInitExprSlots(state, (Node *) node); > > + > > + /* Compile the expression proper */ > > + ExecInitExprRec(node, state, &state->resvalue, &state->resnull); > > + > > + /* Finally, append a DONE step */ > > + scratch.opcode = EEOP_DONE; > > + ExprEvalPushStep(state, &scratch); > > + > > + ExecReadyExpr(state); > > + > > + return state; > > > +struct JsonTableJoinState > > +{ > > + union > > + { > > + struct > > + { > > + JsonTableJoinState *left; > > + JsonTableJoinState *right; > > + bool advanceRight; > > + } join; > > + JsonTableScanState scan; > > + } u; > > + bool is_join; > > +}; > > A join state that unions the join member with a scan, and has a is_join field? Yeah, I agree that's not the best form for what it is. I've replaced that with the following: +/* Structures for JSON_TABLE execution */ + +typedef enum JsonTablePlanStateType +{ + JSON_TABLE_SCAN_STATE = 0, + JSON_TABLE_JOIN_STATE +} JsonTablePlanStateType; + +typedef struct JsonTablePlanState +{ + JsonTablePlanStateType type; + + struct JsonTablePlanState *parent; + struct JsonTablePlanState *nested; +} JsonTablePlanState; + +typedef struct JsonTableScanState +{ + JsonTablePlanState plan; + + MemoryContext mcxt; + JsonPath *path; + List *args; + JsonValueList found; + JsonValueListIterator iter; + Datum current; + int ordinal; + bool currentIsNull; + bool outerJoin; + bool errorOnError; + bool advanceNested; + bool reset; +} JsonTableScanState; + +typedef struct JsonTableJoinState +{ + JsonTablePlanState plan; + + JsonTablePlanState *left; + JsonTablePlanState *right; + bool advanceRight; +} JsonTableJoinState; I considered using NodeTag but decided not to, because this stuff is local to jsonpath_exec.c. > > + * JsonTableInitOpaque > > + * Fill in TableFuncScanState->opaque for JsonTable processor > > + */ > > +static void > > +JsonTableInitOpaque(TableFuncScanState *state, int natts) > > +{ > > + JsonTableContext *cxt; > > + PlanState *ps = &state->ss.ps; > > + TableFuncScan *tfs = castNode(TableFuncScan, ps->plan); > > + TableFunc *tf = tfs->tablefunc; > > + JsonExpr *ci = castNode(JsonExpr, tf->docexpr); > > + JsonTableParent *root = castNode(JsonTableParent, tf->plan); > > + List *args = NIL; > > + ListCell *lc; > > + int i; > > + > > + cxt = palloc0(sizeof(JsonTableContext)); > > + cxt->magic = JSON_TABLE_CONTEXT_MAGIC; > > + > > + if (ci->passing_values) > > + { > > + ListCell *exprlc; > > + ListCell *namelc; > > + > > + forboth(exprlc, ci->passing_values, > > + namelc, ci->passing_names) > > + { > > + Expr *expr = (Expr *) lfirst(exprlc); > > + String *name = lfirst_node(String, namelc); > > + JsonPathVariableEvalContext *var = palloc(sizeof(*var)); > > + > > + var->name = pstrdup(name->sval); > > + var->typid = exprType((Node *) expr); > > + var->typmod = exprTypmod((Node *) expr); > > + var->estate = ExecInitExpr(expr, ps); > > + var->econtext = ps->ps_ExprContext; > > + var->mcxt = CurrentMemoryContext; > > + var->evaluated = false; > > + var->value = (Datum) 0; > > + var->isnull = true; > > + > > + args = lappend(args, var); > > + } > > + } > > + > > + cxt->colexprs = palloc(sizeof(*cxt->colexprs) * > > + list_length(tf->colvalexprs)); > > + > > + JsonTableInitScanState(cxt, &cxt->root, root, NULL, args, > > + CurrentMemoryContext); > > + > > + i = 0; > > + > > + foreach(lc, tf->colvalexprs) > > + { > > + Expr *expr = lfirst(lc); > > + > > + cxt->colexprs[i].expr = > > + ExecInitExprWithCaseValue(expr, ps, > > + &cxt->colexprs[i].scan->current, > > + &cxt->colexprs[i].scan->currentIsNull); > > + > > + i++; > > + } > > + > > + state->opaque = cxt; > > +} > > Why don't you just emit the proper expression directly, insted of the > CaseTestExpr stuff, that you then separately evaluate? I suppose you mean emitting the expression that supplies the value given by scan->current and scan->currentIsNull into the same ExprState that holds the steps for a given colvalexpr. If so, I don't really see a way of doing that given the current model of JSON_TABLE execution. The former is computed as part of TableFuncRoutine.FetchRow(scan), which sets scan.current (and currentIsNull) and the letter is computer as part of TableFuncRoutine.GetValue(scan, colnum). > Evaluating N expressions for a json table isn't a good approach, both memory > and CPU efficiency wise. Are you referring to JsonTableInitOpaque() initializing all these sub-expressions of JsonTableParent, especially colvalexprs, using N *independent* ExprStates? That could perhaps be made to work by making JsonTableParent be an expression recognized by execExpr.c, so that a single ExprState can store the steps for all its sub-expressions, much like JsonExpr is. I'll give that a try, though I wonder if the semantics of making this work in a single ExecEvalExpr() call will mismatch that of the current way, because different sub-expressions are currently evaluated under different APIs of TableFuncRoutine. In the meantime, I'm attaching a version of the patchset with a few things fixed as mentioned above. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Attachment
- v6-0009-Claim-SQL-standard-compliance-for-SQL-JSON-featur.patch
- v6-0007-PLAN-clauses-for-JSON_TABLE.patch
- v6-0005-RETURNING-clause-for-JSON-and-JSON_SCALAR.patch
- v6-0008-Documentation-for-SQL-JSON-features.patch
- v6-0006-JSON_TABLE.patch
- v6-0004-SQL-JSON-functions.patch
- v6-0003-SQL-JSON-query-functions.patch
- v6-0001-SQL-JSON-constructors.patch
- v6-0002-IS-JSON-predicate.patch
pgsql-hackers by date: