Re: remaining sql/json patches - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: remaining sql/json patches |
Date | |
Msg-id | CA+HiwqGOKvjSD1soND26-eRqJyGsUWPc5mG2L4D7x85aOJsJ0Q@mail.gmail.com Whole thread Raw |
In response to | Re: remaining sql/json patches (jian he <jian.universality@gmail.com>) |
Responses |
Re: remaining sql/json patches
|
List | pgsql-hackers |
Hi Jian, Thanks for your time on this. On Mon, Apr 1, 2024 at 9:00 AM jian he <jian.universality@gmail.com> wrote: > SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 text PATH > '$' error on empty error on error) EMPTY ON ERROR); > should i expect it return one row? > is there any example to make it return one row from top level "EMPTY ON ERROR"? I think that's expected. You get 0 rows instead of a single row with one column containing an empty array, because the NULL returned by the error-handling part of JSON_TABLE's top-level path is not returned directly to the user, but instead passed as an input document for the TableFunc. I think it suffices to add a note to the documentation of table-level (that is, not column-level) ON ERROR clause that EMPTY means an empty "table", not empty array, which is what you get with JSON_QUERY(). > + { > + JsonTablePlan *scan = (JsonTablePlan *) plan; > + > + JsonTableInitPathScan(cxt, planstate, args, mcxt); > + > + planstate->nested = scan->child ? > + JsonTableInitPlan(cxt, scan->child, planstate, args, mcxt) : NULL; > + } > first line seems strange, do we just simply change from "plan" to "scan"? Mostly to complement the "join" variable in the other block. Anyway, I've reworked this to make JsonTablePlan an abstract struct and make JsonTablePathScan and JsonTableSiblingJoin "inherit" from it. > + case JTC_REGULAR: > + typenameTypeIdAndMod(pstate, rawc->typeName, &typid, &typmod); > + > + /* > + * Use implicit FORMAT JSON for composite types (arrays and > + * records) or if a non-default WRAPPER / QUOTES behavior is > + * specified. > + */ > + if (typeIsComposite(typid) || > + rawc->quotes != JS_QUOTES_UNSPEC || > + rawc->wrapper != JSW_UNSPEC) > + rawc->coltype = JTC_FORMATTED; > per previous discussion, should we refactor the above comment? Done. Instead of saying "use implicit FORMAT JSON" I've reworked the comment to mention instead that we do this so that the column uses JSON_QUERY() as implementation for these cases. > +/* Recursively set 'reset' flag of planstate and its child nodes */ > +static void > +JsonTablePlanReset(JsonTablePlanState *planstate) > +{ > + if (IsA(planstate->plan, JsonTableSiblingJoin)) > + { > + JsonTablePlanReset(planstate->left); > + JsonTablePlanReset(planstate->right); > + planstate->advanceRight = false; > + } > + else > + { > + planstate->reset = true; > + planstate->advanceNested = false; > + > + if (planstate->nested) > + JsonTablePlanReset(planstate->nested); > + } > per coverage, the first part of the IF branch never executed. > i also found out that JsonTablePlanReset is quite similar to JsonTableRescan, > i don't fully understand these two functions though. Working on improving the documentation of the recursive algorithm, though I want to focus on finishing 0001 first. > SELECT * FROM JSON_TABLE(jsonb'{"a": {"z":[1111]}, "b": 1,"c": 2, "d": > 91}', '$' COLUMNS ( > c int path '$.c', > d int path '$.d', > id1 for ordinality, > NESTED PATH '$.a.z[*]' columns (z int path '$', id for ordinality) > )); > doc seems to say that duplicated ordinality columns in different nest > levels are not allowed? Both the documentation and the code in JsonTableGetValue() to calculate a FOR ORDINALITY column were wrong. A nested path's columns should be able to have its own ordinal counter that runs separately from the other paths, including the parent path, all the way up to the root path. I've fixed both. Added a test case too. > "currentRow" naming seems misleading, generally, when we think of "row", > we think of several (not one) datums, or several columns. > but here, we only have one datum. > I don't have good optional naming though. Yeah, I can see the confusion. I've created a new struct called JsonTablePlanRowSource and different places now use a variable named just 'current' to refer to the currently active row source. It's hopefully clear from the context that the datum containing the JSON object is acting as a source of values for evaluating column paths. > + case JTC_FORMATTED: > + case JTC_EXISTS: > + { > + Node *je; > + CaseTestExpr *param = makeNode(CaseTestExpr); > + > + param->collation = InvalidOid; > + param->typeId = contextItemTypid; > + param->typeMod = -1; > + > + je = transformJsonTableColumn(rawc, (Node *) param, > + NIL, errorOnError); > + > + colexpr = transformExpr(pstate, je, EXPR_KIND_FROM_FUNCTION); > + assign_expr_collations(pstate, colexpr); > + > + typid = exprType(colexpr); > + typmod = exprTypmod(colexpr); > + break; > + } > + > + default: > + elog(ERROR, "unknown JSON_TABLE column type: %d", rawc->coltype); > + break; > + } > + > + tf->coltypes = lappend_oid(tf->coltypes, typid); > + tf->coltypmods = lappend_int(tf->coltypmods, typmod); > + tf->colcollations = lappend_oid(tf->colcollations, get_typcollation(typid)); > + tf->colvalexprs = lappend(tf->colvalexprs, colexpr); > > why not use exprCollation(colexpr) for tf->colcollations, similar to > exprType(colexpr)? Yes, maybe. On Tue, Apr 2, 2024 at 3:54 PM jian he <jian.universality@gmail.com> wrote: > +/* > + * Recursively transform child JSON_TABLE plan. > + * > + * Default plan is transformed into a cross/union join of its nested columns. > + * Simple and outer/inner plans are transformed into a JsonTablePlan by > + * finding and transforming corresponding nested column. > + * Sibling plans are recursively transformed into a JsonTableSibling. > + */ > +static Node * > +transformJsonTableChildPlan(JsonTableParseContext *cxt, > + List *columns) > this comment is not the same as the function intention for now. > maybe we need to refactor it. Fixed. > /* > * Each call to fetch a new set of rows - of which there may be very many > * if XMLTABLE is being used in a lateral join - will allocate a possibly > * substantial amount of memory, so we cannot use the per-query context > * here. perTableCxt now serves the same function as "argcontext" does in > * FunctionScan - a place to store per-one-call (i.e. one result table) > * lifetime data (as opposed to per-query or per-result-tuple). > */ > MemoryContextSwitchTo(tstate->perTableCxt); > > maybe we can replace "XMLTABLE" to "XMLTABLE or JSON_TABLE"? Good catch, done. > > /* Transform and coerce the PASSING arguments to to jsonb. */ > there should be only one "to"? Will need to fix that separately. > ----------------------------------------------------------------------------------------------------------------------- > json_table_column clause doesn't have a passing clause. > we can only have one passing clause in json_table. > but during JsonTableInitPathScan, for each output columns associated > JsonTablePlanState > we already initialized the PASSING arguments via `planstate->args = args;` > also transformJsonTableColumn already has a passingArgs argument. > technically we can use the jsonpath variable for every output column > regardless of whether it's nested or not. > > JsonTable already has the "passing" clause, > we just need to pass it to function transformJsonTableColumns and it's callees. > based on that, I implemented it. seems quite straightforward. > I also wrote several contrived, slightly complicated tests. > It seems to work just fine. > > simple explanation: > previously the following sql will fail, error message is that "could > not find jsonpath variable %s". > now it will work. > > SELECT sub.* FROM > JSON_TABLE(jsonb '{"a":{"za":[{"z1": [11,2222]},{"z21": [22, > 234,2345]}]},"c": 3}', > '$' PASSING 22 AS x, 234 AS y > COLUMNS( > xx int path '$.c', > NESTED PATH '$.a.za[1]' as n1 columns > (NESTED PATH '$.z21[*]' as n2 > COLUMNS (z21 int path '$?(@ == $"x" || @ == $"y" )' default 0 on empty)), > NESTED PATH '$.a.za[0]' as n4 columns > (NESTED PATH '$.z1[*]' as n3 > COLUMNS (z1 int path '$?(@ > $"y" + 1988)' default 0 on empty))) > )sub; Thanks for the patch. Yeah, not allowing column paths (including nested ones) to use top-level PASSING args seems odd, so I wanted to fix it too. Please let me know if you have further comments on 0001. I'd like to get that in before spending more energy on 0002. -- Thanks, Amit Langote
Attachment
pgsql-hackers by date: