Re: SQL:2023 JSON simplified accessor support - Mailing list pgsql-hackers
From | Alexandra Wang |
---|---|
Subject | Re: SQL:2023 JSON simplified accessor support |
Date | |
Msg-id | CAK98qZ35eF+9MZuqR4HNrmebyBFdNiNLiLZHpQPB7S7OUk-DDQ@mail.gmail.com Whole thread Raw |
In response to | Re: SQL:2023 JSON simplified accessor support (jian he <jian.universality@gmail.com>) |
List | pgsql-hackers |
Hi Jian,
I’ve attached v14, which includes only indentation and comment changes
from v13.
from v13.
On Mon, Aug 25, 2025 at 1:10 AM jian he <jian.universality@gmail.com> wrote:
On Sat, Aug 23, 2025 at 3:34 AM Alexandra Wang
<alexandra.wang.oss@gmail.com> wrote:
>
> I don’t understand the question. In the case of an unsupported Node
> type (not an Indices in patch 0001 or 0002), we break out of the loop
> to stop transforming the remaining subscripts. So there won’t be any
> ‘not contiguous’ indirection elements.
>
Sorry, my last message is wrong.
this time, I applied v13-0001 to v13-0005.
and I found some minor issues.....
No problem. Thanks for reviewing.
On Mon, Aug 25, 2025 at 1:10 AM jian he <jian.universality@gmail.com> wrote:
v13-0002-Allow-Generic-Type-Subscripting-to-Accept-Dot-No.patch
+ /*
+ * Error out, if datatype failed to consume any indirection elements.
+ */
+ if (list_length(*indirection) == indirection_length)
+ {
+ Node *ind = linitial(*indirection);
+
+ if (noError)
+ return NULL;
+
+ if (IsA(ind, String))
+ ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("type %s does not support dot notation",
+ format_type_be(containerType)),
+ parser_errposition(pstate, exprLocation(containerBase))));
+ else if (IsA(ind, A_Indices))
+ ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("type %s does not support array subscripting",
+ format_type_be(containerType)),
+ parser_errposition(pstate, exprLocation(containerBase))));
+ else
+ elog(ERROR, "invalid indirection operation: %d", nodeTag(ind));
+ }
these error message still not reached after I apply
v13-0005-Implement-read-only-dot-notation-for-jsonb.patch
maybe we can simply do
+ if (noError)
+ return NULL;
+
+ ereport(ERROR,
+ errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("type %s does not support subscripting",
+ format_type_be(containerType)),
+ parser_errposition(pstate, exprLocation(containerBase)));
?
The regression tests do not currently trigger these ERROR cases
because Assignment for dot-notation has not yet been implemented. At
present, the noError argument of transformContainerSubscripts() is set
to true for SELECT and false for UPDATE. As a result, the
ereport(ERROR) calls are never reached in SELECT queries. This patch
series also does not aim to implement dot-notation for UPDATE
statements that assign to a jsonb field. We have therefore made only
minimal adjustments in transformAssignmentIndirection(). Since the
assignment code path is separate, it likewise does not reach the
ereport(ERROR) cases with in-core data types. Once dot-notation
support for Assignment is added, those ERROR messages will become
relevant.
because Assignment for dot-notation has not yet been implemented. At
present, the noError argument of transformContainerSubscripts() is set
to true for SELECT and false for UPDATE. As a result, the
ereport(ERROR) calls are never reached in SELECT queries. This patch
series also does not aim to implement dot-notation for UPDATE
statements that assign to a jsonb field. We have therefore made only
minimal adjustments in transformAssignmentIndirection(). Since the
assignment code path is separate, it likewise does not reach the
ereport(ERROR) cases with in-core data types. Once dot-notation
support for Assignment is added, those ERROR messages will become
relevant.
That said, the ERROR messages are not "dead code." They can still be
triggered when working with custom data types. For example, you can
define a custom array-like type by copying the in-core array type and
slightly modifying its *_subscript_transform() function. For now, you
can simply update array_subscript_transform() in-place with the
following diff:
triggered when working with custom data types. For example, you can
define a custom array-like type by copying the in-core array type and
slightly modifying its *_subscript_transform() function. For now, you
can simply update array_subscript_transform() in-place with the
following diff:
--- a/src/backend/utils/adt/arraysubs.c
+++ b/src/backend/utils/adt/arraysubs.c
@@ -82,6 +82,8 @@ array_subscript_transform(SubscriptingRef *sbsref,
ai = lfirst_node(A_Indices, idx);
+ if (isSlice)
+ break;
if (isSlice)
{
if (ai->lidx)
+++ b/src/backend/utils/adt/arraysubs.c
@@ -82,6 +82,8 @@ array_subscript_transform(SubscriptingRef *sbsref,
ai = lfirst_node(A_Indices, idx);
+ if (isSlice)
+ break;
if (isSlice)
{
if (ai->lidx)
This way, we have a *_subscript_transform function that you can
register for a custom array-like data type that does not handle
slicing. Then you can hit the error message like this:
register for a custom array-like data type that does not handle
slicing. Then you can hit the error message like this:
test=# update t set arr[1:] = 1;
ERROR: 42804: type integer[] does not support array subscripting
LINE 1: update t set arr[1:] = 1;
^
LOCATION: transformContainerSubscripts, parse_node.c:345
ERROR: 42804: type integer[] does not support array subscripting
LINE 1: update t set arr[1:] = 1;
^
LOCATION: transformContainerSubscripts, parse_node.c:345
test=# \d t
Table "public.t"
Column | Type | Collation | Nullable | Default
--------+-----------+-----------+----------+---------
arr | integer[] | | |
Table "public.t"
Column | Type | Collation | Nullable | Default
--------+-----------+-----------+----------+---------
arr | integer[] | | |
particular (since one might expect something about slicing), but it
still demonstrates how this code path is relevant. In the meantime, I
don’t think it’s worth adding a new data type to exercise this error
message, given that we are likely to make changes to the assignment
code path for in-core types.
On Mon, Aug 25, 2025 at 1:10 AM jian he <jian.universality@gmail.com> wrote:
for V13-0004.
in master we have
if (subExpr == NULL)
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("jsonb subscript must have text type"),
parser_errposition(pstate, exprLocation(subExpr))));
but this part is removed from v13-0004-Extract-coerce_jsonpath_subscript
?
I removed this one because you asked me to (see our past conversation
for v11).
On Wed, Jul 9, 2025 at 1:01 AM Alexandra Wang <alexandra.wang.oss@gmail.com> wrote:
On Tue, Jun 24, 2025 at 10:57 PM jian he <jian.universality@gmail.com> wrote:v11-0004-Extract-coerce_jsonpath_subscript.patch
+ /*
+ * We known from can_coerce_type that coercion will succeed, so
+ * coerce_type could be used. Note the implicit coercion context, which is
+ * required to handle subscripts of different types, similar to overloaded
+ * functions.
+ */
+ subExpr = coerce_type(pstate,
+ subExpr, subExprType,
+ targetType, -1,
+ COERCION_IMPLICIT,
+ COERCE_IMPLICIT_CAST,
+ -1);
+ if (subExpr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("jsonb subscript must have text type"),
the targetType can be "integer", then the error message
errmsg("jsonb subscript must have text type") would be wrong?
Also this error handling is not necessary.
since we can_coerce_type already tell us that coercion will succeed.
Also, do we really put v11-0004 as a separate patch?Thanks for pointing this out! I’ve removed the unnecessary error
handling as you suggested. I kept v12-0004 as a separate patch to make
the review easier, but I’ll leave it up to the committer to decide
whether to squash it with the other commits.
one option so we don’t keep going back and forth.
On Mon, Aug 25, 2025 at 1:10 AM jian he <jian.universality@gmail.com> wrote:
coerce_jsonpath_subscript_to_int4_or_text
maybe we can just rename to
coerce_jsonpath_subscript,
then add some comments explaining why currently only support result node coerce
to text or int4 data type.
What is the reason for renaming this function to a less explicit name?
It was previously called coerce_jsonpath_subscript() because we passed
INT4OID as an argument to specify the target type. I followed your
earlier review comment (for v12) to remove that argument and hard-code
INT4OID within the function. Given that change, I think it’s clearer
to keep a more explicit name that describes what the function
achieves.
It was previously called coerce_jsonpath_subscript() because we passed
INT4OID as an argument to specify the target type. I followed your
earlier review comment (for v12) to remove that argument and hard-code
INT4OID within the function. Given that change, I think it’s clearer
to keep a more explicit name that describes what the function
achieves.
On Mon, Aug 25, 2025 at 1:10 AM jian he <jian.universality@gmail.com> wrote:
for v13-0005-Implement-read-only-dot-notation-for-jsonb.patch
+ i = 0;
+ foreach(lc, sbsref->refupperindexpr) {
+ Expr *e = (Expr *) lfirst(lc);
+
+ /* When slicing, individual subscript bounds can be omitted */
+ if (!e) {
+ sbsrefstate->upperprovided[i] = false;
+ sbsrefstate->upperindexnull[i] = true;
+ } else {
+ sbsrefstate->upperprovided[i] = true;
+ /* Each subscript is evaluated into appropriate array entry */
+ ExecInitExprRec(e, state,
+ &sbsrefstate->upperindex[i],
+ &sbsrefstate->upperindexnull[i]);
+ }
+ i++;
}
curly braces should be put into the next new line.
Fixed. Thank you!
On Mon, Aug 25, 2025 at 1:10 AM jian he <jian.universality@gmail.com> wrote:
there are two
+ if (!sbsref->refjsonbpath)
+{
+}
maybe we can put these two together.
reduce readability in my opinion.
On Mon, Aug 25, 2025 at 1:10 AM jian he <jian.universality@gmail.com> wrote:
+SELECT (jb)['b']['x'].z FROM test_jsonb_dot_notation; -- returns NULL
with warnings
+ z
+---
+
+(1 row)
there is no warning, "returns NULL with warnings" should be removed.
Fixed. Thank you!
On Mon, Aug 25, 2025 at 1:10 AM jian he <jian.universality@gmail.com> wrote:
+-- clean up
+DROP TABLE test_jsonb_dot_notation;
\ No newline at end of file
to remove "\ No newline at end of file"
you need to add a newline to jsonb.sql, you may see other regress sql
test files.
Fixed. Thank you!
On Mon, Aug 25, 2025 at 1:10 AM jian he <jian.universality@gmail.com> wrote:
+ foreach(lc, *indirection)
+ if()
+ {
+ }
+ else
+ {
+ /*
+ * Unexpected node type in indirection list. This should not
+ * happen with current grammar, but we handle it defensively by
+ * breaking out of the loop rather than crashing. In case of
+ * future grammar changes that might introduce new node types,
+ * this allows us to create a jsonpath from as many indirection
+ * elements as we can and let transformIndirection() fallback to
+ * alternative logic to handle the remaining indirection elements.
+ */
+ Assert(false); /* not reachable */
+ break;
+ }
+
+ /* append path item */
+ path->next = jpi;
+ path = jpi;
+ pathlen++;
If the above else branch is reached, it will crash due to `Assert(false);`,
which contradicts the preceding comments.
to make the comments accurate, we need remove
" Assert(false); /* not reachable */"
?
The Assert is intentional. We want to fail fast for the “not
reachable” case in the development build, while breaking out of the
loop in the production build.
reachable” case in the development build, while breaking out of the
loop in the production build.
On Mon, Aug 25, 2025 at 1:10 AM jian he <jian.universality@gmail.com> wrote:
/*
- * Transform and convert the subscript expressions. Jsonb subscripting
- * does not support slices, look only at the upper index.
+ * We would only reach here if json simplified accessor is not needed, or
+ * if jsonb_subscript_make_jsonpath() didn't consume any indirection
+ * element — either way, the first indirection element could not be
+ * converted into a JsonPath component. This happens when it's a non-slice
+ * A_Indices with a non-integer upper index. The code below falls back to
+ * traditional jsonb subscripting for such cases.
*/
the above comments, "non-integer" part is wrong?
For example, the below two SELECTs both use "traditional" jsonb
subscripting logic.
CREATE TABLE ts1 AS SELECT '[1, [2]]' ::jsonb jb;
SELECT (jb)[1] FROM ts1;
SELECT (jb)['a'] FROM ts1;
I updated the comment. In the updated comment, I also chose to use
“pre-standard” json subscripting instead of “traditional” json
subscripting. Hope that helps!
“pre-standard” json subscripting instead of “traditional” json
subscripting. Hope that helps!
Best,
Alex
Attachment
- v14-0002-Allow-Generic-Type-Subscripting-to-Accept-Dot-No.patch
- v14-0003-Export-jsonPathFromParseResult.patch
- v14-0001-Allow-transformation-of-only-a-sublist-of-subscr.patch
- v14-0005-Implement-read-only-dot-notation-for-jsonb.patch
- v14-0007-Implement-jsonb-wildcard-member-accessor.patch
- v14-0006-Implement-Jsonb-subscripting-with-slicing.patch
- v14-0004-Extract-coerce_jsonpath_subscript.patch
pgsql-hackers by date: