Thread: Re: SQL:2023 JSON simplified accessor support
On 29.08.24 18:33, Alexandra Wang wrote: > I’ve implemented the member and array accessors and attached two > alternative patches: > > 1. v1-0001-Add-JSON-JSONB-simplified-accessor.patch: This patch > enables dot access to JSON object fields and subscript access to > indexed JSON array elements by converting "." and "[]" indirection > into a JSON_QUERY JsonFuncExpr node. > > 2. v2-0001-Transform-JSON-dot-access-to-arrow-operator.txt: This > alternative patch implements dot access to JSON object fields by > transforming the "." indirection into a "->" operator. > > The upside of the v1 patch is that it strictly aligns with the SQL > standard, which specifies that the simplified access is equivalent to: > > JSON_QUERY (VEP, 'lax $.JC' WITH CONDITIONAL ARRAY WRAPPER NULL ON > EMPTY NULL ON ERROR) > > However, the performance of JSON_QUERY might be suboptimal due to > function call overhead. Therefore, I implemented the v2 alternative > using the "->" operator. Using the operator approach would also allow taking advantage of optimizations such as <https://www.postgresql.org/message-id/flat/CAKU4AWoqAVya6PBhn%2BBCbFaBMt3z-2%3Di5fKO3bW%3D6HPhbid2Dw%40mail.gmail.com>. > There is some uncertainty about the semantics of conditional array > wrappers. Currently, there is at least one subtle difference between > the "->" operator and JSON_QUERY, as shown: That JSON_QUERY bug has been fixed. I suggest you rebase both of your patches over this, just to double check everything. But then I think you can drop the v1 patch and just submit a new version of v2. The patch should eventually contain some documentation. It might be good starting to look for a good spot where to put that documentation. It might be either near the json types documentation or near the general qualified identifier syntax, not sure.
On 2024-09-26 Th 11:45 AM, Alexandra Wang wrote: > Hi, > > I didn’t run pgindent earlier, so here’s the updated version with the > correct indentation. Hope this helps! This is a really nice feature, and provides a lot of expressive power for such a small piece of code. I notice this doesn't seem to work for domains over json and jsonb. andrew@~=# create domain json_d as json; CREATE DOMAIN andrew@~=# create table test_json_dot(id int, test_json json_d); CREATE TABLE andrew@~=# insert into test_json_dot select 1, '{"a": 1, "b": 42}'::json; INSERT 0 1 | | andrew@~=# select (test_json_dot.test_json).b, json_query(test_json, 'lax $.b' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; ERROR: column notation .b applied to type json_d, which is not a composite type LINE 1: select (test_json_dot.test_json).b, json_query(test_json, 'l... I'm not sure that's a terribly important use case, but we should probably make it work. If it's a domain we should get the basetype of the domain. There's some example code in src/backend/utils/adt/jsonfuncs.c cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Sep 26, 2024, at 16:45, Alexandra Wang <alexandra.wang.oss@gmail.com> wrote: > I didn’t run pgindent earlier, so here’s the updated version with the > correct indentation. Hope this helps! Oh, nice! I don’t suppose the standard also has defined an operator equivalent to ->>, though, has it? I tend to want thetext output far more often than a JSON scalar. Best, David
On 2024-09-27 Fr 5:49 AM, David E. Wheeler wrote: > On Sep 26, 2024, at 16:45, Alexandra Wang <alexandra.wang.oss@gmail.com> wrote: > >> I didn’t run pgindent earlier, so here’s the updated version with the >> correct indentation. Hope this helps! > Oh, nice! I don’t suppose the standard also has defined an operator equivalent to ->>, though, has it? I tend to wantthe text output far more often than a JSON scalar. > That would defeat being able to chain these. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Sep 27, 2024, at 12:07, Andrew Dunstan <andrew@dunslane.net> wrote: > That would defeat being able to chain these. Not if it’s a different operator. But I’m fine to just keep using ->> at the end of a chain. D
On 07.11.24 22:57, Alexandra Wang wrote: > The v5 patch includes the following updates: > > - Fixed the aforementioned issue and added more tests covering composite > types with domains, nested domains, and arrays of domains over > JSON/JSONB. > > - Refactored the logic for parsing JSON/JSONB object fields by moving it > from ParseFuncOrColumn() to transformIndirection() for improved > readability. The ParseFuncOrColumn() function is already handling both > single-argument function calls and composite types, and it has other > callers besides transformIndirection(). This patch implements array subscripting support for the json type, but it does it in a non-standard way, using ParseJsonSimplifiedAccessorArrayElement(). This would be better done by providing a typsubscript function for the json type. This is what jsonb already has, which is why your patch doesn't need to provide the array support for jsonb. I suggest you implement the typsubscript support for the json type (make it a separate patch but you can keep it in this thread IMO) and remove the custom code from this patch. A few comments on the tests: The tests look good to me. Good coverage of weirdly nested types. Results look correct. +drop table if exists test_json_dot; This can be omitted, since we know that the table doesn't exist yet. This code could be written in the more conventional insert ... values syntax: +insert into test_json_dot select 1, '{"a": 1, "b": 42}'::json; +insert into test_json_dot select 1, '{"a": 2, "b": {"c": 42}}'::json; +insert into test_json_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[11, 12]}'::json; +insert into test_json_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[{"x": [11, 12]}, {"y": [21, 22]}]}'::json; Then the ::json casts can also go away. Also, using a different value for "id" for each row would be more useful, so that the subsequent tests could then be written like select id, (test_jsonb_dot.test_jsonb).b from test_jsonb_dot; so we can see which result corresponds to which input row. Also make id the primary key in this table. Also, let's keep the json and the jsonb variants aligned. There are some small differences, like the test_json_dot table having 4 rows but the test_jsonb_dot having 3 rows. And the array and wildcard tests in the opposite order. Not a big deal, but keeping these the same helps eyeballing the test files. Maybe add a comment somewhere in this file that you are running the json_query equivalents to cross-check the semantics of the dot syntax. Some documentation should be written. This looks like this right place to start: https://www.postgresql.org/docs/devel/sql-expressions.html#FIELD-SELECTION and them maybe some cross-linking between there and the sections on JSON types and operators.
Hi, On Tue, Nov 19, 2024 at 6:06 PM Nikita Glukhov <glukhov.n.a@gmail.com> wrote: > > Hi, hackers. > > I have implemented dot notation for jsonb using type subscripting back > in April 2023, but failed post it because I left Postgres Professional > company soon after and have not worked anywhere since, not even had > any interest in programming. > > But yesterday I accidentally decided to look what is going on at > commitfests and found this thread. I immediately started to rebase > code from PG16, fixed some bugs, and now I'm ready to present my > version of the patches which is much more complex. > > Unfortunately, I probably won't be able to devote that much time to > the patches as before. Thank you so much, Nikita, for revisiting this topic and sharing your v6 patches! Now that we have two solutions, I’d like to summarize our current options. In Postgres, there are currently three ways to access json/jsonb object fields and array elements: 1. '->' operator (Postgres-specific, predates SQL standard): postgres=# select ('{"a": 1, "b": "c", "d": [1, 2, 3]}'::json) -> 'd' -> 0; -- returns 1 2. jsonb subscripting (not available for the plain json type): postgres=# select ('{"a": 1, "b": "c", "d": [1, 2, 3]}'::jsonb)['d'][0]; --returns 1 3. json_query() function: postgres=# select json_query(jsonb '{"a": 1, "b": "c", "d": [1, 2, 3]}', 'lax $.d[0]'); --returns 1 A few weeks ago, I did the following performance benchmarking of the three approaches: -- setup: create table tbl(id int, col1 jsonb); insert into tbl select i, '{"x":"vx", "y":[{"a":[1,2,3]}, {"b":[1, 2, {"j":"vj"}]}]}' from generate_series(1, 100000)i; -- jsonb_operator.sql SELECT id, col1 -> 'y' -> 1 -> 'b' -> 2 -> 'j' AS jsonb_operator FROM tbl; -- jsonb_subscripting.sql SELECT id, col1['y'][1]['b'][2]['j'] AS jsonb_subscript FROM tbl; -- jsonb_path_query.sql SELECT id, jsonb_path_query(col1, '$.y[1].b[2].j') FROM tbl; # pgbench on my local MacOS machine, using -O3 optimization: pgbench -n -f XXX.sql postgres -T100 Results (Latency | tps): "->" operator: 14ms | 68 jsonb subscripting: 17ms | 58 jsonb_path_query() function: 23ms | 43 So performance from best to worst: "->" operator > jsonb subscripting >> jsonb_path_query() function. I’m excited to see your implementation of dot notation for jsonb using type subscripting! This approach rounds out the three possible ways to implement JSON simplified accessors: ## v1: json_query() implementation Pros: - Fully adheres to the SQL standard. According to the SQL standard, if the JSON simplified accessor <JA> is not a JSON item method, it is equivalent to a <JSON query>: JSON_QUERY ( VEP, 'lax $.JC' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY NULL ON ERROR) (I’m skipping <JA> that includes a JSON item method, as it is currently outside the scope of both sets of patches.) - Easiest to implement Cons: - Slow due to function call overhead. ## v2-v5: "->" operator implementation We initially chose this approach for its performance benefits. However, while addressing Peter’s feedback on v5, I encountered the following issue: -- setup create table test_json_dot(id serial primary key, test_json json); insert into test_json_dot values (5, '[{"a": 1, "b": 42}, {"a": 2, "b": {"c": 42}}]'); -- problematic query: test1=# select id, (test_json).b, json_query(test_json, 'lax $.b' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; id | b | expected ----+---+----------------- 5 | | [42, {"c": 42}] (1 row) This issue arises from the semantic differences between the "->" operator and json_query’s "lax" mode. One possible workaround is to redefine the "->" operator and modify its implementation. However, since the "->" operator has been in use for a long time, such changes would break backward compatibility. ## v6: jsonb subscription implementation Nikita's patches pass all my functional test cases, including those that failed with the previous approach. Supported formats: - JSON member accessor - JSON wildcard member accessor (Not available in v5, so this is also a plus) - JSON array accessor Questions: 1. Since Nikita’s patches did not address the JSON data type, and JSON currently does not support subscripting, should we limit the initial feature set to JSONB dot-notation for now? In other words, if we aim to fully support JSON simplified accessors for the plain JSON type, should we handle support for plain JSON subscripting as a follow-up effort? 2. I have yet to have a more thorough review of Nikita’s patches. One area I am not familiar with is the hstore-related changes. How relevant is hstore to the JSON simplified accessor? Best, Alex
On 2024-11-21 Th 3:52 PM, Alexandra Wang wrote: > Hi, > > On Tue, Nov 19, 2024 at 6:06 PM Nikita Glukhov <glukhov.n.a@gmail.com> wrote: >> Hi, hackers. >> >> I have implemented dot notation for jsonb using type subscripting back >> in April 2023, but failed post it because I left Postgres Professional >> company soon after and have not worked anywhere since, not even had >> any interest in programming. >> >> But yesterday I accidentally decided to look what is going on at >> commitfests and found this thread. I immediately started to rebase >> code from PG16, fixed some bugs, and now I'm ready to present my >> version of the patches which is much more complex. >> >> Unfortunately, I probably won't be able to devote that much time to >> the patches as before. > Thank you so much, Nikita, for revisiting this topic and sharing your > v6 patches! > > Now that we have two solutions, I’d like to summarize our current > options. > > In Postgres, there are currently three ways to access json/jsonb > object fields and array elements: > > 1. '->' operator (Postgres-specific, predates SQL standard): > > postgres=# select ('{"a": 1, "b": "c", "d": [1, 2, 3]}'::json) -> 'd' > -> 0; -- returns 1 > > 2. jsonb subscripting (not available for the plain json type): > > postgres=# select ('{"a": 1, "b": "c", "d": [1, 2, > 3]}'::jsonb)['d'][0]; --returns 1 > > 3. json_query() function: > > postgres=# select json_query(jsonb '{"a": 1, "b": "c", "d": [1, 2, > 3]}', 'lax $.d[0]'); --returns 1 > > A few weeks ago, I did the following performance benchmarking of the > three approaches: > > -- setup: > create table tbl(id int, col1 jsonb); > insert into tbl select i, '{"x":"vx", "y":[{"a":[1,2,3]}, {"b":[1, 2, > {"j":"vj"}]}]}' from generate_series(1, 100000)i; > > -- jsonb_operator.sql > SELECT id, col1 -> 'y' -> 1 -> 'b' -> 2 -> 'j' AS jsonb_operator FROM tbl; > > -- jsonb_subscripting.sql > SELECT id, col1['y'][1]['b'][2]['j'] AS jsonb_subscript FROM tbl; > > -- jsonb_path_query.sql > SELECT id, jsonb_path_query(col1, '$.y[1].b[2].j') FROM tbl; > > # pgbench on my local MacOS machine, using -O3 optimization: > pgbench -n -f XXX.sql postgres -T100 > > Results (Latency | tps): > > "->" operator: 14ms | 68 > jsonb subscripting: 17ms | 58 > jsonb_path_query() function: 23ms | 43 > > So performance from best to worst: > "->" operator > jsonb subscripting >> jsonb_path_query() function. > > I’m excited to see your implementation of dot notation for jsonb using > type subscripting! This approach rounds out the three possible ways to > implement JSON simplified accessors: > > ## v1: json_query() implementation > > Pros: > - Fully adheres to the SQL standard. > > According to the SQL standard, if the JSON simplified accessor <JA> is > not a JSON item method, it is equivalent to a <JSON query>: > > JSON_QUERY ( VEP, 'lax $.JC' WITH CONDITIONAL ARRAY WRAPPER NULL ON > EMPTY NULL ON ERROR) > > (I’m skipping <JA> that includes a JSON item method, as it is > currently outside the scope of both sets of patches.) > > - Easiest to implement > > Cons: > - Slow due to function call overhead. > > ## v2-v5: "->" operator implementation > > We initially chose this approach for its performance benefits. > However, while addressing Peter’s feedback on v5, I encountered the > following issue: > > -- setup > create table test_json_dot(id serial primary key, test_json json); > insert into test_json_dot values (5, '[{"a": 1, "b": 42}, {"a": 2, > "b": {"c": 42}}]'); > > -- problematic query: > test1=# select id, (test_json).b, json_query(test_json, 'lax $.b' WITH > CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from > test_json_dot; > id | b | expected > ----+---+----------------- > 5 | | [42, {"c": 42}] > (1 row) > > This issue arises from the semantic differences between the "->" > operator and json_query’s "lax" mode. One possible workaround is to > redefine the "->" operator and modify its implementation. However, since > the "->" operator has been in use for a long time, such changes would > break backward compatibility. > > ## v6: jsonb subscription implementation > > Nikita's patches pass all my functional test cases, including those > that failed with the previous approach. > > Supported formats: > - JSON member accessor > - JSON wildcard member accessor (Not available in v5, so this is also a plus) > - JSON array accessor > > Questions: > > 1. Since Nikita’s patches did not address the JSON data type, and JSON > currently does not support subscripting, should we limit the initial > feature set to JSONB dot-notation for now? In other words, if we aim > to fully support JSON simplified accessors for the plain JSON type, > should we handle support for plain JSON subscripting as a follow-up > effort? > > 2. I have yet to have a more thorough review of Nikita’s patches. > One area I am not familiar with is the hstore-related changes. How > relevant is hstore to the JSON simplified accessor? > We can't change the way the "->" operator works, as there could well be uses of it in the field that rely on its current behaviour. But maybe we could invent a new operator which is compliant with the standard semantics for dot access, and call that. Then we'd get the best performance, and also we might be able to implement it for the plain JSON type. If that proves not possible we can think about not implementing for plain JSON, but I'd rather not go there until we have to. I don't think we should be including hstore changes here - we should just be aiming at implementing the standard for JSON access. hstore changes if any should be a separate feature. The aren't relevant to JSON access, although they might use some of the same infrastructure, depending on implementation. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 21.11.24 23:46, Andrew Dunstan wrote: >> Questions: >> >> 1. Since Nikita’s patches did not address the JSON data type, and JSON >> currently does not support subscripting, should we limit the initial >> feature set to JSONB dot-notation for now? In other words, if we aim >> to fully support JSON simplified accessors for the plain JSON type, >> should we handle support for plain JSON subscripting as a follow-up >> effort? >> >> 2. I have yet to have a more thorough review of Nikita’s patches. >> One area I am not familiar with is the hstore-related changes. How >> relevant is hstore to the JSON simplified accessor? >> > > We can't change the way the "->" operator works, as there could well be > uses of it in the field that rely on its current behaviour. But maybe we > could invent a new operator which is compliant with the standard > semantics for dot access, and call that. Then we'd get the best > performance, and also we might be able to implement it for the plain > JSON type. If that proves not possible we can think about not > implementing for plain JSON, but I'd rather not go there until we have to. Yes, I think writing a custom operator that is similar to "->" but has the required semantics is the best way forward. (Maybe it can be just a function?) > I don't think we should be including hstore changes here - we should > just be aiming at implementing the standard for JSON access. hstore > changes if any should be a separate feature. The aren't relevant to JSON > access, although they might use some of the same infrastructure, > depending on implementation. In a future version, the operator/function mentioned above could be a catalogued property of a type, similar to typsubscript. Then you could also apply this to other types. But let's leave that for later. If I understand it correctly, Nikita's patch uses the typsubscript support function to handle both bracket subscripting and dot notation. I'm not sure if it's right to mix these two together. Maybe I didn't understand that correctly.
Hello hackers,
I’ve fixed the compilation failure for hstore and updated the patches.
In this version, I’ve further cleaned up the code and added more
comments. I hope this helps!
Summary of changes:
v8-0001 through v8-0005:
Refactoring and preparatory steps for the actual implementation.
v8-0006 (Implement read-only dot notation for JSONB):
I removed the vars field (introduced in v7) from JsonbSubWorkspace
after realizing that JsonPathVariable is not actually needed for
dot-notation.
v8-0007 (Allow wildcard member access for JSONB):
I'm aware that the #if 0 in check_indirection() is not ideal. I
haven’t removed it yet because I’m still reviewing other cases—beyond
our JSONB simplified accessor use case—where this check should remain
strict. I’ll post an additional patch to address this.
Looking forward to comments and feedback!
Thanks,
Alex
I’ve fixed the compilation failure for hstore and updated the patches.
In this version, I’ve further cleaned up the code and added more
comments. I hope this helps!
Summary of changes:
v8-0001 through v8-0005:
Refactoring and preparatory steps for the actual implementation.
v8-0006 (Implement read-only dot notation for JSONB):
I removed the vars field (introduced in v7) from JsonbSubWorkspace
after realizing that JsonPathVariable is not actually needed for
dot-notation.
v8-0007 (Allow wildcard member access for JSONB):
I'm aware that the #if 0 in check_indirection() is not ideal. I
haven’t removed it yet because I’m still reviewing other cases—beyond
our JSONB simplified accessor use case—where this check should remain
strict. I’ll post an additional patch to address this.
Looking forward to comments and feedback!
Thanks,
Alex
Attachment
- v8-0002-Pass-field-accessors-to-generic-subscripting.patch
- v8-0004-Extract-coerce_jsonpath_subscript.patch
- v8-0001-Allow-transformation-only-of-a-sublist-of-subscri.patch
- v8-0005-Eanble-String-node-as-field-accessors-in-generic-.patch
- v8-0006-Implement-read-only-dot-notation-for-jsonb.patch
- v8-0003-Export-jsonPathFromParseResult.patch
- v8-0007-Allow-wild-card-member-access-for-jsonb.patch
Hello hackers,
On Thu, Feb 27, 2025 at 9:46 AM Alexandra Wang <alexandra.wang.oss@gmail.com> wrote:
Summary of changes:
v8-0001 through v8-0005:
Refactoring and preparatory steps for the actual implementation.
v8-0006 (Implement read-only dot notation for JSONB):
I removed the vars field (introduced in v7) from JsonbSubWorkspace
after realizing that JsonPathVariable is not actually needed for
dot-notation.
v8-0007 (Allow wildcard member access for JSONB):
I'm aware that the #if 0 in check_indirection() is not ideal. I
haven’t removed it yet because I’m still reviewing other cases—beyond
our JSONB simplified accessor use case—where this check should remain
strict. I’ll post an additional patch to address this.
I made the following minor changes in v9:
- More detailed commit messages
- Additional tests
- Use "?column?" as the column name for trailing .*.
Other than that, the patches remain the same as the previous
version:
0001 - 0005: preparation steps for the actual implementation and do
not change or add new behavior.
0006: jsonb dot notation and sliced subscripting
0007: jsonb wildcard member access
version:
0001 - 0005: preparation steps for the actual implementation and do
not change or add new behavior.
0006: jsonb dot notation and sliced subscripting
0007: jsonb wildcard member access
Thanks,
Alex
Attachment
- v9-0003-Export-jsonPathFromParseResult.patch
- v9-0002-Allow-Generic-Type-Subscripting-to-Accept-Dot-Not.patch
- v9-0004-Extract-coerce_jsonpath_subscript.patch
- v9-0001-Allow-transformation-of-only-a-sublist-of-subscri.patch
- v9-0006-Implement-read-only-dot-notation-for-jsonb.patch
- v9-0007-Allow-wild-card-member-access-for-jsonb.patch
- v9-0005-Eanble-String-node-as-field-accessors-in-generic-.patch
Hi Alex, The code comments and the commit messages help a lot when reviewing! Thanks for the new version. The code LGTM and check-world is happy. I've also performed some tests and everything looks good! Just some minor points about this new version: ## v9-0005 Typo on commit message title ## v9-0006 > + * The following functions create various types of JsonPathParseItem nodes, > + * which are used to build JsonPath expressions for jsonb simplified accessor. > Just to avoid misinterpretation I think that we can replace "The following functions" with "The make_jsonpath_item_* functions" since we can have more functions in the future that are not fully related with these. Does that make sense? -- Matheus Alcantara
On Mon, Mar 3, 2025 at 4:16 PM Matheus Alcantara <matheusssilv97@gmail.com> wrote: > > Hi Alex, > > The code comments and the commit messages help a lot when reviewing! Thanks for > the new version. > > The code LGTM and check-world is happy. I've also performed some tests and > everything looks good! > > Just some minor points about this new version: > > ## v9-0005 > > Typo on commit message title > > ## v9-0006 > > > + * The following functions create various types of JsonPathParseItem nodes, > > + * which are used to build JsonPath expressions for jsonb simplified accessor. > > > Just to avoid misinterpretation I think that we can replace "The following > functions" with "The make_jsonpath_item_* functions" since we can have more > functions in the future that are not fully related with these. Does that make > sense? > Sorry, I've forgotten to include a question. Do you have anything in mind about documentation changes for this patch? -- Matheus Alcantara
Hi Matheus,
On Mon, Mar 3, 2025 at 1:43 PM Matheus Alcantara <matheusssilv97@gmail.com> wrote:
On Mon, Mar 3, 2025 at 4:16 PM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
>
> Hi Alex,
>
> The code comments and the commit messages help a lot when reviewing! Thanks for
> the new version.
>
> The code LGTM and check-world is happy. I've also performed some tests and
> everything looks good!
> Just some minor points about this new version:
>
> ## v9-0005
>
> Typo on commit message title
> ## v9-0006
>
> > + * The following functions create various types of JsonPathParseItem nodes,
> > + * which are used to build JsonPath expressions for jsonb simplified accessor.
> >
> Just to avoid misinterpretation I think that we can replace "The following
> functions" with "The make_jsonpath_item_* functions" since we can have more
> functions in the future that are not fully related with these. Does that make
> sense?
Thank you so much for reviewing! I've attached v10, which addresses your feedback.
On Mon, Mar 3, 2025 at 1:43 PM Matheus Alcantara <matheusssilv97@gmail.com> wrote:
Sorry, I've forgotten to include a question. Do you have anything in mind about
documentation changes for this patch?
For the documentation, I’m thinking of adding it under JSON Types [1].
I’d either add a new “Simple Dot-Notation” section after jsonb
subscripting [2] or replace it. Let me know what you think.
I’d either add a new “Simple Dot-Notation” section after jsonb
subscripting [2] or replace it. Let me know what you think.
Best,
Alex
Attachment
- v10-0004-Extract-coerce_jsonpath_subscript.patch
- v10-0002-Allow-Generic-Type-Subscripting-to-Accept-Dot-No.patch
- v10-0003-Export-jsonPathFromParseResult.patch
- v10-0005-Enable-String-node-as-field-accessors-in-generic.patch
- v10-0001-Allow-transformation-of-only-a-sublist-of-subscr.patch
- v10-0006-Implement-read-only-dot-notation-for-jsonb.patch
- v10-0007-Allow-wild-card-member-access-for-jsonb.patch
On Mon, Mar 3, 2025 at 12:23 PM Alexandra Wang <alexandra.wang.oss@gmail.com> wrote:
I've attached v10, which addresses your feedback.
Hi Alex! Thanks for the patches.
In src/test/regress/sql/jsonb.sql, the section marked with "-- slices are not supported" should be relabeled. That comment predates these patches, and is now misleading.
A bit further down in expected/jsonb.out, there is an expected failure, but no SQL comment to indicate that it is expected:
+SELECT (t.jb).* FROM test_jsonb_dot_notation;
+ERROR: missing FROM-clause entry for table "t"
+LINE 1: SELECT (t.jb).* FROM test_jsonb_dot_notation;
+ERROR: missing FROM-clause entry for table "t"
+LINE 1: SELECT (t.jb).* FROM test_jsonb_dot_notation;
Perhaps a "-- fails" comment would clarify? Then, further down,
+SELECT (jb).a.**.x FROM test_jsonb_dot_notation; -- not supported
+ERROR: syntax error at or near "**"
+LINE 1: SELECT (jb).a.**.x FROM test_jsonb_dot_notation;
+ERROR: syntax error at or near "**"
+LINE 1: SELECT (jb).a.**.x FROM test_jsonb_dot_notation;
I wonder if it would be better to have the parser handle this case and raise a ERRCODE_FEATURE_NOT_SUPPORTED instead?
I got curious about the support for this new dot notation in the plpgsql parser and tried:
+DO $$
+DECLARE
+ a jsonb := '[1,2,3,4,5,6,7]'::jsonb;
+BEGIN
+ WHILE a IS NOT NULL
+ LOOP
+ RAISE NOTICE '%', a;
+ a := a[2:];
+ END LOOP;
+END
+$$ LANGUAGE plpgsql;
+NOTICE: [1, 2, 3, 4, 5, 6, 7]
+NOTICE: [3, 4, 5, 6, 7]
+NOTICE: [5, 6, 7]
+NOTICE: 7
+DECLARE
+ a jsonb := '[1,2,3,4,5,6,7]'::jsonb;
+BEGIN
+ WHILE a IS NOT NULL
+ LOOP
+ RAISE NOTICE '%', a;
+ a := a[2:];
+ END LOOP;
+END
+$$ LANGUAGE plpgsql;
+NOTICE: [1, 2, 3, 4, 5, 6, 7]
+NOTICE: [3, 4, 5, 6, 7]
+NOTICE: [5, 6, 7]
+NOTICE: 7
which looks good! But then I tried:
+DO $$
+DECLARE
+ a jsonb := '{"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}'::jsonb;
+BEGIN
+ WHILE a IS NOT NULL
+ LOOP
+ RAISE NOTICE '%', a;
+ a := COALESCE(a."NU", a[2]);
+ END LOOP;
+END
+$$ LANGUAGE plpgsql;
+NOTICE: {"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}
+ERROR: missing FROM-clause entry for table "a"
+LINE 1: a := COALESCE(a."NU", a[2])
+ ^
+QUERY: a := COALESCE(a."NU", a[2])
+CONTEXT: PL/pgSQL function inline_code_block line 8 at assignment
+DECLARE
+ a jsonb := '{"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}'::jsonb;
+BEGIN
+ WHILE a IS NOT NULL
+ LOOP
+ RAISE NOTICE '%', a;
+ a := COALESCE(a."NU", a[2]);
+ END LOOP;
+END
+$$ LANGUAGE plpgsql;
+NOTICE: {"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}
+ERROR: missing FROM-clause entry for table "a"
+LINE 1: a := COALESCE(a."NU", a[2])
+ ^
+QUERY: a := COALESCE(a."NU", a[2])
+CONTEXT: PL/pgSQL function inline_code_block line 8 at assignment
which suggests the plpgsql parser does not recognize a."NU" as we'd expect. Any thoughts on this?
I notice there are no changes in src/interfaces/ecpg/test, which concerns me. The sqljson.pgc and sqljson_jsontable.pgc files are already testing json handling in ecpg; perhaps just extend those a bit?
On Tue, Mar 4, 2025 at 6:05 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
But then I tried:+DO $$
+DECLARE
+ a jsonb := '{"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}'::jsonb;
+BEGIN
+ WHILE a IS NOT NULL
+ LOOP
+ RAISE NOTICE '%', a;
+ a := COALESCE(a."NU", a[2]);
+ END LOOP;
+END
+$$ LANGUAGE plpgsql;
+NOTICE: {"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}
+ERROR: missing FROM-clause entry for table "a"
+LINE 1: a := COALESCE(a."NU", a[2])
+ ^
+QUERY: a := COALESCE(a."NU", a[2])
+CONTEXT: PL/pgSQL function inline_code_block line 8 at assignmentwhich suggests the plpgsql parser does not recognize a."NU" as we'd expect. Any thoughts on this?
I should mention that
+DO $$
+DECLARE+ a jsonb := '{"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}'::jsonb;
+BEGIN
+ WHILE a IS NOT NULL
+ LOOP
+ RAISE NOTICE '%', a;
+ a := COALESCE((a)."NU", (a)[2]);
+ END LOOP;
+END
+$$ LANGUAGE plpgsql;
+NOTICE: {"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}
+NOTICE: [{"": [[3]]}, [6], [2], "bCi"]
+NOTICE: [2]
works fine. I guess that is good enough. Should we add these to the sql/jsonb.sql to document the expected behavior, both with the error when using plain "a" and with the correct output when using "(a)"? The reason I mention this is that the plpgsql parser might get changed at some point, and without a test case, we might not notice if this breaks.
On 2025-03-04 Tu 10:34 AM, Mark Dilger wrote:
On Tue, Mar 4, 2025 at 6:05 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:But then I tried:+DO $$
+DECLARE
+ a jsonb := '{"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}'::jsonb;
+BEGIN
+ WHILE a IS NOT NULL
+ LOOP
+ RAISE NOTICE '%', a;
+ a := COALESCE(a."NU", a[2]);
+ END LOOP;
+END
+$$ LANGUAGE plpgsql;
+NOTICE: {"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}
+ERROR: missing FROM-clause entry for table "a"
+LINE 1: a := COALESCE(a."NU", a[2])
+ ^
+QUERY: a := COALESCE(a."NU", a[2])
+CONTEXT: PL/pgSQL function inline_code_block line 8 at assignmentwhich suggests the plpgsql parser does not recognize a."NU" as we'd expect. Any thoughts on this?I should mention that+DO $$+DECLARE
+ a jsonb := '{"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}'::jsonb;
+BEGIN
+ WHILE a IS NOT NULL
+ LOOP
+ RAISE NOTICE '%', a;
+ a := COALESCE((a)."NU", (a)[2]);
+ END LOOP;
+END
+$$ LANGUAGE plpgsql;
+NOTICE: {"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}
+NOTICE: [{"": [[3]]}, [6], [2], "bCi"]
+NOTICE: [2]works fine. I guess that is good enough. Should we add these to the sql/jsonb.sql to document the expected behavior, both with the error when using plain "a" and with the correct output when using "(a)"? The reason I mention this is that the plpgsql parser might get changed at some point, and without a test case, we might not notice if this breaks.
Yes, I think so.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
Hi Mark,
Thank you so much for reviewing! I have attached the new patches.
On Tue, Mar 4, 2025 at 8:05 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
On Mon, Mar 3, 2025 at 12:23 PM Alexandra Wang <alexandra.wang.oss@gmail.com> wrote:I've attached v10, which addresses your feedback.Hi Alex! Thanks for the patches.In src/test/regress/sql/jsonb.sql, the section marked with "-- slices are not supported" should be relabeled. That comment predates these patches, and is now misleading.A bit further down in expected/jsonb.out, there is an expected failure, but no SQL comment to indicate that it is expected:+SELECT (t.jb).* FROM test_jsonb_dot_notation;
+ERROR: missing FROM-clause entry for table "t"
+LINE 1: SELECT (t.jb).* FROM test_jsonb_dot_notation;Perhaps a "-- fails" comment would clarify? Then, further down,
Fixed.
+SELECT (jb).a.**.x FROM test_jsonb_dot_notation; -- not supported
+ERROR: syntax error at or near "**"
+LINE 1: SELECT (jb).a.**.x FROM test_jsonb_dot_notation;I wonder if it would be better to have the parser handle this case and raise a ERRCODE_FEATURE_NOT_SUPPORTED instead?
In 0008 I added a new token named "DOUBLE_ASTERISK" to the lexers to
represent "**". Hope this helps!
I got curious about the support for this new dot notation in the plpgsql parser and tried:+DO $$
+DECLARE
+ a jsonb := '[1,2,3,4,5,6,7]'::jsonb;
+BEGIN
+ WHILE a IS NOT NULL
+ LOOP
+ RAISE NOTICE '%', a;
+ a := a[2:];
+ END LOOP;
+END
+$$ LANGUAGE plpgsql;
+NOTICE: [1, 2, 3, 4, 5, 6, 7]
+NOTICE: [3, 4, 5, 6, 7]
+NOTICE: [5, 6, 7]
+NOTICE: 7which looks good! But then I tried:+DO $$
+DECLARE
+ a jsonb := '{"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}'::jsonb;
+BEGIN
+ WHILE a IS NOT NULL
+ LOOP
+ RAISE NOTICE '%', a;
+ a := COALESCE(a."NU", a[2]);
+ END LOOP;
+END
+$$ LANGUAGE plpgsql;
+NOTICE: {"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]}
+ERROR: missing FROM-clause entry for table "a"
+LINE 1: a := COALESCE(a."NU", a[2])
+ ^
+QUERY: a := COALESCE(a."NU", a[2])
+CONTEXT: PL/pgSQL function inline_code_block line 8 at assignmentwhich suggests the plpgsql parser does not recognize a."NU" as we'd expect. Any thoughts on this?
Thanks for the tests! I added them to the "jsonb" regress test.
I notice there are no changes in src/interfaces/ecpg/test, which concerns me. The sqljson.pgc and sqljson_jsontable.pgc files are already testing json handling in ecpg; perhaps just extend those a bit?
Thanks for bringing this up! I have added new tests in src/interfaces/ecpg/test/sql/sqljson.pgc.
Best,
Alex
Attachment
- v11-0002-Allow-Generic-Type-Subscripting-to-Accept-Dot-No.patch
- v11-0003-Export-jsonPathFromParseResult.patch
- v11-0004-Extract-coerce_jsonpath_subscript.patch
- v11-0005-Enable-String-node-as-field-accessors-in-generic.patch
- v11-0001-Allow-transformation-of-only-a-sublist-of-subscr.patch
- v11-0007-Allow-wild-card-member-access-for-jsonb.patch
- v11-0006-Implement-read-only-dot-notation-for-jsonb.patch
- v11-0008-Add-as-a-new-token-in-scanners.patch
This patch set has expanded significantly in scope recently, which is probably the right thing, but that means there won't be enough time to review and finish it for PG18. So I'm moving this to the next commitfest now. On 13.03.25 15:02, Alexandra Wang wrote: > Hi Mark, > > Thank you so much for reviewing! I have attached the new patches. > > On Tue, Mar 4, 2025 at 8:05 AM Mark Dilger <mark.dilger@enterprisedb.com > <mailto:mark.dilger@enterprisedb.com>> wrote: > > > On Mon, Mar 3, 2025 at 12:23 PM Alexandra Wang > <alexandra.wang.oss@gmail.com <mailto:alexandra.wang.oss@gmail.com>> > wrote: > > I've attached v10, which addresses your feedback. > > > Hi Alex! Thanks for the patches. > > In src/test/regress/sql/jsonb.sql, the section marked with "-- > slices are not supported" should be relabeled. That comment > predates these patches, and is now misleading. > > A bit further down in expected/jsonb.out, there is an expected > failure, but no SQL comment to indicate that it is expected: > > +SELECT (t.jb).* FROM test_jsonb_dot_notation; > +ERROR: missing FROM-clause entry for table "t" > +LINE 1: SELECT (t.jb).* FROM test_jsonb_dot_notation; > > Perhaps a "-- fails" comment would clarify? Then, further down, > > > Fixed. > > +SELECT (jb).a.**.x FROM test_jsonb_dot_notation; -- not supported > +ERROR: syntax error at or near "**" > +LINE 1: SELECT (jb).a.**.x FROM test_jsonb_dot_notation; > > I wonder if it would be better to have the parser handle this case > and raise a ERRCODE_FEATURE_NOT_SUPPORTED instead? > > > In 0008 I added a new token named "DOUBLE_ASTERISK" to the lexers to > represent "**". Hope this helps! > > I got curious about the support for this new dot notation in the > plpgsql parser and tried: > > +DO $$ > +DECLARE > + a jsonb := '[1,2,3,4,5,6,7]'::jsonb; > +BEGIN > + WHILE a IS NOT NULL > + LOOP > + RAISE NOTICE '%', a; > + a := a[2:]; > + END LOOP; > +END > +$$ LANGUAGE plpgsql; > +NOTICE: [1, 2, 3, 4, 5, 6, 7] > +NOTICE: [3, 4, 5, 6, 7] > +NOTICE: [5, 6, 7] > +NOTICE: 7 > > which looks good! But then I tried: > > +DO $$ > +DECLARE > + a jsonb := '{"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": > [-6, -8]}'::jsonb; > +BEGIN > + WHILE a IS NOT NULL > + LOOP > + RAISE NOTICE '%', a; > + a := COALESCE(a."NU", a[2]); > + END LOOP; > +END > +$$ LANGUAGE plpgsql; > +NOTICE: {"": 6, "NU": [{"": [[3]]}, [6], [2], "bCi"], "aaf": [-6, -8]} > +ERROR: missing FROM-clause entry for table "a" > +LINE 1: a := COALESCE(a."NU", a[2]) > + ^ > +QUERY: a := COALESCE(a."NU", a[2]) > +CONTEXT: PL/pgSQL function inline_code_block line 8 at assignment > > which suggests the plpgsql parser does not recognize a."NU" as we'd > expect. Any thoughts on this? > > > Thanks for the tests! I added them to the "jsonb" regress test. > > I notice there are no changes in src/interfaces/ecpg/test, which > concerns me. The sqljson.pgc and sqljson_jsontable.pgc files are > already testing json handling in ecpg; perhaps just extend those a bit? > > Thanks for bringing this up! I have added new tests in src/interfaces/ > ecpg/test/sql/sqljson.pgc. > > > Best, > Alex
On 13/03/2025 15:02, Alexandra Wang wrote: > Hi Mark, > > Thank you so much for reviewing! I have attached the new patches. > Hi Alex, I am reviewing this from a feature perspective and not from a code perspective. On the whole, this looks good to me from a standards point of view. There are two things that stand out to me about this feature: 1) I am not seeing the ** syntax in the standard, so does it need to be signaled as not supported? Perhaps I am just overlooking something. 2) There is no support for <JSON item method>. Since this appears to be constructing a jsonpath query, could that not be added to the syntax and allow jsonpath to throw the error if the function doesn't exist? -- Vik Fearing