Thread: Patch bug: Fix jsonpath .* on Arrays

Patch bug: Fix jsonpath .* on Arrays

From
"David E. Wheeler"
Date:
Hackers,

The behavior of the .* jpiAnyKey jsonpath selector seems incorrect.

```
select jsonb_path_query('[1,2,3]', '$.*');
jsonb_path_query
------------------
(0 rows)

select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', '$.*');
jsonb_path_query
------------------
[3, 4, 5]
```

The first example might be expected, since .* is intended for object keys, but the handing of `jpiAnyKey` has a branch
forunwrapping arrays. The second example, however, just seems weird: this is .*, not .**. 

The attached patch fixes it by passing the next node to `executeAnyItem()` (via `executeItemUnwrapTargetArray()`) and
thenproperly setting `jperOk` when `executeAnyItem()` finds values and there is no current (next) node. 

I took this approach given what appears to be the intended behavior or $* on arrays in lax mode. However, I could see
anargument that .* should not apply to arrays at all. If so, I can submit a new patch removing the branch that unwraps
anarray with .*. 

Best,

David


Attachment

Re: Patch bug: Fix jsonpath .* on Arrays

From
"David G. Johnston"
Date:
On Tuesday, June 4, 2024, David E. Wheeler <david@justatheory.com> wrote:
Hackers,

The behavior of the .* jpiAnyKey jsonpath selector seems incorrect.

```
select jsonb_path_query('[1,2,3]', '$.*');
jsonb_path_query
------------------
(0 rows)

select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', '$.*');
jsonb_path_query
------------------
[3, 4, 5]
```

The first example might be expected, since .* is intended for object keys, but the handing of `jpiAnyKey` has a branch for unwrapping arrays. The second example, however, just seems weird: this is .*, not .**.

This seems to be working correctly. Lax mode causes the first array level to unwrap and produce new context item values.  Then the wildcard member accessor is applied to each.  Numbers don’t have members so no matches exist in the first example.  The object in the second indeed has a single member and so matches the wildcard and its value, the array, is returned.

David J.

Re: Patch bug: Fix jsonpath .* on Arrays

From
David E. Wheeler
Date:
On Jun 4, 2024, at 12:28 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:

> This seems to be working correctly. Lax mode causes the first array level to unwrap and produce new context item
values. Then the wildcard member accessor is applied to each.  Numbers don’t have members so no matches exist in the
firstexample.  The object in the second indeed has a single member and so matches the wildcard and its value, the
array,is returned. 

Oh FFS, unwrapping still breaks my brain. You’re right, of course. Here’s a new patch that demonstrates that behavior,
sincethat code path is not currently represented in tests AFAICT (I would have expected to have broken it with this
patch).

D


Attachment

Re: Patch bug: Fix jsonpath .* on Arrays

From
"David E. Wheeler"
Date:
On Jun 4, 2024, at 20:45, David E. Wheeler <david@justatheory.com> wrote:

> Here’s a new patch that demonstrates that behavior, since that code path is not currently represented in tests AFAICT
(Iwould have expected to have broken it with this patch). 

Commitfest link:

  https://commitfest.postgresql.org/48/5017/

D




Re: Patch bug: Fix jsonpath .* on Arrays

From
"David E. Wheeler"
Date:
On Jun 4, 2024, at 20:45, David E. Wheeler <david@justatheory.com> wrote:

> Oh FFS, unwrapping still breaks my brain. You’re right, of course. Here’s a new patch that demonstrates that
behavior,since that code path is not currently represented in tests AFAICT (I would have expected to have broken it
withthis patch). 

Rebased and moved the new tests to the end of the file.

D

Attachment

Re: Patch bug: Fix jsonpath .* on Arrays

From
"David E. Wheeler"
Date:
On Jun 7, 2024, at 10:23, David E. Wheeler <david@justatheory.com> wrote:

> Rebased and moved the new tests to the end of the file.

Bah, sorry, that was the previous patch. Here’s v3.

D


Attachment

Re: Patch bug: Fix jsonpath .* on Arrays

From
Степан Неретин
Date:


 
Вторник, 25 июня 2024, 11:17 +07:00 от David E. Wheeler <david@justatheory.com>:
 
On Jun 7, 2024, at 10:23, David E. Wheeler <david@justatheory.com> wrote:
 
> Rebased and moved the new tests to the end of the file.

Bah, sorry, that was the previous patch. Here’s v3.

D
 
 
 
Hi! Looks good to me, but I have several comments.
Your patch improves tests, but why did you change formatting in jsonpath_exec.c? What's the motivation?
 
[1] select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'strict $.*');
I propose adding a similar test with explicitly specified lax mode: select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'lax $.*'); to show what lax mode is set by default.
 
Odd empty result for the test: select jsonb '[1,2,3,{"b": [3,4,5]}]' @? 'strict $.*';
I expected an error like in test [1]. This behavior is not obvious to me.
 
Everything else is cool. Thanks to the patch and the discussion above, I began to understand better how wildcards in JSON work.

Best regards, Stepan Neretin.
 

Re: Patch bug: Fix jsonpath .* on Arrays

From
"David E. Wheeler"
Date:
On Jun 25, 2024, at 12:46 AM, Степан Неретин <fenixrnd@mail.ru> wrote:

> Hi! Looks good to me, but I have several comments.

Thanks for your review!

> Your patch improves tests, but why did you change formatting in jsonpath_exec.c? What's the motivation?

It’s not just formatting. From the commit message:

> While at it, teach `executeAnyItem()` to return `jperOk` when `found`
> exist, not because it will be used (the result and `found` are inspected
> by different functions), but because it seems like the proper thing to
> return from `executeAnyItem()` when considered in isolation.


I have since realized it’s not a complete fix for the issue, and hacked around it in my Go version. Would be fine to
removethat bit, but IIRC this was the only execution function that would return `jperNotFound` when it in fact adds
itemsto the `found` list. The current implementation only looks at one or the other, so it’s not super important, but I
foundthe inconsistency annoying and sometimes confusing. 

>  [1] select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'strict $.*');
> I propose adding a similar test with explicitly specified lax mode: select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]',
'lax$.*'); to show what lax mode is set by default. 

Very few of the other tests do so; I can add it if it’s important for this case, though.

> Odd empty result for the test: select jsonb '[1,2,3,{"b": [3,4,5]}]' @? 'strict $.*';
> I expected an error like in test [1]. This behavior is not obvious to me.

@? suppresses a number of errors. Perhaps I should add a variant of the error-raising query that passes the silent arg,
whichwould also suppress the error. 

> Everything else is cool. Thanks to the patch and the discussion above, I began to understand better how wildcards in
JSONwork. 

Yeah, it’s kind of wild TBH.

Best,

David




Re: Patch bug: Fix jsonpath .* on Arrays

From
"David E. Wheeler"
Date:
On Jun 25, 2024, at 13:48, David E. Wheeler <david@justatheory.com> wrote:

> I have since realized it’s not a complete fix for the issue, and hacked around it in my Go version. Would be fine to
removethat bit, but IIRC this was the only execution function that would return `jperNotFound` when it in fact adds
itemsto the `found` list. The current implementation only looks at one or the other, so it’s not super important, but I
foundthe inconsistency annoying and sometimes confusing. 

I’ve removed this change.

>> [1] select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'strict $.*');
>> I propose adding a similar test with explicitly specified lax mode: select jsonb_path_query('[1,2,3,{"b":
[3,4,5]}]','lax $.*'); to show what lax mode is set by default. 
>
> Very few of the other tests do so; I can add it if it’s important for this case, though.

Went ahead and added lax.

> @? suppresses a number of errors. Perhaps I should add a variant of the error-raising query that passes the silent
arg,which would also suppress the error. 

Added a variant where the silent param suppresses the error, too.

V2 attached and the PR updated:

  https://github.com/theory/postgres/pull/4/files

Best,

David




Attachment

Re: Patch bug: Fix jsonpath .* on Arrays

From
Stepan Neretin
Date:


On Thu, Jun 27, 2024 at 1:16 AM David E. Wheeler <david@justatheory.com> wrote:
On Jun 25, 2024, at 13:48, David E. Wheeler <david@justatheory.com> wrote:

> I have since realized it’s not a complete fix for the issue, and hacked around it in my Go version. Would be fine to remove that bit, but IIRC this was the only execution function that would return `jperNotFound` when it in fact adds items to the `found` list. The current implementation only looks at one or the other, so it’s not super important, but I found the inconsistency annoying and sometimes confusing.

I’ve removed this change.

>> [1] select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'strict $.*');
>> I propose adding a similar test with explicitly specified lax mode: select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'lax $.*'); to show what lax mode is set by default.
>
> Very few of the other tests do so; I can add it if it’s important for this case, though.

Went ahead and added lax.

> @? suppresses a number of errors. Perhaps I should add a variant of the error-raising query that passes the silent arg, which would also suppress the error.

Added a variant where the silent param suppresses the error, too.

V2 attached and the PR updated:

  https://github.com/theory/postgres/pull/4/files

Best,

David




HI! Now it looks good for me.
Best regards, Stepan Neretin.

Re: Patch bug: Fix jsonpath .* on Arrays

From
Michael Paquier
Date:
On Thu, Jun 27, 2024 at 11:53:14AM +0700, Stepan Neretin wrote:
> HI! Now it looks good for me.

The tests of jsonb_jsonpath.sql include a lot of patterns for @? and
jsonb_path_query with the lax and strict modes, so shouldn't these
additions be grouped closer to the existing tests rather than added at
the end of the file?
--
Michael

Attachment

Re: Patch bug: Fix jsonpath .* on Arrays

From
"David E. Wheeler"
Date:
On Jun 27, 2024, at 04:17, Michael Paquier <michael@paquier.xyz> wrote:

> The tests of jsonb_jsonpath.sql include a lot of patterns for @? and
> jsonb_path_query with the lax and strict modes, so shouldn't these
> additions be grouped closer to the existing tests rather than added at
> the end of the file?

I think you could argue that they should go with other tests for array unwrapping, though it’s kind of mixed
throughout.But that’s more the bit I was testing; almost all the tests are lax, and it’s less the strict behavior to
testhere than the explicit behavior of unwrapping in lax mode. 

But ultimately I don’t care where they go, just that we have them.

D




Re: Patch bug: Fix jsonpath .* on Arrays

From
"David E. Wheeler"
Date:
On Jun 27, 2024, at 04:17, Michael Paquier <michael@paquier.xyz> wrote:

> The tests of jsonb_jsonpath.sql include a lot of patterns for @? and
> jsonb_path_query with the lax and strict modes, so shouldn't these
> additions be grouped closer to the existing tests rather than added at
> the end of the file?

I’ve moved them closer to other tests for unwrapping behavior in the attached updated and rebased v3 patch.

Best,

David




Attachment

Re: Patch bug: Fix jsonpath .* on Arrays

From
Stepan Neretin
Date:


On Mon, Jul 8, 2024 at 11:09 PM David E. Wheeler <david@justatheory.com> wrote:
On Jun 27, 2024, at 04:17, Michael Paquier <michael@paquier.xyz> wrote:

> The tests of jsonb_jsonpath.sql include a lot of patterns for @? and
> jsonb_path_query with the lax and strict modes, so shouldn't these
> additions be grouped closer to the existing tests rather than added at
> the end of the file?

I’ve moved them closer to other tests for unwrapping behavior in the attached updated and rebased v3 patch.

Best,

David




Hi! Looks good to me now! Please, register a patch in CF.
Best regards, Stepan Neretin.

Re: Patch bug: Fix jsonpath .* on Arrays

From
"David E. Wheeler"
Date:
On Jul 15, 2024, at 07:07, Stepan Neretin <sncfmgg@gmail.com> wrote:

> Hi! Looks good to me now! Please, register a patch in CF.
> Best regards, Stepan Neretin.

It’s here:

  https://commitfest.postgresql.org/48/5017/

Best,

David




Re: Patch bug: Fix jsonpath .* on Arrays

From
Michael Paquier
Date:
On Mon, Jul 15, 2024 at 10:29:32AM -0400, David E. Wheeler wrote:
> It’s here:
>
>   https://commitfest.postgresql.org/48/5017/

Sorry for the delay.  Finally came back to it, and applied the tests.
Thanks!

It was fun to see that HEAD was silenced with the first patch of this
thread that tweaked the behavior with arrays.

Regarding the comments, I have left them out for now.  That may be a
good start, but it also feels like we should do a much better job
overall with the area of jsonpath_exec.c.  One thing that may help as
a start is to reorganize the routines of the file and divide them into
sub-categories, or even go through a deeper refactoring to help
readers go through the existing 4.5k lines of code that are in this
single file..
--
Michael

Attachment

Re: Patch bug: Fix jsonpath .* on Arrays

From
"David E. Wheeler"
Date:
On Jul 19, 2024, at 01:42, Michael Paquier <michael@paquier.xyz> wrote:

> Sorry for the delay.  Finally came back to it, and applied the tests.
> Thanks!

Awesome, thank you!

> It was fun to see that HEAD was silenced with the first patch of this
> thread that tweaked the behavior with arrays.

Uh, what? Sorry I don’t follow.

> Regarding the comments, I have left them out for now.  That may be a
> good start, but it also feels like we should do a much better job
> overall with the area of jsonpath_exec.c.

I put them in because it took me a bit to track down that they were among the implementors of JsonPathGetVarCallback as
Iwas porting the code. 

> One thing that may help as
> a start is to reorganize the routines of the file and divide them into
> sub-categories, or even go through a deeper refactoring to help
> readers go through the existing 4.5k lines of code that are in this
> single file..

After I got all the tests passing in the port to Go, I split it up into 14 implementation and 15 test files (with all
ofjsonb_jsonpath.(sql.out) in one file). Was much easier to reason about that way. 

https://github.com/theory/sqljson/tree/main/path/exec

Best,

David





Re: Patch bug: Fix jsonpath .* on Arrays

From
Michael Paquier
Date:
On Fri, Jul 19, 2024 at 09:49:50AM -0400, David E. Wheeler wrote:
>> It was fun to see that HEAD was silenced with the first patch of this
>> thread that tweaked the behavior with arrays.
>
> Uh, what? Sorry I don’t follow.

What I mean is that the main regression test suite did not complain on
your original patch posted here:
https://www.postgresql.org/message-id/A95346F9-6147-46E0-809E-532A485D71D6%40justatheory.com

But the new tests showed a difference, and that's enough for me to see
value in the new tests.
--
Michael

Attachment

Re: Patch bug: Fix jsonpath .* on Arrays

From
"David E. Wheeler"
Date:
On Jul 21, 2024, at 20:54, Michael Paquier <michael@paquier.xyz> wrote:

> What I mean is that the main regression test suite did not complain on
> your original patch posted here:
> https://www.postgresql.org/message-id/A95346F9-6147-46E0-809E-532A485D71D6%40justatheory.com
>
> But the new tests showed a difference, and that's enough for me to see
> value in the new tests.

Oh, got it, nice!

Thanks,

David