Re: JSON Path and GIN Questions - Mailing list pgsql-hackers

From Erik Wienhold
Subject Re: JSON Path and GIN Questions
Date
Msg-id 1093370370.83403.1694902436862@office.mailbox.org
Whole thread Raw
In response to Re: JSON Path and GIN Questions  ("David E. Wheeler" <david@justatheory.com>)
Responses Re: JSON Path and GIN Questions
List pgsql-hackers
On 16/09/2023 22:26 CEST David E. Wheeler <david@justatheory.com> wrote:

> I’ve started work on this; there’s so much to learn! Here’s a new example
> that surprised me a bit. Using the GPS tracker example from the docs [1]
> loaded into a `:json` psql variable, this output of this query makes perfect
> sense to me:
>
> david=# select jsonb_path_query(:'json', '$.track.segments.location[*] ? (@ < 14)');
>  jsonb_path_query
> ------------------
>  13.4034
>  13.2635
>
> Because `[*]` selects all the values. This, however, I did not expect:
>
> david=# select jsonb_path_query(:'json', '$.track.segments.location ? (@[*] < 14)');
>  jsonb_path_query
> ------------------
>  13.4034
>  13.2635
> (2 rows)
>
> I had expected it to return two single-value arrays, instead:
>
>  [13.4034]
>  [13.2635]
>
> It appears that the filter expression is doing some sub-selection, too.
> Is that expected?

Looks like the effect of lax mode which may unwrap arrays when necessary [1].
The array unwrapping looks like the result of jsonb_array_elements().

It kinda works in strict mode:

    SELECT jsonb_path_query(:'json', 'strict $.track.segments[*].location ? (@[*] < 14)');

       jsonb_path_query
    -----------------------
     [47.763, 13.4034]
     [47.706, 13.2635]
    (2 rows)

But it does not remove elements from the matching arrays.  Which I don't even
expect here because the path specifies the location array as the object to be
returned.  The filter expression then only decides whether to return the
location array or not.  Nowhere in the docs does it say that the filter
expression itself removes any elements from a matched array.

Here's a query that filter's out individual array elements.  It's quite a
mouthful (especially to preserve the order of array elements):

    WITH location AS (
      SELECT loc, row_number() OVER () AS array_num
      FROM jsonb_path_query(:'json', 'strict $.track.segments[*].location') loc
    ),
    element AS (
      SELECT array_num, e.num AS elem_num, e.elem
      FROM location
        CROSS JOIN jsonb_array_elements(loc) WITH ORDINALITY AS e (elem, num)
    )
    SELECT jsonb_agg(elem ORDER BY elem_num)
    FROM element
    WHERE jsonb_path_exists(elem, '$ ? (@ < 14)')
    GROUP BY array_num;

       jsonb_agg
    ---------------
     [13.2635]
     [13.4034]
    (2 rows)

[1] https://www.postgresql.org/docs/current/functions-json.html#STRICT-AND-LAX-MODES

--
Erik



pgsql-hackers by date:

Previous
From: "David E. Wheeler"
Date:
Subject: Re: JSON Path and GIN Questions
Next
From: "David E. Wheeler"
Date:
Subject: Re: JSON Path and GIN Questions