Re: Q: Escapes in jsonpath Idents - Mailing list pgsql-hackers

From Erik Wienhold
Subject Re: Q: Escapes in jsonpath Idents
Date
Msg-id kayzzqmh2com3uokglxvpnsikykqi64c727tdhmigw3j7uotxd@mbvdxkudjz5o
Whole thread Raw
In response to Re: Q: Escapes in jsonpath Idents  ("David E. Wheeler" <david@justatheory.com>)
Responses Re: Q: Escapes in jsonpath Idents
List pgsql-hackers
On 2024-04-24 13:52 +0200, David E. Wheeler wrote:
> On Apr 24, 2024, at 05:51, Peter Eisentraut <peter@eisentraut.org> wrote:
> 
> >    A <JSON path identifier> is classified as follows.
> > 
> >    Case:
> > 
> >    a) A <JSON path identifier> that is a <dollar sign> is a <JSON
> >       path context variable>.
> > 
> >    b) A <JSON path identifier> that begins with <dollar sign> is a
> >       <JSON path named variable>.
> > 
> >    c) Otherwise, a <JSON path identifier> is a <JSON path key name>.
> > 
> > Does this help?  I wasn't following all the discussion to see if
> > there is anything wrong with the implementation.

Thanks Peter!  But what is the definition of the entire path expression?
Perhaps something like:

    <JSON path> ::= <JSON path identifier> { "." <JSON path identifier> }

That would imply that "$.$foo" is a valid path that accesses a variable
member (but I guess the path evaluation is also specified somewhere).

Does it say anything about double-quoted accessors?  In table 8.25[1] we
allow member accessor ."$varname" and it says "If the key name matches
some named variable starting with $ or does not meet the JavaScript
rules for an identifier, it must be enclosed in double quotes to make it
a string literal."

What bugs me about this description, after reading it a couple of times,
is that it's not clear what is meant by ."$varname".  It could mean two
things: (1) the double-quoting masks $varname in order to not interpret
those characters as a variable or (2) an interpolated string that
resolves $varname and yields a dynamic member accessor.

The current implementation supports (1), i.e., ."$foo" does not refer to
variable foo but the actual property "$foo":

    => select jsonb_path_query('{"$foo":123,"bar":456}', '$."$foo"', '{"foo":"bar"}');
     jsonb_path_query
    ------------------
     123
    (1 row)

Under case (2) I'd expect that query to return 456 (because $foo
resolves to "bar").  (Similar to how psql would resolve :'foo' to
'bar'.)

Variables already work in array accessors and table 8.25 says that "The
specified index can be an integer, as well as an expression returning a
single numeric value [...]".  A variable is such an expression.

    => select jsonb_path_query('[2,3,5]', '$[$i]', '{"i":1}');
     jsonb_path_query
    ------------------
     3
    (1 row)

So I'd expect a similar behavior for member accessors as well when
seeing ."$varname" in the same table.

> Yes, it does, as it ties the special meaning of the dollar sign to the
> *beginning* of an expression. So it makes sense that this would be an
> error:
> 
> david=# select '$.$foo'::jsonpath;
> ERROR: syntax error at or near "$foo" of jsonpath input
> LINE 1: select '$.$foo'::jsonpath;
>                ^
> But I’m less sure when a dollar sign is used in the *middle* (or end)
> of a json path identifier:
> 
> david=# select '$.xx$foo'::jsonpath;
> ERROR:  syntax error at or near "$foo" of jsonpath input
> LINE 1: select '$.xx$foo'::jsonpath;
>                ^
> Perhaps that should be valid?

Yes, I think so.  That would be case C in the spec excerpt provided by
Peter.  So it's just a key name that happens to contain (but not start
with) the dollar sign.

[1] https://www.postgresql.org/docs/current/datatype-json.html#TYPE-JSONPATH-ACCESSORS

-- 
Erik



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: some additional (small) problems with pg_combinebackup and tablespaces
Next
From: Bruce Momjian
Date:
Subject: Re: Statistics Import and Export