Re: ❓ JSON Path Dot Precedence - Mailing list pgsql-hackers

From Erik Wienhold
Subject Re: ❓ JSON Path Dot Precedence
Date
Msg-id o2f7ccz6dywd7tqgbighi2t3waptrnklxzv54ntiq4l3kh4xpu@gwfzpeql2rb3
Whole thread Raw
In response to [MASSMAIL]❓ JSON Path Dot Precedence  ("David E. Wheeler" <david@justatheory.com>)
Responses Re: ❓ JSON Path Dot Precedence
List pgsql-hackers
On 2024-04-07 18:13 +0200, David E. Wheeler wrote:
> A question about the behavior of the JSON Path parser. The docs[1]
> have this to say about numbers:
> 
> >  Numeric literals in SQL/JSON path expressions follow JavaScript
> >  rules, which are different from both SQL and JSON in some minor
> >  details. For example, SQL/JSON path allows .1 and 1., which are
> >  invalid in JSON.
> 
> In other words, this is valid:
> 
> david=# select '2.'::jsonpath;
>  jsonpath 
> ----------
>  2
> 
> But this feature creates a bit of a conflict with the use of a dot for
> path expressions. Consider `0x2.p10`. How should that be parsed? As an
> invalid decimal expression ("trailing junk after numeric literal”), or
> as a valid integer 2 followed by the path segment “p10”? Here’s the
> parser’s answer:
> 
> david=# select '0x2.p10'::jsonpath;
>  jsonpath  
> -----------
>  (2)."p10"
> 
> So it would seem that, other things being equal, a path key expression
> (`.foo`) is slightly higher precedence than a decimal expression. Is
> that intentional/correct?

I guess jsonpath assumes that hex, octal, and binary literals are
integers.  So there's no ambiguity about any fractional part that might
follow.

> Discovered while writing my Go lexer and throwing all of Go’s floating
> point literal examples[2] at it and comparing to the Postgres path
> parser. Curiously, this is only an issue for 0x/0o/0b numeric
> expressions; a decimal expression does not behave in the same way:
> 
> david=# select '2.p10'::jsonpath;
> ERROR:  trailing junk after numeric literal at or near "2.p" of jsonpath input
> LINE 1: select '2.p10'::jsonpath;

It scans the decimal "2." and then finds junks "p10".

Works with a full decimal:

    test=# select '3.14.p10'::jsonpath;
       jsonpath
    --------------
     (3.14)."p10"
    (1 row)

And with extra whitespace to resolve the ambiguity:

    test=# select '2 .p10'::jsonpath;
     jsonpath
    -----------
     (2)."p10"
    (1 row)

> Which maybe seems a bit inconsistent.
> 
> Thoughts on what the “correct” behavior should be?

I'd say a member accessor after a number doesn't really make sense
because object keys are strings.  One could argue that path "$.2.p10"
should match JSON '{"2":{"p10":42}}', i.e. the numeric accessor is
converted to a string.  For example, in nodejs I can do:

    > var x = {2: {p10: 42}}
    > x[2].p10
    42

But that's JavaScript, not JSON.

Also, is there even a use case for path "0x2.p10"?  The path has to
start with "$" or ("@" in case of a filter expression), doesn't it?  And
it that case it doesn't parse:

    test=# select '$.0x2.p10'::jsonpath;
    ERROR:  trailing junk after numeric literal at or near ".0x" of jsonpath input
    LINE 1: select '$.0x2.p10'::jsonpath;

Even with extra whitespace:

    test=# select '$ . 0x2 . p10'::jsonpath;
    ERROR:  syntax error at or near "0x2" of jsonpath input
    LINE 1: select '$ . 0x2 . p10'::jsonpath;

Or should it behave like an array accessor?  Similar to:

    test=# select jsonb_path_query('[0,1,{"p10":42},3]', '$[0x2].p10'::jsonpath);
     jsonb_path_query
    ------------------
     42
    (1 row)

>   [1]: https://www.postgresql.org/docs/devel/datatype-json.html#DATATYPE-JSONPATH
>   [2]: https://tip.golang.org/ref/spec#Floating-point_literals

-- 
Erik



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: EXCLUDE COLLATE in CREATE/ALTER TABLE document
Next
From: Jeff Davis
Date:
Subject: Re: LogwrtResult contended spinlock