Re: Define jsonpath functions as stable - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Define jsonpath functions as stable
Date
Msg-id 10777.1568841140@sss.pgh.pa.us
In response to Re: Define jsonpath functions as stable  ("Jonathan S. Katz" <jkatz@postgresql.org>)
Responses Re: Define jsonpath functions as stable
List pgsql-hackers
"Jonathan S. Katz" <jkatz@postgresql.org> writes:
> On 9/17/19 6:40 PM, Tom Lane wrote:
>> After a re-read of the XQuery spec, it seems to me that the character
>> entry form that they have and we don't is actually "&#NNNN;" like
>> HTML, rather than just "#NN".  Can anyone double-check that?

> Clicking through the XQuery spec eventually got me to here[1] (which warns
> me that it's out of date, but that is what its "current" specs linked me
> to), which describes being able to use "&#[0-9]+;" and "&#[0-9a-fA-F]+;"
> to specify characters (which I recognize as a character escape from
> HTML, XML et al.).

After further reading, it seems like what that text is talking about
is not actually a regex feature, but an outgrowth of the fact that
the regex pattern is being expressed as a string literal in a language
for which XML character entities are a native aspect of the string
literal syntax.  So it looks to me like the entities get folded to
raw characters in a string-literal parser before the regex engine
ever sees them.
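
To illustrate the two-stage processing described above, here is a small
Python sketch.  It is only an analogy: html.unescape stands in for an
XML-aware string-literal parser and Python's re module stands in for the
regex engine.  Neither is what XQuery or PostgreSQL actually uses, and the
sample pattern is made up for illustration.

```python
import html
import re

# Pattern text as it might be written in an XML-aware string literal
# (hypothetical example: "A" and "B" spelled as character references).
literal_text = "^&#65;&#x42;+$"

# Stage 1: the string-literal layer folds character references into raw characters.
pattern = html.unescape(literal_text)

# Stage 2: the regex engine only ever sees the folded text.
matches_folded = re.search(pattern, "ABBB") is not None

# If the unfolded text were handed to the regex engine directly, "&#65;"
# would just be ordinary characters to match literally.
matches_raw = re.search(literal_text, "ABBB") is not None
```

The point of the sketch is that the regex engine needs no knowledge of
entities at all; whether they work depends entirely on the literal syntax
of the host language.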

As such, I think this doesn't apply to SQL/JSON.  The SQL/JSON spec
seems to defer to JavaScript/ECMAScript for syntax details, and
in either of those languages you have backslash escape sequences
for writing weird characters, *not* XML entities.  You certainly
wouldn't have use of such entities in a native implementation of
LIKE_REGEX in SQL.

So now I'm thinking we can just remove the handwaving about entities.
On the other hand, this points up a large gap in our docs about
SQL/JSON, which is that nowhere do they even address the question of
what the string literal syntax is within a path expression, much
less point out that that syntax is nothing like native SQL strings.
Good luck finding out from the docs that you'd better double any
backslashes you'd like to have in your regex --- but a moment's
testing proves that that is the case in our code as it stands.
Have we misread the spec badly enough to get this wrong?
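
A rough Python sketch of why the doubling is needed when a string-literal
layer with backslash escapes sits in front of the regex engine.  This is
an assumption-laden stand-in: json.loads is used as a JavaScript-like
string-literal decoder and re as the regex engine; it is not PostgreSQL's
jsonpath lexer, and decode_literal is a hypothetical helper name.

```python
import json
import re


def decode_literal(quoted: str) -> str:
    """Decode a double-quoted literal using JSON string rules (stand-in lexer)."""
    return json.loads(quoted)


# What the user types inside the path expression, quotes included.
doubled = r'"^\\d+$"'                # two backslashes in the source text
pattern = decode_literal(doubled)    # one backslash survives to the regex engine
is_digits = re.search(pattern, "12345") is not None

# With a single backslash, this stand-in decoder rejects the literal outright,
# because \d is not an escape it recognizes.  Other lexers may behave differently.
single = r'"^\d+$"'
try:
    decode_literal(single)
    single_accepted = True
except json.JSONDecodeError:
    single_accepted = False
```

Either way, the backslash is consumed or judged by the literal syntax
first, so the regex engine never sees what was typed verbatim.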

            regards, tom lane


