Re: Avoiding possible future conformance headaches in JSON work - Mailing list pgsql-hackers

From Oleg Bartunov
Subject Re: Avoiding possible future conformance headaches in JSON work
Date
Msg-id CAF4Au4yQudrY31HtEnUHAcW6rRvZfut3gzzSK6=M9cT_PyeRzQ@mail.gmail.com
In response to Avoiding possible future conformance headaches in JSON work  (Chapman Flack <chap@anastigmatix.net>)
Responses Re: Avoiding possible future conformance headaches in JSON work
List pgsql-hackers


On Sat, 1 Jun 2019, 16:41 Chapman Flack, <chap@anastigmatix.net> wrote:
Hi,

We had a short conversation about this on Friday but I didn't have time
to think of a constructive suggestion, and now I've had more time to
think about it.

Regarding the proposed PG 13 jsonpath extensions (array, map, and
sequence constructors, lambdas, map/fold/reduce, user-defined
functions), literally all this stuff is in XPath/XQuery 3.1, and
clearly the SQL committee is imitating XPath/XQuery in the design
of jsonpath.

Therefore it would not be surprising at all if the committee eventually
adds those features in jsonpath. At that point, if the syntax matches
what we've added, we are happy, and if not, we have a multi-year,
multi-release, standard_conforming_strings-style headache.

So, a few ideas fall out....

First, with Peter being a participant, if there are any rumblings in the
SQL committee about adding those features, we should know the proposed
syntax as soon as we can and try to follow that.

AFAIK, there are rumours about a 'native json data type' and a 'dot-style syntax' for json, but none about jsonpath.


If such rumblings are entirely absent, we should see what we can do to
start some, proposing the syntax we've got.

In either case, perhaps we should immediately add a way to identify a
jsonpath as being PostgreSQL-extended. Maybe a keyword 'pg' that can
be accepted at the start in addition to any lax/strict, so you could
have 'pg lax $.map(x => x + 10)'.

This is exactly what we were thinking about!

If we initially /require/ 'pg' for the extensions to be recognized, then
we can relax the requirement for whichever ones later appear in the spec
using the same syntax. If they appear in the spec with a different
syntax, then by requiring 'pg' already for our variant, we already have
avoided the standard_conforming_strings kind of multi-release
reconciliation effort.
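
To make the opt-in idea concrete, here is a minimal sketch in Python rather than in the actual jsonpath grammar. It assumes a prefix of the form [pg] [lax | strict] and uses the presence of a lambda arrow as a stand-in for "this path uses an extension"; the function names parse_prefix and check are hypothetical and are not part of any PostgreSQL API.

```python
import re

# Hypothetical prefix grammar: [pg] [lax | strict] <path expression>
_PREFIX = re.compile(r"\s*(?:(pg)\b\s*)?(?:(lax|strict)\b\s*)?(.*)", re.DOTALL)


def parse_prefix(path):
    """Split a jsonpath string into (extended, mode, body)."""
    pg, mode, body = _PREFIX.match(path).groups()
    # When no mode keyword is given, lax is the default.
    return pg is not None, mode or "lax", body


def check(path):
    """Reject extension syntax unless the path opted in with 'pg'."""
    extended, mode, body = parse_prefix(path)
    # Assumption: a lambda arrow stands in for a real grammar-level check.
    uses_extension = "=>" in body
    if uses_extension and not extended:
        raise ValueError("extension syntax requires the 'pg' prefix")
    return extended, mode, body
```

If a construct later appears in the standard with the same syntax, only the uses_extension test has to stop flagging it; paths already written with 'pg' keep working unchanged.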

In the near term, there is already one such potential conflict in
12beta: the like_regex using POSIX REs instead of XQuery ones as the
spec requires. Of course we don't currently have an XQuery regex
engine, but if we ever have one, we then face a headache if we want to
move jsonpath toward using it. (Ties in to conversation [1].)

Maybe we could avoid that by recognizing now an extra P in flags, to
specify a POSIX RE. Or, as like_regex has a named-parameter-like
syntax--like_regex("abc" flag "i")--perhaps 'posix' should just be
an extra keyword in that grammar: like_regex("abc" posix). That would
be safe from the committee adding a P flag that means something else.

The conservative approach would be to require the 'posix' keyword
in all cases now, simply because we don't have an XQuery regex engine.
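
As a rough illustration of that conservative rule, the following Python sketch parses the like_regex spelling used in this message and refuses any pattern that does not carry the 'posix' keyword. The argument grammar, the parse_like_regex name, and the xquery_engine_available switch are all assumptions made for the example; none of them exists in PostgreSQL.

```python
import re

# Hypothetical argument grammar, following the spelling used in this message:
#   like_regex("pattern" [flag "flags"] [posix])
_ARGS = re.compile(
    r'like_regex\(\s*"((?:[^"\\]|\\.)*)"'   # the pattern literal
    r'(?:\s+flag\s+"([a-z]*)")?'            # optional flag string
    r'(?:\s+(posix))?\s*\)'                 # optional dialect keyword
)


def parse_like_regex(expr, xquery_engine_available=False):
    """Return (pattern, flags, dialect) or raise if the dialect is unsupported."""
    m = _ARGS.fullmatch(expr.strip())
    if m is None:
        raise ValueError("not a like_regex predicate")
    pattern, flags, posix = m.groups()
    # Without the keyword, the spec-mandated XQuery dialect is implied.
    dialect = "posix" if posix else "xquery"
    if dialect == "xquery" and not xquery_engine_available:
        raise ValueError("only POSIX regexes are supported; add the 'posix' keyword")
    return pattern, flags or "", dialect
```

The point of the switch is that an XQuery engine could be added later without changing the meaning of any pattern already marked 'posix'.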

Alternatively, if there's a way to analyze a regex for the use of any
constructs with different meanings in POSIX and XQuery REs (and if
that's appreciably easier than writing an XQuery regex engine), then
the 'posix' keyword could be required only when it matters. But the
conservative approach sounds easier, and sufficient. The finer-grained
analysis would have to catch not just constructs that are in one RE
style and not the other, but any subtleties in semantics, and I
certainly wouldn't trust myself to write that.

We didn't think about regex, and I don't know of anybody working on XQuery.
