Avoiding possible future conformance headaches in JSON work - Mailing list pgsql-hackers
From | Chapman Flack |
---|---|
Subject | Avoiding possible future conformance headaches in JSON work |
Date | |
Msg-id | 5CF28EA0.80902@anastigmatix.net |
Responses |
Re: Avoiding possible future conformance headaches in JSON work
Re: Avoiding possible future conformance headaches in JSON work |
List | pgsql-hackers |
Hi,

We had a short conversation about this on Friday but I didn't have time to think of a constructive suggestion, and now I've had more time to think about it.

Regarding the proposed PG 13 jsonpath extensions (array, map, and sequence constructors, lambdas, map/fold/reduce, user-defined functions), literally all this stuff is in XPath/XQuery 3.1, and clearly the SQL committee is imitating XPath/XQuery in the design of jsonpath.

Therefore it would not be surprising at all if the committee eventually adds those features in jsonpath. At that point, if the syntax matches what we've added, we are happy, and if not, we have a multi-year, multi-release, standard_conforming_strings-style headache.

So, a few ideas fall out....

First, with Peter being a participant, if there are any rumblings in the SQL committee about adding those features, we should know the proposed syntax as soon as we can and try to follow that.

If such rumblings are entirely absent, we should see what we can do to start some, proposing the syntax we've got.

In either case, perhaps we should immediately add a way to identify a jsonpath as being PostgreSQL-extended. Maybe a keyword 'pg' that can be accepted at the start in addition to any lax/strict, so you could have 'pg lax $.map(x => x + 10)'.

If we initially /require/ 'pg' for the extensions to be recognized, then we can relax the requirement for whichever ones later appear in the spec using the same syntax. If they appear in the spec with a different syntax, then by requiring 'pg' already for our variant, we already have avoided the standard_conforming_strings kind of multi-release reconciliation effort.

In the near term, there is already one such potential conflict in 12beta: the like_regex using POSIX REs instead of XQuery ones as the spec requires. Of course we don't currently have an XQuery regex engine, but if we ever have one, we then face a headache if we want to move jsonpath toward using it. (Ties in to conversation [1].)

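To make the 'pg' prefix idea above a bit more concrete, here is a minimal sketch in Python. It is not PostgreSQL code and not a proposed implementation: the names split_prefix and check_extensions are hypothetical, and the uses_extension callback stands in for whatever the real parser would use to notice extension syntax. It assumes only what is stated above, namely that 'pg' may precede an optional lax/strict, and that lax is the default mode.

```python
import re

# Hypothetical leading keywords: 'pg' is the proposed opt-in marker,
# 'lax' and 'strict' are the standard jsonpath modes.
_PREFIX = re.compile(r"\s*(?:(pg)\s+)?(?:(lax|strict)\s+)?")


def split_prefix(path):
    """Return (extended, mode, body) for a jsonpath string."""
    m = _PREFIX.match(path)  # always matches, possibly the empty string
    extended = m.group(1) is not None
    mode = m.group(2) or "lax"  # lax is the default mode
    return extended, mode, path[m.end():]


def check_extensions(path, uses_extension):
    """Reject extension syntax unless the path opted in with 'pg'.

    uses_extension is a caller-supplied predicate on the path body;
    it is a placeholder for real grammar-level detection.
    """
    extended, mode, body = split_prefix(path)
    if uses_extension(body) and not extended:
        raise ValueError("jsonpath extension used without the 'pg' prefix")
    return extended, mode, body


# Crude stand-in: treat a lambda arrow as the only extension.
has_lambda = lambda body: "=>" in body

print(check_extensions("pg lax $.map(x => x + 10)", has_lambda))
print(check_extensions("strict $.a", has_lambda))
```

Relaxing the requirement later, for an extension the spec adopts with the same syntax, would then amount to removing that construct from what uses_extension reports.
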
Maybe we could avoid that like_regex headache by recognizing now an extra P in flags, to specify a POSIX RE. Or, as like_regex has a named-parameter-like syntax--like_regex("abc" flag "i")--perhaps 'posix' should just be an extra keyword in that grammar: like_regex("abc" posix). That would be safe from the committee adding a P flag that means something else.

The conservative approach would be to simply require the 'posix' keyword in all cases now, simply because we don't have the XQuery regex engine.

Alternatively, if there's a way to analyze a regex for the use of any constructs with different meanings in POSIX and XQuery REs (and if that's appreciably easier than writing an XQuery regex engine), then the 'posix' keyword could be required only when it matters.

But the conservative approach sounds easier, and sufficient. The finer-grained analysis would have to catch not just constructs that are in one RE style and not the other, but any subtleties in semantics, and I certainly wouldn't trust myself to write that.

-Chap

[1] https://www.postgresql.org/message-id/5CF2754F.7000702%40anastigmatix.net