I looked a bit at the parser additions, because there were some concerns
expressed that they are quite big.
It looks like the parser rules were mostly literally copied from the BNF
in the SQL standard. That's probably a reasonable place to start, but
now at the end, there is some room for simplification.
Attached are a few patches that apply on top of the 0003 patch. (I
haven't gotten to 0004 in detail yet.) Some explanations:
0001-Put-keywords-in-right-order.patch
This is just an unrelated cleanup.
0002-Remove-js_quotes-union-entry.patch
We usually don't want to put every single node type into the gram.y
%union. This one can be trivially removed.
0003-Move-some-code-from-gram.y-to-parse-analysis.patch
Code like this can be postponed to parse analysis, keeping gram.y
smaller. The error pointer loses a bit of precision, but I think that's
ok. (There is similar code in your 0004 patch, which could be similarly
moved.)
0004-Remove-JsonBehavior-stuff-from-union.patch
Similar to my 0002. This adds a few casts as a result, but that is the
typical style in gram.y.
0005-Get-rid-of-JsonBehaviorClause.patch
I think this two-level wrapping of the behavior clauses is both
confusing and overkill. I was trying to just list the on-empty and
on-error clauses separately in the top-level productions (JSON_VALUE
etc.), but that led to shift/reduce errors. So the existing rule
structure is probably ok. But we don't need a separate node type just
to combine two values and then unpack them again shortly thereafter. So
I just replaced all this with a list.
0006-Get-rid-of-JsonCommon.patch
This is an example where the SQL standard BNF is not sensible to apply
literally. I moved those clauses up directly into their callers, thus
removing one intermediate levels of rules and also nodes. Also, the
path name (AS name) stuff is only for JSON_TABLE, so it's not needed in
this patch. I removed it here, but it would have to be readded in your
0004 patch.
Another thing: In your patch, JSON_EXISTS has a RETURNING clause
(json_returning_clause_opt), but I don't see that in the standard, and
also not in the Oracle or Db2 docs. Where did this come from?
With these changes, I think the grammar complexity in your 0003 patch is
at an acceptable level. Similar simplification opportunities exist in
the 0004 patch, but I haven't worked on that yet. I suggest that you
focus on getting 0001..0003 committed around this commit fest and then
deal with 0004 in the next one. (Also split up the 0005 patch into the
pieces that apply to 0003 and 0004, respectively.)