remaining sql/json patches - Mailing list pgsql-hackers

From Amit Langote
Subject remaining sql/json patches
Date
Msg-id CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
Whole thread Raw
Responses Re: remaining sql/json patches
List pgsql-hackers
Hello.

I'm starting a new thread for $subject per Alvaro's suggestion at [1].

So the following sql/json things still remain to be done:

* sql/json query functions:
    json_exists()
    json_query()
    json_value()

* other sql/json functions:
    json()
    json_scalar()
    json_serialize()

* finally:
    json_table

Attached is the rebased patch for the 1st part.

It also addresses Alvaro's review comments on Apr 4, though see my
comments below.

On Tue, Apr 4, 2023 at 9:36 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> On 2023-Apr-04, Amit Langote wrote:
> > On Tue, Apr 4, 2023 at 2:16 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > > - the gram.y solution to the "ON ERROR/ON EMPTY" clauses is quite ugly.
> > >   I think we could make that stuff use something similar to
> > >   ConstraintAttributeSpec with an accompanying post-processing function.
> > >   That would reduce the number of ad-hoc hacks, which seem excessive.
>>
> > Do you mean the solution involving the JsonBehavior node?
>
> Right.  It has spilled as the separate on_behavior struct in the core
> parser %union in addition to the raw jsbehavior, which is something
> we've gone 30 years without having, and I don't see why we should start
> now.

I looked into trying to make this look like ConstraintAttributeSpec
but came to the conclusion that that's not quite doable in this case.
A "behavior" cannot be represented simply as an integer flag, because
there's `DEFAULT a_expr` to fit in, so it's got to be this
JsonBehavior node.  However...

> This stuff is terrible:
>
> json_exists_error_clause_opt:
>             json_exists_error_behavior ON ERROR_P       { $$ = $1; }
>             | /* EMPTY */                               { $$ = NULL; }
>         ;
>
> json_exists_error_behavior:
>             ERROR_P     { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, NULL); }
>             | TRUE_P        { $$ = makeJsonBehavior(JSON_BEHAVIOR_TRUE, NULL); }
>             | FALSE_P       { $$ = makeJsonBehavior(JSON_BEHAVIOR_FALSE, NULL); }
>             | UNKNOWN       { $$ = makeJsonBehavior(JSON_BEHAVIOR_UNKNOWN, NULL); }
>         ;
>
> json_value_behavior:
>             NULL_P      { $$ = makeJsonBehavior(JSON_BEHAVIOR_NULL, NULL); }
>             | ERROR_P       { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, NULL); }
>             | DEFAULT a_expr    { $$ = makeJsonBehavior(JSON_BEHAVIOR_DEFAULT, $2); }
>         ;
>
> json_value_on_behavior_clause_opt:
>             json_value_behavior ON EMPTY_P
>                                     { $$.on_empty = $1; $$.on_error = NULL; }
>             | json_value_behavior ON EMPTY_P json_value_behavior ON ERROR_P
>                                     { $$.on_empty = $1; $$.on_error = $4; }
>             | json_value_behavior ON ERROR_P
>                                     { $$.on_empty = NULL; $$.on_error = $1; }
>             |  /* EMPTY */
>                                     { $$.on_empty = NULL; $$.on_error = NULL; }
>         ;
>
> json_query_behavior:
>             ERROR_P     { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, NULL); }
>             | NULL_P        { $$ = makeJsonBehavior(JSON_BEHAVIOR_NULL, NULL); }
>             | EMPTY_P ARRAY { $$ = makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); }
>             /* non-standard, for Oracle compatibility only */
>             | EMPTY_P       { $$ = makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); }
>             | EMPTY_P OBJECT_P  { $$ = makeJsonBehavior(JSON_BEHAVIOR_EMPTY_OBJECT, NULL); }
>             | DEFAULT a_expr    { $$ = makeJsonBehavior(JSON_BEHAVIOR_DEFAULT, $2); }
>         ;
>
> json_query_on_behavior_clause_opt:
>             json_query_behavior ON EMPTY_P
>                                     { $$.on_empty = $1; $$.on_error = NULL; }
>             | json_query_behavior ON EMPTY_P json_query_behavior ON ERROR_P
>                                     { $$.on_empty = $1; $$.on_error = $4; }
>             | json_query_behavior ON ERROR_P
>                                     { $$.on_empty = NULL; $$.on_error = $1; }
>             |  /* EMPTY */
>                                     { $$.on_empty = NULL; $$.on_error = NULL; }
>         ;
>
> Surely this can be made cleaner.

...I've managed to reduce the above down to:

    MergeWhenClause *mergewhen;
    struct KeyActions *keyactions;
    struct KeyAction *keyaction;
+   JsonBehavior *jsbehavior;
...
+%type <jsbehavior> json_value_behavior
+                   json_query_behavior
+                   json_exists_behavior
...
+json_query_behavior:
+           ERROR_P     { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, NULL); }
+           | NULL_P        { $$ = makeJsonBehavior(JSON_BEHAVIOR_NULL, NULL); }
+           | DEFAULT a_expr    { $$ =
makeJsonBehavior(JSON_BEHAVIOR_DEFAULT, $2); }
+           | EMPTY_P ARRAY { $$ =
makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); }
+           | EMPTY_P OBJECT_P  { $$ =
makeJsonBehavior(JSON_BEHAVIOR_EMPTY_OBJECT, NULL); }
+           /* non-standard, for Oracle compatibility only */
+           | EMPTY_P       { $$ =
makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); }
+       ;
+
+json_exists_behavior:
+           ERROR_P     { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, NULL); }
+           | TRUE_P        { $$ = makeJsonBehavior(JSON_BEHAVIOR_TRUE, NULL); }
+           | FALSE_P       { $$ =
makeJsonBehavior(JSON_BEHAVIOR_FALSE, NULL); }
+           | UNKNOWN       { $$ =
makeJsonBehavior(JSON_BEHAVIOR_UNKNOWN, NULL); }
+       ;
+
+json_value_behavior:
+           NULL_P      { $$ = makeJsonBehavior(JSON_BEHAVIOR_NULL, NULL); }
+           | ERROR_P       { $$ =
makeJsonBehavior(JSON_BEHAVIOR_ERROR, NULL); }
+           | DEFAULT a_expr    { $$ =
makeJsonBehavior(JSON_BEHAVIOR_DEFAULT, $2); }
+       ;

Though, that does mean that there are now more rules for
func_expr_common_subexpr to implement the variations of ON ERROR, ON
EMPTY clauses for each of JSON_EXISTS, JSON_QUERY, and JSON_VALUE.

> By the way -- that comment about clauses being non-standard, can you
> spot exactly *which* clauses that comment applies to?

I've moved comment as shown above to make clear that a bare EMPTY_P is
needed for Oracle compatibility

On Tue, Apr 4, 2023 at 2:16 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> - the changes in formatting.h have no explanation whatsoever.  At the
>   very least, the new function should have a comment in the .c file.
>   (And why is it at end of file?  I bet there's a better location)

Apparently, the newly exported routine is needed in the JSON-specific
subroutine for the planner's contain_mutable_functions_walker(), to
check if a JsonExpr's path_spec contains any timezone-dependent
constant.  In the attached, I've changed the newly exported function's
name as follows:

datetime_format_flags -> datetime_format_has_tz

which let me do away with exporting those DCH_* constants in formatting.h.

> - some nasty hacks are being used in the ECPG grammar with no tests at
>   all.  It's easy to add a few lines to the .pgc file I added in prior
>   commits.

Ah, those ecpg.trailer changes weren't in the original commit that
added added SQL/JSON query functions (1a36bc9dba8ea), but came in
5f0adec2537d, 83f1c7b742e8 to fix some damage caused by the former's
making STRING a keyword.  If I don't include the ecpg.trailer bit,
test_informix.pgc fails, so I think the change is already covered.

> - Some functions in jsonfuncs.c have changed from throwing hard errors
>   into soft ones.  I think this deserves more commentary.

I've merged the delta patch I had posted earlier addressing this [2]
into the attached.

> - func.sgml: The new functions are documented in a separate table for no
>   reason that I can see.  Needs to be merged into one of the existing
>   tables.  I didn't actually review the docs.

Hmm, so we already have "SQL/JSON Testing Functions" that were
committed into v16 in a separate table (Table 9.48) under "9.16.1.
Processing and Creating JSON Data".  So, I don't see a problem with
adding "SQL/JSON Query Functions" in a separate table, though maybe it
should not be under the same sub-section.  Maybe under "9.16.2. The
SQL/JSON Path Language" is more appropriate?

I'll rebase and post the patches for "other sql/json functions" and
"json_table" shortly.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/20230503181732.26hx5ihbdkmzhlyw%40alvherre.pgsql
[2] https://www.postgresql.org/message-id/CA%2BHiwqHGghuFpxE%3DpfUFPT%2BZzKvHWSN4BcrWr%3DZRjd4i4qubfQ%40mail.gmail.com

Attachment

pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Initial Schema Sync for Logical Replication
Next
From: Amit Langote
Date:
Subject: Re: SQL/JSON revisited