Thread: Re: pgsql: Add more SQL/JSON constructor functions

Re: pgsql: Add more SQL/JSON constructor functions

From
Alvaro Herrera
Date:
On 2024-May-27, Alvaro Herrera wrote:

> > JSON_SERIALIZE()

I just noticed this behavior, which looks like a bug to me:

select json_serialize('{"a":1, "a":2}' returning varchar(5));
 json_serialize 
────────────────
 {"a":

I think this function should throw an error if the destination type
doesn't have room for the output json.  Otherwise, what good is the
serialization function?

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: pgsql: Add more SQL/JSON constructor functions

From
"David G. Johnston"
Date:
On Monday, May 27, 2024, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2024-May-27, Alvaro Herrera wrote:

> > JSON_SERIALIZE()

I just noticed this behavior, which looks like a bug to me:

select json_serialize('{"a":1, "a":2}' returning varchar(5));
 json_serialize
────────────────
 {"a":

I think this function should throw an error if the destination type
doesn't have room for the output json.  Otherwise, what good is the
serialization function?


It’s not a self-evident bug given that this is exactly how casting data to varchar(n) behaves as directed by the SQL Standard.

I'd probably leave the internal consistency and take the opportunity to educate the reader that text is the preferred type in PostgreSQL and, especially here, there is little good reason to use anything else.

David J.

Re: pgsql: Add more SQL/JSON constructor functions

From
Amit Langote
Date:
Hi Alvaro,

On Mon, May 27, 2024 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2024-May-27, Alvaro Herrera wrote:
>
> > > JSON_SERIALIZE()
>
> I just noticed this behavior, which looks like a bug to me:
>
> select json_serialize('{"a":1, "a":2}' returning varchar(5));
>  json_serialize
> ────────────────
>  {"a":
>
> I think this function should throw an error if the destination type
> doesn't have room for the output json.  Otherwise, what good is the
> serialization function?

I remember using the reasoning mentioned by David G when testing
json_query() et al with varchar(n), so you get:

select json_query('{"a":1, "a":2}', '$' returning varchar(5));
 json_query
------------
 {"a":
(1 row)

which is the same as:

select '{"a":1, "a":2}'::varchar(5);
 varchar
---------
 {"a":
(1 row)

Also,

select json_value('{"a":"abcdef"}', '$.a' returning varchar(5) error on error);
 json_value
------------
 abcde
(1 row)

This behavior comes from using COERCE_EXPLICIT_CAST when creating the
coercion expression to convert json_*() functions' argument to the
RETURNING type.

--
Thanks, Amit Langote



Re: pgsql: Add more SQL/JSON constructor functions

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Mon, May 27, 2024 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>> On 2024-May-27, Alvaro Herrera wrote:
>> I just noticed this behavior, which looks like a bug to me:
>>
>> select json_serialize('{"a":1, "a":2}' returning varchar(5));
>> json_serialize
>> ────────────────
>> {"a":
>>
>> I think this function should throw an error if the destination type
>> doesn't have room for the output json.  Otherwise, what good is the
>> serialization function?

> This behavior comes from using COERCE_EXPLICIT_CAST when creating the
> coercion expression to convert json_*() functions' argument to the
> RETURNING type.

Yeah, I too think this is a cast, and truncation is the spec-defined
behavior for casting to varchar with a specific length limit.  I see
little reason that this should work differently from

select json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
 json_serialize
----------------
 {"a":
(1 row)


            regards, tom lane



Re: pgsql: Add more SQL/JSON constructor functions

From
Peter Eisentraut
Date:
On 29.05.24 18:44, Tom Lane wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
>> On Mon, May 27, 2024 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>>> On 2024-May-27, Alvaro Herrera wrote:
>>> I just noticed this behavior, which looks like a bug to me:
>>>
>>> select json_serialize('{"a":1, "a":2}' returning varchar(5));
>>> json_serialize
>>> ────────────────
>>> {"a":
>>>
>>> I think this function should throw an error if the destination type
>>> doesn't have room for the output json.  Otherwise, what good is the
>>> serialization function?
> 
>> This behavior comes from using COERCE_EXPLICIT_CAST when creating the
>> coercion expression to convert json_*() functions' argument to the
>> RETURNING type.
> 
> Yeah, I too think this is a cast, and truncation is the spec-defined
> behavior for casting to varchar with a specific length limit.  I see
> little reason that this should work differently from
> 
> select json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
>   json_serialize
> ----------------
>   {"a":
> (1 row)

The SQL standard says essentially that the output of json_serialize() is 
some string that when parsed back in gives you an equivalent JSON value 
as the input.  That doesn't seem compatible with truncating the output.

If you want output truncation, you can of course use an actual cast. 
But it makes sense that the RETURNING clause is separate from that.




Re: pgsql: Add more SQL/JSON constructor functions

From
Tom Lane
Date:
Peter Eisentraut <peter@eisentraut.org> writes:
> On 29.05.24 18:44, Tom Lane wrote:
>> Yeah, I too think this is a cast, and truncation is the spec-defined
>> behavior for casting to varchar with a specific length limit.

> The SQL standard says essentially that the output of json_serialize() is 
> some string that when parsed back in gives you an equivalent JSON value 
> as the input.  That doesn't seem compatible with truncating the output.

Maybe you should take this up with the SQL committee?  If you don't
like our current behavior, then either you have to say that RETURNING
with a length-limited target type is illegal (which is problematic
for the spec, since they have no such type) or that the cast behaves
like an implicit cast, with errors for overlength input (which I find
to be an unintuitive definition for a construct that names the target
type explicitly).

> If you want output truncation, you can of course use an actual cast. 
> But it makes sense that the RETURNING clause is separate from that.

Are you trying to say that the RETURNING clause's specified type
isn't the actual output type?  I can't buy that either.

Again, if you think our existing behavior isn't right, I think
it's a problem for the SQL committee not us.

            regards, tom lane



Re: pgsql: Add more SQL/JSON constructor functions

From
Peter Eisentraut
Date:
On 02.06.24 21:46, Tom Lane wrote:
> If you don't
> like our current behavior, then either you have to say that RETURNING
> with a length-limited target type is illegal (which is problematic
> for the spec, since they have no such type) or that the cast behaves
> like an implicit cast, with errors for overlength input (which I find
> to be an unintuitive definition for a construct that names the target
> type explicitly).

It asks for the latter behavior, essentially (but it's not defined in 
terms of casts).  It says:

"""
ii) Let JV be an implementation-dependent (UV097) value of type TT and 
encoding ENC such that these two conditions hold:

1) JV is a JSON text.

2) When the General Rules of Subclause 9.42, “Parsing JSON text”, are 
applied with JV as JSON TEXT, FO as FORMAT OPTION, and WITHOUT UNIQUE 
KEYS as UNIQUENESS CONSTRAINT; let CST be the STATUS and let CSJI be the 
SQL/JSON ITEM returned from the application of those General Rules, CST 
is successful completion (00000) and CSJI is an SQL/JSON item that is 
equivalent to SJI.

If there is no such JV, then let ST be the exception condition: data 
exception — invalid JSON text (22032).

iii) If JV is longer than the length or maximum length of TT, then an 
exception condition is raised: data exception — string data, right 
truncation (22001).
"""

Oracle also behaves accordingly:

SQL> select json_serialize('{"a":1, "a":2}' returning varchar2(20)) from 
dual;

JSON_SERIALIZE('{"A"
--------------------
{"a":1,"a":2}

SQL> select json_serialize('{"a":1, "a":2}' returning varchar2(5)) from 
dual;
select json_serialize('{"a":1, "a":2}' returning varchar2(5)) from dual
                                                                    *
ERROR at line 1:
ORA-40478: output value too large (maximum: 5)
JZN-00018: Input to serializer is too large
Help: https://docs.oracle.com/error-help/db/ora-40478/


As opposed to:

SQL> select cast(json_serialize('{"a":1, "a":2}') as varchar2(5)) from dual;

CAST(
-----
{"a":




Re: pgsql: Add more SQL/JSON constructor functions

From
Tom Lane
Date:
Peter Eisentraut <peter@eisentraut.org> writes:
> On 02.06.24 21:46, Tom Lane wrote:
>> If you don't
>> like our current behavior, then either you have to say that RETURNING
>> with a length-limited target type is illegal (which is problematic
>> for the spec, since they have no such type) or that the cast behaves
>> like an implicit cast, with errors for overlength input (which I find
>> to be an unintuitive definition for a construct that names the target
>> type explicitly).

> It asks for the latter behavior, essentially (but it's not defined in 
> terms of casts).  It says:

Meh.  Who needs consistency?  But I guess the answer is to do what was
suggested earlier and change the code to use COERCE_IMPLICIT_CAST.

            regards, tom lane



Re: pgsql: Add more SQL/JSON constructor functions

From
Amit Langote
Date:
On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
> > On 02.06.24 21:46, Tom Lane wrote:
> >> If you don't
> >> like our current behavior, then either you have to say that RETURNING
> >> with a length-limited target type is illegal (which is problematic
> >> for the spec, since they have no such type) or that the cast behaves
> >> like an implicit cast, with errors for overlength input (which I find
> >> to be an unintuitive definition for a construct that names the target
> >> type explicitly).
>
> > It asks for the latter behavior, essentially (but it's not defined in
> > terms of casts).  It says:
>
> Meh.  Who needs consistency?  But I guess the answer is to do what was
> suggested earlier and change the code to use COERCE_IMPLICIT_CAST.

OK, will post a patch to do so in a new thread on -hackers.

--
Thanks, Amit Langote



Re: pgsql: Add more SQL/JSON constructor functions

From
Amit Langote
Date:
On Tue, Jun 4, 2024 at 7:03 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Peter Eisentraut <peter@eisentraut.org> writes:
> > > On 02.06.24 21:46, Tom Lane wrote:
> > >> If you don't
> > >> like our current behavior, then either you have to say that RETURNING
> > >> with a length-limited target type is illegal (which is problematic
> > >> for the spec, since they have no such type) or that the cast behaves
> > >> like an implicit cast, with errors for overlength input (which I find
> > >> to be an unintuitive definition for a construct that names the target
> > >> type explicitly).
> >
> > > It asks for the latter behavior, essentially (but it's not defined in
> > > terms of casts).  It says:
> >
> > Meh.  Who needs consistency?  But I guess the answer is to do what was
> > suggested earlier and change the code to use COERCE_IMPLICIT_CAST.
>
> OK, will post a patch to do so in a new thread on -hackers.

Oops, didn't realize that this is already on -hackers.

Attached is a patch to use COERCE_IMPLICIT_CAST when the RETURNING
type specifies a length limit.

Given that this also affects JSON_OBJECT() et al that got added in
v16, maybe back-patching is in order but I'd like to hear opinions on
that.


--
Thanks, Amit Langote

Attachment

Re: pgsql: Add more SQL/JSON constructor functions

From
jian he
Date:
On Tue, Jun 18, 2024 at 5:02 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Tue, Jun 4, 2024 at 7:03 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > Peter Eisentraut <peter@eisentraut.org> writes:
> > > > On 02.06.24 21:46, Tom Lane wrote:
> > > >> If you don't
> > > >> like our current behavior, then either you have to say that RETURNING
> > > >> with a length-limited target type is illegal (which is problematic
> > > >> for the spec, since they have no such type) or that the cast behaves
> > > >> like an implicit cast, with errors for overlength input (which I find
> > > >> to be an unintuitive definition for a construct that names the target
> > > >> type explicitly).
> > >
> > > > It asks for the latter behavior, essentially (but it's not defined in
> > > > terms of casts).  It says:
> > >
> > > Meh.  Who needs consistency?  But I guess the answer is to do what was
> > > suggested earlier and change the code to use COERCE_IMPLICIT_CAST.
> >
> > OK, will post a patch to do so in a new thread on -hackers.
>
> Oops, didn't realize that this is already on -hackers.
>
> Attached is a patch to use COERCE_IMPLICIT_CAST when the RETURNING
> type specifies a length limit.
>

hi.
i am a little confused.

here[1] tom says:
> Yeah, I too think this is a cast, and truncation is the spec-defined
> behavior for casting to varchar with a specific length limit.  I see
> little reason that this should work differently from
>
> select json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
>  json_serialize
> ----------------
>  {"a":
> (1 row)

if i understand it correctly, and my english interpretation is fine.
i think tom means something like:

select json_serialize('{"a":1, "a":2}' returning text)::varchar(5) =
json_serialize('{"a":1, "a":2}' returning varchar(5));

should return true.
the master will return true, but apply your patch, the above query
will yield an error.


your patch will make domain and char(n) behavior inconsistent.
create domain char2 as char(2);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);


another example:
SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
default '"aaa"'::jsonb ON ERROR);
same value (jsonb "aaa") error on error will yield error,
but `default expression on error` can coerce the value to char(2),
which looks a little bit inconsistent, I think.


------------------------------------------
current in ExecInitJsonExpr we have

if (jsexpr->coercion_expr)
...
else if (jsexpr->use_json_coercion)
...
else if (jsexpr->use_io_coercion)
...

do you think it is necessary to add following asserts:
Assert (!(jsexpr->coercion_expr &&  jsexpr->use_json_coercion))
Assert (!(jsexpr->coercion_expr &&  jsexpr->use_io_coercion))



[1] https://www.postgresql.org/message-id/3189.1717001075%40sss.pgh.pa.us



Re: pgsql: Add more SQL/JSON constructor functions

From
Amit Langote
Date:
Hi,

Thanks for taking a look.

On Fri, Jun 21, 2024 at 4:05 PM jian he <jian.universality@gmail.com> wrote:
> On Tue, Jun 18, 2024 at 5:02 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > On Tue, Jun 4, 2024 at 7:03 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > > On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > Peter Eisentraut <peter@eisentraut.org> writes:
> > > > > On 02.06.24 21:46, Tom Lane wrote:
> > > > >> If you don't
> > > > >> like our current behavior, then either you have to say that RETURNING
> > > > >> with a length-limited target type is illegal (which is problematic
> > > > >> for the spec, since they have no such type) or that the cast behaves
> > > > >> like an implicit cast, with errors for overlength input (which I find
> > > > >> to be an unintuitive definition for a construct that names the target
> > > > >> type explicitly).
> > > >
> > > > > It asks for the latter behavior, essentially (but it's not defined in
> > > > > terms of casts).  It says:
> > > >
> > > > Meh.  Who needs consistency?  But I guess the answer is to do what was
> > > > suggested earlier and change the code to use COERCE_IMPLICIT_CAST.
> > >
> > > OK, will post a patch to do so in a new thread on -hackers.
> >
> > Oops, didn't realize that this is already on -hackers.
> >
> > Attached is a patch to use COERCE_IMPLICIT_CAST when the RETURNING
> > type specifies a length limit.
>
> hi.
> i am a little confused.
>
> here[1] tom says:
> > Yeah, I too think this is a cast, and truncation is the spec-defined
> > behavior for casting to varchar with a specific length limit.  I see
> > little reason that this should work differently from
> >
> > select json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
> >  json_serialize
> > ----------------
> >  {"a":
> > (1 row)
>
> if i understand it correctly, and my english interpretation is fine.
> i think tom means something like:
>
> select json_serialize('{"a":1, "a":2}' returning text)::varchar(5) =
> json_serialize('{"a":1, "a":2}' returning varchar(5));
>
> should return true.
> the master will return true, but apply your patch, the above query
> will yield an error.

The RETURNING variant giving an error is what the standard asks us to
do apparently.  I read Tom's last message on this thread as agreeing
to that, even though hesitantly.  He can correct me if I got that
wrong.

> your patch will make domain and char(n) behavior inconsistent.
> create domain char2 as char(2);
> SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
> SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);
>
>
> another example:
> SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
> default '"aaa"'::jsonb ON ERROR);
> same value (jsonb "aaa") error on error will yield error,
> but `default expression on error` can coerce the value to char(2),
> which looks a little bit inconsistent, I think.

Interesting examples, thanks for sharing.

Attached updated version should take into account that typmod may be
hiding under domains.  Please test.

> ------------------------------------------
> current in ExecInitJsonExpr we have
>
> if (jsexpr->coercion_expr)
> ...
> else if (jsexpr->use_json_coercion)
> ...
> else if (jsexpr->use_io_coercion)
> ...
>
> do you think it is necessary to add following asserts:
> Assert (!(jsexpr->coercion_expr &&  jsexpr->use_json_coercion))
> Assert (!(jsexpr->coercion_expr &&  jsexpr->use_io_coercion))

Yeah, perhaps, but let's consider this independently please.

--
Thanks, Amit Langote

Attachment

Re: pgsql: Add more SQL/JSON constructor functions

From
Amit Langote
Date:
On Fri, Jun 21, 2024 at 10:48 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Fri, Jun 21, 2024 at 4:05 PM jian he <jian.universality@gmail.com> wrote:
> > hi.
> > i am a little confused.
> >
> > here[1] tom says:
> > > Yeah, I too think this is a cast, and truncation is the spec-defined
> > > behavior for casting to varchar with a specific length limit.  I see
> > > little reason that this should work differently from
> > >
> > > select json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
> > >  json_serialize
> > > ----------------
> > >  {"a":
> > > (1 row)
> >
> > if i understand it correctly, and my english interpretation is fine.
> > i think tom means something like:
> >
> > select json_serialize('{"a":1, "a":2}' returning text)::varchar(5) =
> > json_serialize('{"a":1, "a":2}' returning varchar(5));
> >
> > should return true.
> > the master will return true, but apply your patch, the above query
> > will yield an error.
>
> The RETURNING variant giving an error is what the standard asks us to
> do apparently.  I read Tom's last message on this thread as agreeing
> to that, even though hesitantly.  He can correct me if I got that
> wrong.
>
> > your patch will make domain and char(n) behavior inconsistent.
> > create domain char2 as char(2);
> > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
> > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);
> >
> >
> > another example:
> > SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
> > default '"aaa"'::jsonb ON ERROR);
> > same value (jsonb "aaa") error on error will yield error,
> > but `default expression on error` can coerce the value to char(2),
> > which looks a little bit inconsistent, I think.
>
> Interesting examples, thanks for sharing.
>
> Attached updated version should take into account that typmod may be
> hiding under domains.  Please test.

I'd like to push this one tomorrow, barring objections.

I could use some advice on backpatching.  As I mentioned upthread,
this changes the behavior for JSON_OBJECT(), JSON_ARRAY(),
JSON_ARRAYAGG(), JSON_OBJECTAGG() too, which were added in v16.
Should this change be backpatched?  In general, what's our stance on
changes that cater to improving standard compliance, but are not
necessarily bugs.

--
Thanks, Amit Langote



Re: pgsql: Add more SQL/JSON constructor functions

From
jian he
Date:
On Wed, Jun 26, 2024 at 8:39 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> >
> > The RETURNING variant giving an error is what the standard asks us to
> > do apparently.  I read Tom's last message on this thread as agreeing
> > to that, even though hesitantly.  He can correct me if I got that
> > wrong.
> >
> > > your patch will make domain and char(n) behavior inconsistent.
> > > create domain char2 as char(2);
> > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
> > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);
> > >
> > >
> > > another example:
> > > SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
> > > default '"aaa"'::jsonb ON ERROR);
> > > same value (jsonb "aaa") error on error will yield error,
> > > but `default expression on error` can coerce the value to char(2),
> > > which looks a little bit inconsistent, I think.
> >
> > Interesting examples, thanks for sharing.
> >
> > Attached updated version should take into account that typmod may be
> > hiding under domains.  Please test.
>

SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
'13' on error);
return
ERROR:  value too long for type character(2)
should return 13

I found out the source of the problem is in coerceJsonExprOutput
/*
* Use cast expression for domain types; we need CoerceToDomain here.
*/
if (get_typtype(returning->typid) != TYPTYPE_DOMAIN)
{
jsexpr->use_io_coercion = true;
return;
}

>
> I'd like to push this one tomorrow, barring objections.
>

Currently the latest patch available cannot be `git apply` cleanly.

@@ -464,3 +466,9 @@ SELECT JSON_QUERY(jsonb 'null', '$xyz' PASSING 1 AS xyz);
 SELECT JSON_EXISTS(jsonb '1', '$' DEFAULT 1 ON ERROR);
 SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR);
 SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);
+
+-- Test implicit coercion domain over fixed-legth type specified in RETURNING
+CREATE DOMAIN queryfuncs_char2 AS char(2) CHECK (VALUE NOT IN ('12'));
+SELECT JSON_QUERY(jsonb '123', '$' RETURNING queryfuncs_char2 ERROR ON ERROR);
+SELECT JSON_VALUE(jsonb '123', '$' RETURNING queryfuncs_char2 ERROR ON ERROR);
+SELECT JSON_VALUE(jsonb '12', '$' RETURNING queryfuncs_char2 ERROR ON ERROR);

cannot found `SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);`  in
https://git.postgresql.org/cgit/postgresql.git/tree/src/test/regress/sql/sqljson_queryfuncs.sql



Re: pgsql: Add more SQL/JSON constructor functions

From
jian he
Date:
hi.
I have assembled a list of simple examples, some works (for comparison
sake), most not work
as intended.

CREATE DOMAIN queryfuncs_char2 AS char(2) CHECK (VALUE NOT IN ('12'));
CREATE DOMAIN queryfuncs_d_interval AS interval(2) CHECK (VALUE is not null);

SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
'12' on error);
SELECT JSON_VALUE(jsonb '12', '$' RETURNING queryfuncs_char2 default
'11' on error);
SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
'13' on error);
SELECT JSON_VALUE(jsonb '"111"', '$' RETURNING queryfuncs_char2
default '17' on error);
SELECT JSON_QUERY(jsonb '111', '$' RETURNING queryfuncs_char2 default
'14' on error);
SELECT JSON_QUERY(jsonb '111', '$' RETURNING queryfuncs_char2 omit
quotes default '15' on error);
SELECT JSON_QUERY(jsonb '111', '$' RETURNING queryfuncs_char2 keep
quotes default '16' on error);


SELECT JSON_VALUE(jsonb '"01:23:45.6789"', '$' RETURNING
queryfuncs_d_interval default '01:23:45.6789' on error);
SELECT JSON_VALUE(jsonb '"01:23:45.6789"', '$' RETURNING
queryfuncs_d_interval default '01:23:45.6789' on empty);
SELECT JSON_QUERY(jsonb '"01:23:45.6789"', '$' RETURNING
queryfuncs_d_interval default '01:23:45.6789' on error);
SELECT JSON_QUERY(jsonb '"01:23:45.6789"', '$' RETURNING
queryfuncs_d_interval default '01:23:45.6789' on empty);
above 4 queries fails, meaning  the changes you propose within
transformJsonBehavior is wrong?
i think it's because the COERCION_IMPLICIT cast from text to domain
queryfuncs_d_interval is not doable.


json_table seems also have problem with "exists" cast to other type, example:
SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a char(2) EXISTS
PATH '$.a'  ));
SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a queryfuncs_char2
EXISTS PATH '$.a'));
SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a queryfuncs_char2
EXISTS PATH '$.a' error on error));
SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a queryfuncs_char2
EXISTS PATH '$.a' error on empty));


----------------------------------------------------------------------------------------------------
SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
'13' on error);
for the above example:
coerceJsonExprOutput, coerceJsonFuncExpr set the result datum coercion
node to RelabelType:
RelabelType is not error safe. so the above query will fail converting
text 111 to queryfuncs_char2
which is not what we want.

I think making coerceJsonExprOutput the following way can solve this problem.
your patch cannot apply cleanly, I just posted the actual code snippet
of coerceJsonExprOutput, not a diff file.

static void
coerceJsonExprOutput(ParseState *pstate, JsonExpr *jsexpr)
{
JsonReturning *returning = jsexpr->returning;
Node    *context_item = jsexpr->formatted_expr;
int default_typmod;
Oid default_typid;
bool omit_quotes =
jsexpr->op == JSON_QUERY_OP && jsexpr->omit_quotes;
Node    *coercion_expr = NULL;
int32 baseTypmod = returning->typmod;

Assert(returning);

/*
* Check for cases where the coercion should be handled at runtime, that
* is, without using a cast expression.
*/
if (jsexpr->op == JSON_VALUE_OP)
{
/*
* Use cast expression for domain types; we need CoerceToDomain here.
*/
if (get_typtype(returning->typid) != TYPTYPE_DOMAIN)
{
jsexpr->use_io_coercion = true;
return;
}
else
{
/* domain type, typmod > 0 can only use use_io_coercion */
(void) getBaseTypeAndTypmod(returning->typid, &baseTypmod);
if (baseTypmod > 0)
{
jsexpr->use_io_coercion = true;
return;
}
}
}
else if (jsexpr->op == JSON_QUERY_OP)
{
/*
* Cast functions from jsonb to the following types (jsonb_bool() et
* al) don't handle errors softly, so coerce either by calling
* json_populate_type() or the type's input function so that any
* errors are handled appropriately. The latter only if OMIT QUOTES is
* true.
*/
switch (returning->typid)
{
case BOOLOID:
case NUMERICOID:
case INT2OID:
case INT4OID:
case INT8OID:
case FLOAT4OID:
case FLOAT8OID:
if (jsexpr->omit_quotes)
jsexpr->use_io_coercion = true;
else
jsexpr->use_json_coercion = true;
return;
default:
break;
}

/*
* for returning domain type, we cannot use coercion expression.
* it may not be able to catch the error, for example RelabelType
* for we either use_io_coercion or use_json_coercion.
*/
if (get_typtype(returning->typid) == TYPTYPE_DOMAIN)
(void) getBaseTypeAndTypmod(returning->typid, &baseTypmod);

/*
* coerceJsonFuncExpr() creates implicit casts for types with typmod,
* which (if present) don't handle errors softly, so use runtime
* coercion.
*/
if (baseTypmod > 0)
{
if (jsexpr->omit_quotes)
jsexpr->use_io_coercion = true;
else
jsexpr->use_json_coercion = true;
return;
}
}
...
-------------------------------



Re: pgsql: Add more SQL/JSON constructor functions

From
Amit Langote
Date:
Hi,

On Wed, Jun 26, 2024 at 11:46 PM jian he <jian.universality@gmail.com> wrote:
> On Wed, Jun 26, 2024 at 8:39 PM Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > >
> > > The RETURNING variant giving an error is what the standard asks us to
> > > do apparently.  I read Tom's last message on this thread as agreeing
> > > to that, even though hesitantly.  He can correct me if I got that
> > > wrong.
> > >
> > > > your patch will make domain and char(n) behavior inconsistent.
> > > > create domain char2 as char(2);
> > > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
> > > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);
> > > >
> > > >
> > > > another example:
> > > > SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
> > > > default '"aaa"'::jsonb ON ERROR);
> > > > same value (jsonb "aaa") error on error will yield error,
> > > > but `default expression on error` can coerce the value to char(2),
> > > > which looks a little bit inconsistent, I think.
> > >
> > > Interesting examples, thanks for sharing.
> > >
> > > Attached updated version should take into account that typmod may be
> > > hiding under domains.  Please test.
> >
>
> SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
> '13' on error);
> return
> ERROR:  value too long for type character(2)
> should return 13
>
> I found out the source of the problem is in coerceJsonExprOutput
> /*
> * Use cast expression for domain types; we need CoerceToDomain here.
> */
> if (get_typtype(returning->typid) != TYPTYPE_DOMAIN)
> {
> jsexpr->use_io_coercion = true;
> return;
> }

Thanks for this test case and the analysis.  Yes, using a cast
expression for coercion to the RETURNING type generally seems to be a
source of many problems that could've been solved by fixing things so
that only use_io_coercion and use_json_coercion are enough to handle
all the cases.

I've attempted that in the attached 0001, which removes
JsonExpr.coercion_expr and a bunch of code around it.

0002 is now the original patch minus the changes to make
JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like,
because the changes in 0001 covers them. The changes for JsonBehavior
expression coercion as they were in the last version of the patch are
still needed, but I decided to move those into 0001 so that the
changes for query functions are all in 0001 and those for constructors
in 0002.  It would be nice to get rid of that coerce_to_target_type()
call to coerce the "behavior expression" to RETURNING type, but I'm
leaving that as a task for another day.

--
Thanks, Amit Langote

Attachment

Re: pgsql: Add more SQL/JSON constructor functions

From
Amit Langote
Date:
On Thu, Jun 27, 2024 at 6:57 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Wed, Jun 26, 2024 at 11:46 PM jian he <jian.universality@gmail.com> wrote:
> > On Wed, Jun 26, 2024 at 8:39 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > >
> > > >
> > > > The RETURNING variant giving an error is what the standard asks us to
> > > > do apparently.  I read Tom's last message on this thread as agreeing
> > > > to that, even though hesitantly.  He can correct me if I got that
> > > > wrong.
> > > >
> > > > > your patch will make domain and char(n) behavior inconsistent.
> > > > > create domain char2 as char(2);
> > > > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
> > > > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);
> > > > >
> > > > >
> > > > > another example:
> > > > > SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
> > > > > default '"aaa"'::jsonb ON ERROR);
> > > > > same value (jsonb "aaa") error on error will yield error,
> > > > > but `default expression on error` can coerce the value to char(2),
> > > > > which looks a little bit inconsistent, I think.
> > > >
> > > > Interesting examples, thanks for sharing.
> > > >
> > > > Attached updated version should take into account that typmod may be
> > > > hiding under domains.  Please test.
> > >
> >
> > SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
> > '13' on error);
> > return
> > ERROR:  value too long for type character(2)
> > should return 13
> >
> > I found out the source of the problem is in coerceJsonExprOutput
> > /*
> > * Use cast expression for domain types; we need CoerceToDomain here.
> > */
> > if (get_typtype(returning->typid) != TYPTYPE_DOMAIN)
> > {
> > jsexpr->use_io_coercion = true;
> > return;
> > }
>
> Thanks for this test case and the analysis.  Yes, using a cast
> expression for coercion to the RETURNING type generally seems to be a
> source of many problems that could've been solved by fixing things so
> that only use_io_coercion and use_json_coercion are enough to handle
> all the cases.
>
> I've attempted that in the attached 0001, which removes
> JsonExpr.coercion_expr and a bunch of code around it.
>
> 0002 is now the original patch minus the changes to make
> JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like,
> because the changes in 0001 covers them. The changes for JsonBehavior
> expression coercion as they were in the last version of the patch are
> still needed, but I decided to move those into 0001 so that the
> changes for query functions are all in 0001 and those for constructors
> in 0002.  It would be nice to get rid of that coerce_to_target_type()
> call to coerce the "behavior expression" to RETURNING type, but I'm
> leaving that as a task for another day.

Updated 0001 to remove outdated references, remove some more unnecessary code.

--
Thanks, Amit Langote

Attachment

Re: pgsql: Add more SQL/JSON constructor functions

From
jian he
Date:
On Thu, Jun 27, 2024 at 7:48 PM Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > I've attempted that in the attached 0001, which removes
> > JsonExpr.coercion_expr and a bunch of code around it.
> >
> > 0002 is now the original patch minus the changes to make
> > JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like,
> > because the changes in 0001 covers them. The changes for JsonBehavior
> > expression coercion as they were in the last version of the patch are
> > still needed, but I decided to move those into 0001 so that the
> > changes for query functions are all in 0001 and those for constructors
> > in 0002.  It would be nice to get rid of that coerce_to_target_type()
> > call to coerce the "behavior expression" to RETURNING type, but I'm
> > leaving that as a task for another day.
>
> Updated 0001 to remove outdated references, remove some more unnecessary code.
>

i found some remaining references of "coercion_expr" should be removed.

src/include/nodes/primnodes.h
/* JsonExpr's collation, if coercion_expr is NULL. */


src/include/nodes/execnodes.h
/*
* Address of the step to coerce the result value of jsonpath evaluation
* to the RETURNING type using JsonExpr.coercion_expr.  -1 if no coercion
* is necessary or if either JsonExpr.use_io_coercion or
* JsonExpr.use_json_coercion is true.
*/
int jump_eval_coercion;

src/backend/jit/llvm/llvmjit_expr.c
/* coercion_expr code */
LLVMPositionBuilderAtEnd(b, b_coercion);
if (jsestate->jump_eval_coercion >= 0)
LLVMBuildBr(b, opblocks[jsestate->jump_eval_coercion]);
else
LLVMBuildUnreachable(b);


src/backend/executor/execExprInterp.c
/*
 * Checks if an error occurred either when evaluating JsonExpr.coercion_expr or
 * in ExecEvalJsonCoercion().  If so, this sets JsonExprState.error to trigger
 * the ON ERROR handling steps.
 */
void
ExecEvalJsonCoercionFinish(ExprState *state, ExprEvalStep *op)
{
}

if (jbv == NULL)
{
/* Will be coerced with coercion_expr, if any. */
*op->resvalue = (Datum) 0;
*op->resnull = true;
}


src/backend/executor/execExpr.c
/*
* Jump to coerce the NULL using coercion_expr if present.  Coercing NULL
* is only interesting when the RETURNING type is a domain whose
* constraints must be checked.  jsexpr->coercion_expr containing a
* CoerceToDomain node must have been set in that case.
*/

/*
* Jump to coerce the NULL using coercion_expr if present.  Coercing NULL
* is only interesting when the RETURNING type is a domain whose
* constraints must be checked.  jsexpr->coercion_expr containing a
* CoerceToDomain node must have been set in that case.
*/



Re: pgsql: Add more SQL/JSON constructor functions

From
Amit Langote
Date:
On Fri, Jun 28, 2024 at 3:14 PM jian he <jian.universality@gmail.com> wrote:
> On Thu, Jun 27, 2024 at 7:48 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > >
> > > I've attempted that in the attached 0001, which removes
> > > JsonExpr.coercion_expr and a bunch of code around it.
> > >
> > > 0002 is now the original patch minus the changes to make
> > > JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like,
> > > because the changes in 0001 covers them. The changes for JsonBehavior
> > > expression coercion as they were in the last version of the patch are
> > > still needed, but I decided to move those into 0001 so that the
> > > changes for query functions are all in 0001 and those for constructors
> > > in 0002.  It would be nice to get rid of that coerce_to_target_type()
> > > call to coerce the "behavior expression" to RETURNING type, but I'm
> > > leaving that as a task for another day.
> >
> > Updated 0001 to remove outdated references, remove some more unnecessary code.
> >
>
> i found some remaining references of "coercion_expr" should be removed.
>
> src/include/nodes/primnodes.h
> /* JsonExpr's collation, if coercion_expr is NULL. */
>
>
> src/include/nodes/execnodes.h
> /*
> * Address of the step to coerce the result value of jsonpath evaluation
> * to the RETURNING type using JsonExpr.coercion_expr.  -1 if no coercion
> * is necessary or if either JsonExpr.use_io_coercion or
> * JsonExpr.use_json_coercion is true.
> */
> int jump_eval_coercion;
>
> src/backend/jit/llvm/llvmjit_expr.c
> /* coercion_expr code */
> LLVMPositionBuilderAtEnd(b, b_coercion);
> if (jsestate->jump_eval_coercion >= 0)
> LLVMBuildBr(b, opblocks[jsestate->jump_eval_coercion]);
> else
> LLVMBuildUnreachable(b);
>
>
> src/backend/executor/execExprInterp.c
> /*
>  * Checks if an error occurred either when evaluating JsonExpr.coercion_expr or
>  * in ExecEvalJsonCoercion().  If so, this sets JsonExprState.error to trigger
>  * the ON ERROR handling steps.
>  */
> void
> ExecEvalJsonCoercionFinish(ExprState *state, ExprEvalStep *op)
> {
> }
>
> if (jbv == NULL)
> {
> /* Will be coerced with coercion_expr, if any. */
> *op->resvalue = (Datum) 0;
> *op->resnull = true;
> }
>
>
> src/backend/executor/execExpr.c
> /*
> * Jump to coerce the NULL using coercion_expr if present.  Coercing NULL
> * is only interesting when the RETURNING type is a domain whose
> * constraints must be checked.  jsexpr->coercion_expr containing a
> * CoerceToDomain node must have been set in that case.
> */
>
> /*
> * Jump to coerce the NULL using coercion_expr if present.  Coercing NULL
> * is only interesting when the RETURNING type is a domain whose
> * constraints must be checked.  jsexpr->coercion_expr containing a
> * CoerceToDomain node must have been set in that case.
> */

Thanks for checking.

Will push the attached 0001 and 0002 shortly.

--
Thanks, Amit Langote

Attachment

Re: pgsql: Add more SQL/JSON constructor functions

From
Alvaro Herrera
Date:
> diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
> index 233b7b1cc9..df766cdec1 100644
> --- a/src/backend/parser/parse_expr.c
> +++ b/src/backend/parser/parse_expr.c
> @@ -3583,6 +3583,7 @@ coerceJsonFuncExpr(ParseState *pstate, Node *expr,
>      Node       *res;
>      int            location;
>      Oid            exprtype = exprType(expr);
> +    int32        baseTypmod = returning->typmod;
>  
>      /* if output type is not specified or equals to function type, return */
>      if (!OidIsValid(returning->typid) || returning->typid == exprtype)
> @@ -3611,10 +3612,19 @@ coerceJsonFuncExpr(ParseState *pstate, Node *expr,
>          return (Node *) fexpr;
>      }
>  
> +    /*
> +     * For domains, consider the base type's typmod to decide whether to setup
> +     * an implicit or explicit cast.
> +     */
> +    if (get_typtype(returning->typid) == TYPTYPE_DOMAIN)
> +        (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod);

I didn't review this patch in detail, but I just noticed this tiny bit
and wanted to say that I don't like this coding style where you
initialize a variable to a certain value, and much later you override it
with a completely different value.  It seems much clearer to leave it
uninitialized at first, and have both places that determine the value
together,

    if (get_typtype(returning->typid) == TYPTYPE_DOMAIN)
        (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod);
    else
        baseTypmod = returning->typmod;

Not only because in the domain case the initializer value is a downright
lie, but also because of considerations such as if you later add code
that uses the variable in between those two places, you'd be introducing
a bug in the domain case because it hasn't been set.  With the coding I
propose, the compiler immediately tells you that the initialization is
missing.


TBH I'm not super clear on why we decide on explicit or implicit cast
based on presence of a typmod.  Why isn't it better to always use an
implicit one?

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Small aircraft do not crash frequently ... usually only once!"
                                  (ponder, http://thedailywtf.com/)



Re: pgsql: Add more SQL/JSON constructor functions

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> +    /*
>> +     * For domains, consider the base type's typmod to decide whether to setup
>> +     * an implicit or explicit cast.
>> +     */
>> +    if (get_typtype(returning->typid) == TYPTYPE_DOMAIN)
>> +        (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod);

> TBH I'm not super clear on why we decide on explicit or implicit cast
> based on presence of a typmod.  Why isn't it better to always use an
> implicit one?

Hmm ... there are a bunch of existing places that seem to have similar
logic, but they are all in new-ish SQL/JSON functionality, and I would
not be surprised if they are all wrong.  parse_coerce.c is quite
opinionated about what a domain's typtypmod means (see comments in
coerce_type() for instance); see also the logic in coerce_to_domain:

     * If the domain applies a typmod to its base type, build the appropriate
     * coercion step.  Mark it implicit for display purposes, because we don't
     * want it shown separately by ruleutils.c; but the isExplicit flag passed
     * to the conversion function depends on the manner in which the domain
     * coercion is invoked, so that the semantics of implicit and explicit
     * coercion differ.  (Is that really the behavior we want?)

I don't think that this SQL/JSON behavior quite matches that.

While I'm bitching ... this coding style is bogus anyway:

>> +    if (get_typtype(returning->typid) == TYPTYPE_DOMAIN)
>> +        (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod);

because it results in two syscache lookups not one.  You are supposed
to apply getBaseTypeAndTypmod unconditionally, as is done everywhere
except in the SQL/JSON logic.  I am also wondering how it can possibly
be sensible to throw away the function result of getBaseTypeAndTypmod
in this context.

            regards, tom lane