Thread: Re: pgsql: Add more SQL/JSON constructor functions
On 2024-May-27, Alvaro Herrera wrote: > > JSON_SERIALIZE() I just noticed this behavior, which looks like a bug to me: select json_serialize('{"a":1, "a":2}' returning varchar(5)); json_serialize ──────────────── {"a": I think this function should throw an error if the destination type doesn't have room for the output json. Otherwise, what good is the serialization function? -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On Monday, May 27, 2024, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2024-May-27, Alvaro Herrera wrote:
> > JSON_SERIALIZE()
I just noticed this behavior, which looks like a bug to me:
select json_serialize('{"a":1, "a":2}' returning varchar(5));
json_serialize
────────────────
{"a":
I think this function should throw an error if the destination type
doesn't have room for the output json. Otherwise, what good is the
serialization function?
It’s not a self-evident bug given that this is exactly how casting data to varchar(n) behaves as directed by the SQL Standard.
I'd probably leave the internal consistency and take the opportunity to educate the reader that text is the preferred type in PostgreSQL and, especially here, there is little good reason to use anything else.
David J.
Hi Alvaro, On Mon, May 27, 2024 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2024-May-27, Alvaro Herrera wrote: > > > > JSON_SERIALIZE() > > I just noticed this behavior, which looks like a bug to me: > > select json_serialize('{"a":1, "a":2}' returning varchar(5)); > json_serialize > ──────────────── > {"a": > > I think this function should throw an error if the destination type > doesn't have room for the output json. Otherwise, what good is the > serialization function? I remember using the reasoning mentioned by David G when testing json_query() et al with varchar(n), so you get: select json_query('{"a":1, "a":2}', '$' returning varchar(5)); json_query ------------ {"a": (1 row) which is the same as: select '{"a":1, "a":2}'::varchar(5); varchar --------- {"a": (1 row) Also, select json_value('{"a":"abcdef"}', '$.a' returning varchar(5) error on error); json_value ------------ abcde (1 row) This behavior comes from using COERCE_EXPLICIT_CAST when creating the coercion expression to convert json_*() functions' argument to the RETURNING type. -- Thanks, Amit Langote
Amit Langote <amitlangote09@gmail.com> writes: > On Mon, May 27, 2024 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> On 2024-May-27, Alvaro Herrera wrote: >> I just noticed this behavior, which looks like a bug to me: >> >> select json_serialize('{"a":1, "a":2}' returning varchar(5)); >> json_serialize >> ──────────────── >> {"a": >> >> I think this function should throw an error if the destination type >> doesn't have room for the output json. Otherwise, what good is the >> serialization function? > This behavior comes from using COERCE_EXPLICIT_CAST when creating the > coercion expression to convert json_*() functions' argument to the > RETURNING type. Yeah, I too think this is a cast, and truncation is the spec-defined behavior for casting to varchar with a specific length limit. I see little reason that this should work differently from select json_serialize('{"a":1, "a":2}' returning text)::varchar(5); json_serialize ---------------- {"a": (1 row) regards, tom lane
On 29.05.24 18:44, Tom Lane wrote: > Amit Langote <amitlangote09@gmail.com> writes: >> On Mon, May 27, 2024 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >>> On 2024-May-27, Alvaro Herrera wrote: >>> I just noticed this behavior, which looks like a bug to me: >>> >>> select json_serialize('{"a":1, "a":2}' returning varchar(5)); >>> json_serialize >>> ──────────────── >>> {"a": >>> >>> I think this function should throw an error if the destination type >>> doesn't have room for the output json. Otherwise, what good is the >>> serialization function? > >> This behavior comes from using COERCE_EXPLICIT_CAST when creating the >> coercion expression to convert json_*() functions' argument to the >> RETURNING type. > > Yeah, I too think this is a cast, and truncation is the spec-defined > behavior for casting to varchar with a specific length limit. I see > little reason that this should work differently from > > select json_serialize('{"a":1, "a":2}' returning text)::varchar(5); > json_serialize > ---------------- > {"a": > (1 row) The SQL standard says essentially that the output of json_serialize() is some string that when parsed back in gives you an equivalent JSON value as the input. That doesn't seem compatible with truncating the output. If you want output truncation, you can of course use an actual cast. But it makes sense that the RETURNING clause is separate from that.
Peter Eisentraut <peter@eisentraut.org> writes: > On 29.05.24 18:44, Tom Lane wrote: >> Yeah, I too think this is a cast, and truncation is the spec-defined >> behavior for casting to varchar with a specific length limit. > The SQL standard says essentially that the output of json_serialize() is > some string that when parsed back in gives you an equivalent JSON value > as the input. That doesn't seem compatible with truncating the output. Maybe you should take this up with the SQL committee? If you don't like our current behavior, then either you have to say that RETURNING with a length-limited target type is illegal (which is problematic for the spec, since they have no such type) or that the cast behaves like an implicit cast, with errors for overlength input (which I find to be an unintuitive definition for a construct that names the target type explicitly). > If you want output truncation, you can of course use an actual cast. > But it makes sense that the RETURNING clause is separate from that. Are you trying to say that the RETURNING clause's specified type isn't the actual output type? I can't buy that either. Again, if you think our existing behavior isn't right, I think it's a problem for the SQL committee not us. regards, tom lane
On 02.06.24 21:46, Tom Lane wrote: > If you don't > like our current behavior, then either you have to say that RETURNING > with a length-limited target type is illegal (which is problematic > for the spec, since they have no such type) or that the cast behaves > like an implicit cast, with errors for overlength input (which I find > to be an unintuitive definition for a construct that names the target > type explicitly). It asks for the latter behavior, essentially (but it's not defined in terms of casts). It says: """ ii) Let JV be an implementation-dependent (UV097) value of type TT and encoding ENC such that these two conditions hold: 1) JV is a JSON text. 2) When the General Rules of Subclause 9.42, “Parsing JSON text”, are applied with JV as JSON TEXT, FO as FORMAT OPTION, and WITHOUT UNIQUE KEYS as UNIQUENESS CONSTRAINT; let CST be the STATUS and let CSJI be the SQL/JSON ITEM returned from the application of those General Rules, CST is successful completion (00000) and CSJI is an SQL/JSON item that is equivalent to SJI. If there is no such JV, then let ST be the exception condition: data exception — invalid JSON text (22032). iii) If JV is longer than the length or maximum length of TT, then an exception condition is raised: data exception — string data, right truncation (22001). """ Oracle also behaves accordingly: SQL> select json_serialize('{"a":1, "a":2}' returning varchar2(20)) from dual; JSON_SERIALIZE('{"A" -------------------- {"a":1,"a":2} SQL> select json_serialize('{"a":1, "a":2}' returning varchar2(5)) from dual; select json_serialize('{"a":1, "a":2}' returning varchar2(5)) from dual * ERROR at line 1: ORA-40478: output value too large (maximum: 5) JZN-00018: Input to serializer is too large Help: https://docs.oracle.com/error-help/db/ora-40478/ As opposed to: SQL> select cast(json_serialize('{"a":1, "a":2}') as varchar2(5)) from dual; CAST( ----- {"a":
Peter Eisentraut <peter@eisentraut.org> writes: > On 02.06.24 21:46, Tom Lane wrote: >> If you don't >> like our current behavior, then either you have to say that RETURNING >> with a length-limited target type is illegal (which is problematic >> for the spec, since they have no such type) or that the cast behaves >> like an implicit cast, with errors for overlength input (which I find >> to be an unintuitive definition for a construct that names the target >> type explicitly). > It asks for the latter behavior, essentially (but it's not defined in > terms of casts). It says: Meh. Who needs consistency? But I guess the answer is to do what was suggested earlier and change the code to use COERCE_IMPLICIT_CAST. regards, tom lane
On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Eisentraut <peter@eisentraut.org> writes: > > On 02.06.24 21:46, Tom Lane wrote: > >> If you don't > >> like our current behavior, then either you have to say that RETURNING > >> with a length-limited target type is illegal (which is problematic > >> for the spec, since they have no such type) or that the cast behaves > >> like an implicit cast, with errors for overlength input (which I find > >> to be an unintuitive definition for a construct that names the target > >> type explicitly). > > > It asks for the latter behavior, essentially (but it's not defined in > > terms of casts). It says: > > Meh. Who needs consistency? But I guess the answer is to do what was > suggested earlier and change the code to use COERCE_IMPLICIT_CAST. OK, will post a patch to do so in a new thread on -hackers. -- Thanks, Amit Langote
On Tue, Jun 4, 2024 at 7:03 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Peter Eisentraut <peter@eisentraut.org> writes: > > > On 02.06.24 21:46, Tom Lane wrote: > > >> If you don't > > >> like our current behavior, then either you have to say that RETURNING > > >> with a length-limited target type is illegal (which is problematic > > >> for the spec, since they have no such type) or that the cast behaves > > >> like an implicit cast, with errors for overlength input (which I find > > >> to be an unintuitive definition for a construct that names the target > > >> type explicitly). > > > > > It asks for the latter behavior, essentially (but it's not defined in > > > terms of casts). It says: > > > > Meh. Who needs consistency? But I guess the answer is to do what was > > suggested earlier and change the code to use COERCE_IMPLICIT_CAST. > > OK, will post a patch to do so in a new thread on -hackers. Oops, didn't realize that this is already on -hackers. Attached is a patch to use COERCE_IMPLICIT_CAST when the RETURNING type specifies a length limit. Given that this also affects JSON_OBJECT() et al that got added in v16, maybe back-patching is in order but I'd like to hear opinions on that. -- Thanks, Amit Langote
Attachment
On Tue, Jun 18, 2024 at 5:02 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Tue, Jun 4, 2024 at 7:03 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > Peter Eisentraut <peter@eisentraut.org> writes: > > > > On 02.06.24 21:46, Tom Lane wrote: > > > >> If you don't > > > >> like our current behavior, then either you have to say that RETURNING > > > >> with a length-limited target type is illegal (which is problematic > > > >> for the spec, since they have no such type) or that the cast behaves > > > >> like an implicit cast, with errors for overlength input (which I find > > > >> to be an unintuitive definition for a construct that names the target > > > >> type explicitly). > > > > > > > It asks for the latter behavior, essentially (but it's not defined in > > > > terms of casts). It says: > > > > > > Meh. Who needs consistency? But I guess the answer is to do what was > > > suggested earlier and change the code to use COERCE_IMPLICIT_CAST. > > > > OK, will post a patch to do so in a new thread on -hackers. > > Oops, didn't realize that this is already on -hackers. > > Attached is a patch to use COERCE_IMPLICIT_CAST when the RETURNING > type specifies a length limit. > hi. i am a little confused. here[1] tom says: > Yeah, I too think this is a cast, and truncation is the spec-defined > behavior for casting to varchar with a specific length limit. I see > little reason that this should work differently from > > select json_serialize('{"a":1, "a":2}' returning text)::varchar(5); > json_serialize > ---------------- > {"a": > (1 row) if i understand it correctly, and my english interpretation is fine. i think tom means something like: select json_serialize('{"a":1, "a":2}' returning text)::varchar(5) = json_serialize('{"a":1, "a":2}' returning varchar(5)); should return true. the master will return true, but apply your patch, the above query will yield an error. your patch will make domain and char(n) behavior inconsistent. create domain char2 as char(2); SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR); SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR); another example: SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes default '"aaa"'::jsonb ON ERROR); same value (jsonb "aaa") error on error will yield error, but `default expression on error` can coerce the value to char(2), which looks a little bit inconsistent, I think. ------------------------------------------ current in ExecInitJsonExpr we have if (jsexpr->coercion_expr) ... else if (jsexpr->use_json_coercion) ... else if (jsexpr->use_io_coercion) ... do you think it is necessary to add following asserts: Assert (!(jsexpr->coercion_expr && jsexpr->use_json_coercion)) Assert (!(jsexpr->coercion_expr && jsexpr->use_io_coercion)) [1] https://www.postgresql.org/message-id/3189.1717001075%40sss.pgh.pa.us
Hi, Thanks for taking a look. On Fri, Jun 21, 2024 at 4:05 PM jian he <jian.universality@gmail.com> wrote: > On Tue, Jun 18, 2024 at 5:02 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Tue, Jun 4, 2024 at 7:03 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Peter Eisentraut <peter@eisentraut.org> writes: > > > > > On 02.06.24 21:46, Tom Lane wrote: > > > > >> If you don't > > > > >> like our current behavior, then either you have to say that RETURNING > > > > >> with a length-limited target type is illegal (which is problematic > > > > >> for the spec, since they have no such type) or that the cast behaves > > > > >> like an implicit cast, with errors for overlength input (which I find > > > > >> to be an unintuitive definition for a construct that names the target > > > > >> type explicitly). > > > > > > > > > It asks for the latter behavior, essentially (but it's not defined in > > > > > terms of casts). It says: > > > > > > > > Meh. Who needs consistency? But I guess the answer is to do what was > > > > suggested earlier and change the code to use COERCE_IMPLICIT_CAST. > > > > > > OK, will post a patch to do so in a new thread on -hackers. > > > > Oops, didn't realize that this is already on -hackers. > > > > Attached is a patch to use COERCE_IMPLICIT_CAST when the RETURNING > > type specifies a length limit. > > hi. > i am a little confused. > > here[1] tom says: > > Yeah, I too think this is a cast, and truncation is the spec-defined > > behavior for casting to varchar with a specific length limit. I see > > little reason that this should work differently from > > > > select json_serialize('{"a":1, "a":2}' returning text)::varchar(5); > > json_serialize > > ---------------- > > {"a": > > (1 row) > > if i understand it correctly, and my english interpretation is fine. > i think tom means something like: > > select json_serialize('{"a":1, "a":2}' returning text)::varchar(5) = > json_serialize('{"a":1, "a":2}' returning varchar(5)); > > should return true. > the master will return true, but apply your patch, the above query > will yield an error. The RETURNING variant giving an error is what the standard asks us to do apparently. I read Tom's last message on this thread as agreeing to that, even though hesitantly. He can correct me if I got that wrong. > your patch will make domain and char(n) behavior inconsistent. > create domain char2 as char(2); > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR); > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR); > > > another example: > SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes > default '"aaa"'::jsonb ON ERROR); > same value (jsonb "aaa") error on error will yield error, > but `default expression on error` can coerce the value to char(2), > which looks a little bit inconsistent, I think. Interesting examples, thanks for sharing. Attached updated version should take into account that typmod may be hiding under domains. Please test. > ------------------------------------------ > current in ExecInitJsonExpr we have > > if (jsexpr->coercion_expr) > ... > else if (jsexpr->use_json_coercion) > ... > else if (jsexpr->use_io_coercion) > ... > > do you think it is necessary to add following asserts: > Assert (!(jsexpr->coercion_expr && jsexpr->use_json_coercion)) > Assert (!(jsexpr->coercion_expr && jsexpr->use_io_coercion)) Yeah, perhaps, but let's consider this independently please. -- Thanks, Amit Langote
Attachment
On Fri, Jun 21, 2024 at 10:48 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Fri, Jun 21, 2024 at 4:05 PM jian he <jian.universality@gmail.com> wrote: > > hi. > > i am a little confused. > > > > here[1] tom says: > > > Yeah, I too think this is a cast, and truncation is the spec-defined > > > behavior for casting to varchar with a specific length limit. I see > > > little reason that this should work differently from > > > > > > select json_serialize('{"a":1, "a":2}' returning text)::varchar(5); > > > json_serialize > > > ---------------- > > > {"a": > > > (1 row) > > > > if i understand it correctly, and my english interpretation is fine. > > i think tom means something like: > > > > select json_serialize('{"a":1, "a":2}' returning text)::varchar(5) = > > json_serialize('{"a":1, "a":2}' returning varchar(5)); > > > > should return true. > > the master will return true, but apply your patch, the above query > > will yield an error. > > The RETURNING variant giving an error is what the standard asks us to > do apparently. I read Tom's last message on this thread as agreeing > to that, even though hesitantly. He can correct me if I got that > wrong. > > > your patch will make domain and char(n) behavior inconsistent. > > create domain char2 as char(2); > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR); > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR); > > > > > > another example: > > SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes > > default '"aaa"'::jsonb ON ERROR); > > same value (jsonb "aaa") error on error will yield error, > > but `default expression on error` can coerce the value to char(2), > > which looks a little bit inconsistent, I think. > > Interesting examples, thanks for sharing. > > Attached updated version should take into account that typmod may be > hiding under domains. Please test. I'd like to push this one tomorrow, barring objections. I could use some advice on backpatching. As I mentioned upthread, this changes the behavior for JSON_OBJECT(), JSON_ARRAY(), JSON_ARRAYAGG(), JSON_OBJECTAGG() too, which were added in v16. Should this change be backpatched? In general, what's our stance on changes that cater to improving standard compliance, but are not necessarily bugs. -- Thanks, Amit Langote
On Wed, Jun 26, 2024 at 8:39 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > > The RETURNING variant giving an error is what the standard asks us to > > do apparently. I read Tom's last message on this thread as agreeing > > to that, even though hesitantly. He can correct me if I got that > > wrong. > > > > > your patch will make domain and char(n) behavior inconsistent. > > > create domain char2 as char(2); > > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR); > > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR); > > > > > > > > > another example: > > > SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes > > > default '"aaa"'::jsonb ON ERROR); > > > same value (jsonb "aaa") error on error will yield error, > > > but `default expression on error` can coerce the value to char(2), > > > which looks a little bit inconsistent, I think. > > > > Interesting examples, thanks for sharing. > > > > Attached updated version should take into account that typmod may be > > hiding under domains. Please test. > SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default '13' on error); return ERROR: value too long for type character(2) should return 13 I found out the source of the problem is in coerceJsonExprOutput /* * Use cast expression for domain types; we need CoerceToDomain here. */ if (get_typtype(returning->typid) != TYPTYPE_DOMAIN) { jsexpr->use_io_coercion = true; return; } > > I'd like to push this one tomorrow, barring objections. > Currently the latest patch available cannot be `git apply` cleanly. @@ -464,3 +466,9 @@ SELECT JSON_QUERY(jsonb 'null', '$xyz' PASSING 1 AS xyz); SELECT JSON_EXISTS(jsonb '1', '$' DEFAULT 1 ON ERROR); SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR); SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR); + +-- Test implicit coercion domain over fixed-legth type specified in RETURNING +CREATE DOMAIN queryfuncs_char2 AS char(2) CHECK (VALUE NOT IN ('12')); +SELECT JSON_QUERY(jsonb '123', '$' RETURNING queryfuncs_char2 ERROR ON ERROR); +SELECT JSON_VALUE(jsonb '123', '$' RETURNING queryfuncs_char2 ERROR ON ERROR); +SELECT JSON_VALUE(jsonb '12', '$' RETURNING queryfuncs_char2 ERROR ON ERROR); cannot found `SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);` in https://git.postgresql.org/cgit/postgresql.git/tree/src/test/regress/sql/sqljson_queryfuncs.sql
hi. I have assembled a list of simple examples, some works (for comparison sake), most not work as intended. CREATE DOMAIN queryfuncs_char2 AS char(2) CHECK (VALUE NOT IN ('12')); CREATE DOMAIN queryfuncs_d_interval AS interval(2) CHECK (VALUE is not null); SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default '12' on error); SELECT JSON_VALUE(jsonb '12', '$' RETURNING queryfuncs_char2 default '11' on error); SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default '13' on error); SELECT JSON_VALUE(jsonb '"111"', '$' RETURNING queryfuncs_char2 default '17' on error); SELECT JSON_QUERY(jsonb '111', '$' RETURNING queryfuncs_char2 default '14' on error); SELECT JSON_QUERY(jsonb '111', '$' RETURNING queryfuncs_char2 omit quotes default '15' on error); SELECT JSON_QUERY(jsonb '111', '$' RETURNING queryfuncs_char2 keep quotes default '16' on error); SELECT JSON_VALUE(jsonb '"01:23:45.6789"', '$' RETURNING queryfuncs_d_interval default '01:23:45.6789' on error); SELECT JSON_VALUE(jsonb '"01:23:45.6789"', '$' RETURNING queryfuncs_d_interval default '01:23:45.6789' on empty); SELECT JSON_QUERY(jsonb '"01:23:45.6789"', '$' RETURNING queryfuncs_d_interval default '01:23:45.6789' on error); SELECT JSON_QUERY(jsonb '"01:23:45.6789"', '$' RETURNING queryfuncs_d_interval default '01:23:45.6789' on empty); above 4 queries fails, meaning the changes you propose within transformJsonBehavior is wrong? i think it's because the COERCION_IMPLICIT cast from text to domain queryfuncs_d_interval is not doable. json_table seems also have problem with "exists" cast to other type, example: SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a char(2) EXISTS PATH '$.a' )); SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a queryfuncs_char2 EXISTS PATH '$.a')); SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a queryfuncs_char2 EXISTS PATH '$.a' error on error)); SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a queryfuncs_char2 EXISTS PATH '$.a' error on empty)); ---------------------------------------------------------------------------------------------------- SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default '13' on error); for the above example: coerceJsonExprOutput, coerceJsonFuncExpr set the result datum coercion node to RelabelType: RelabelType is not error safe. so the above query will fail converting text 111 to queryfuncs_char2 which is not what we want. I think making coerceJsonExprOutput the following way can solve this problem. your patch cannot apply cleanly, I just posted the actual code snippet of coerceJsonExprOutput, not a diff file. static void coerceJsonExprOutput(ParseState *pstate, JsonExpr *jsexpr) { JsonReturning *returning = jsexpr->returning; Node *context_item = jsexpr->formatted_expr; int default_typmod; Oid default_typid; bool omit_quotes = jsexpr->op == JSON_QUERY_OP && jsexpr->omit_quotes; Node *coercion_expr = NULL; int32 baseTypmod = returning->typmod; Assert(returning); /* * Check for cases where the coercion should be handled at runtime, that * is, without using a cast expression. */ if (jsexpr->op == JSON_VALUE_OP) { /* * Use cast expression for domain types; we need CoerceToDomain here. */ if (get_typtype(returning->typid) != TYPTYPE_DOMAIN) { jsexpr->use_io_coercion = true; return; } else { /* domain type, typmod > 0 can only use use_io_coercion */ (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); if (baseTypmod > 0) { jsexpr->use_io_coercion = true; return; } } } else if (jsexpr->op == JSON_QUERY_OP) { /* * Cast functions from jsonb to the following types (jsonb_bool() et * al) don't handle errors softly, so coerce either by calling * json_populate_type() or the type's input function so that any * errors are handled appropriately. The latter only if OMIT QUOTES is * true. */ switch (returning->typid) { case BOOLOID: case NUMERICOID: case INT2OID: case INT4OID: case INT8OID: case FLOAT4OID: case FLOAT8OID: if (jsexpr->omit_quotes) jsexpr->use_io_coercion = true; else jsexpr->use_json_coercion = true; return; default: break; } /* * for returning domain type, we cannot use coercion expression. * it may not be able to catch the error, for example RelabelType * for we either use_io_coercion or use_json_coercion. */ if (get_typtype(returning->typid) == TYPTYPE_DOMAIN) (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); /* * coerceJsonFuncExpr() creates implicit casts for types with typmod, * which (if present) don't handle errors softly, so use runtime * coercion. */ if (baseTypmod > 0) { if (jsexpr->omit_quotes) jsexpr->use_io_coercion = true; else jsexpr->use_json_coercion = true; return; } } ... -------------------------------
Hi, On Wed, Jun 26, 2024 at 11:46 PM jian he <jian.universality@gmail.com> wrote: > On Wed, Jun 26, 2024 at 8:39 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > > > > > The RETURNING variant giving an error is what the standard asks us to > > > do apparently. I read Tom's last message on this thread as agreeing > > > to that, even though hesitantly. He can correct me if I got that > > > wrong. > > > > > > > your patch will make domain and char(n) behavior inconsistent. > > > > create domain char2 as char(2); > > > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR); > > > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR); > > > > > > > > > > > > another example: > > > > SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes > > > > default '"aaa"'::jsonb ON ERROR); > > > > same value (jsonb "aaa") error on error will yield error, > > > > but `default expression on error` can coerce the value to char(2), > > > > which looks a little bit inconsistent, I think. > > > > > > Interesting examples, thanks for sharing. > > > > > > Attached updated version should take into account that typmod may be > > > hiding under domains. Please test. > > > > SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default > '13' on error); > return > ERROR: value too long for type character(2) > should return 13 > > I found out the source of the problem is in coerceJsonExprOutput > /* > * Use cast expression for domain types; we need CoerceToDomain here. > */ > if (get_typtype(returning->typid) != TYPTYPE_DOMAIN) > { > jsexpr->use_io_coercion = true; > return; > } Thanks for this test case and the analysis. Yes, using a cast expression for coercion to the RETURNING type generally seems to be a source of many problems that could've been solved by fixing things so that only use_io_coercion and use_json_coercion are enough to handle all the cases. I've attempted that in the attached 0001, which removes JsonExpr.coercion_expr and a bunch of code around it. 0002 is now the original patch minus the changes to make JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like, because the changes in 0001 covers them. The changes for JsonBehavior expression coercion as they were in the last version of the patch are still needed, but I decided to move those into 0001 so that the changes for query functions are all in 0001 and those for constructors in 0002. It would be nice to get rid of that coerce_to_target_type() call to coerce the "behavior expression" to RETURNING type, but I'm leaving that as a task for another day. -- Thanks, Amit Langote
Attachment
On Thu, Jun 27, 2024 at 6:57 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Wed, Jun 26, 2024 at 11:46 PM jian he <jian.universality@gmail.com> wrote: > > On Wed, Jun 26, 2024 at 8:39 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > > > > > > > > The RETURNING variant giving an error is what the standard asks us to > > > > do apparently. I read Tom's last message on this thread as agreeing > > > > to that, even though hesitantly. He can correct me if I got that > > > > wrong. > > > > > > > > > your patch will make domain and char(n) behavior inconsistent. > > > > > create domain char2 as char(2); > > > > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR); > > > > > SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR); > > > > > > > > > > > > > > > another example: > > > > > SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes > > > > > default '"aaa"'::jsonb ON ERROR); > > > > > same value (jsonb "aaa") error on error will yield error, > > > > > but `default expression on error` can coerce the value to char(2), > > > > > which looks a little bit inconsistent, I think. > > > > > > > > Interesting examples, thanks for sharing. > > > > > > > > Attached updated version should take into account that typmod may be > > > > hiding under domains. Please test. > > > > > > > SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default > > '13' on error); > > return > > ERROR: value too long for type character(2) > > should return 13 > > > > I found out the source of the problem is in coerceJsonExprOutput > > /* > > * Use cast expression for domain types; we need CoerceToDomain here. > > */ > > if (get_typtype(returning->typid) != TYPTYPE_DOMAIN) > > { > > jsexpr->use_io_coercion = true; > > return; > > } > > Thanks for this test case and the analysis. Yes, using a cast > expression for coercion to the RETURNING type generally seems to be a > source of many problems that could've been solved by fixing things so > that only use_io_coercion and use_json_coercion are enough to handle > all the cases. > > I've attempted that in the attached 0001, which removes > JsonExpr.coercion_expr and a bunch of code around it. > > 0002 is now the original patch minus the changes to make > JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like, > because the changes in 0001 covers them. The changes for JsonBehavior > expression coercion as they were in the last version of the patch are > still needed, but I decided to move those into 0001 so that the > changes for query functions are all in 0001 and those for constructors > in 0002. It would be nice to get rid of that coerce_to_target_type() > call to coerce the "behavior expression" to RETURNING type, but I'm > leaving that as a task for another day. Updated 0001 to remove outdated references, remove some more unnecessary code. -- Thanks, Amit Langote
Attachment
On Thu, Jun 27, 2024 at 7:48 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > I've attempted that in the attached 0001, which removes > > JsonExpr.coercion_expr and a bunch of code around it. > > > > 0002 is now the original patch minus the changes to make > > JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like, > > because the changes in 0001 covers them. The changes for JsonBehavior > > expression coercion as they were in the last version of the patch are > > still needed, but I decided to move those into 0001 so that the > > changes for query functions are all in 0001 and those for constructors > > in 0002. It would be nice to get rid of that coerce_to_target_type() > > call to coerce the "behavior expression" to RETURNING type, but I'm > > leaving that as a task for another day. > > Updated 0001 to remove outdated references, remove some more unnecessary code. > i found some remaining references of "coercion_expr" should be removed. src/include/nodes/primnodes.h /* JsonExpr's collation, if coercion_expr is NULL. */ src/include/nodes/execnodes.h /* * Address of the step to coerce the result value of jsonpath evaluation * to the RETURNING type using JsonExpr.coercion_expr. -1 if no coercion * is necessary or if either JsonExpr.use_io_coercion or * JsonExpr.use_json_coercion is true. */ int jump_eval_coercion; src/backend/jit/llvm/llvmjit_expr.c /* coercion_expr code */ LLVMPositionBuilderAtEnd(b, b_coercion); if (jsestate->jump_eval_coercion >= 0) LLVMBuildBr(b, opblocks[jsestate->jump_eval_coercion]); else LLVMBuildUnreachable(b); src/backend/executor/execExprInterp.c /* * Checks if an error occurred either when evaluating JsonExpr.coercion_expr or * in ExecEvalJsonCoercion(). If so, this sets JsonExprState.error to trigger * the ON ERROR handling steps. */ void ExecEvalJsonCoercionFinish(ExprState *state, ExprEvalStep *op) { } if (jbv == NULL) { /* Will be coerced with coercion_expr, if any. */ *op->resvalue = (Datum) 0; *op->resnull = true; } src/backend/executor/execExpr.c /* * Jump to coerce the NULL using coercion_expr if present. Coercing NULL * is only interesting when the RETURNING type is a domain whose * constraints must be checked. jsexpr->coercion_expr containing a * CoerceToDomain node must have been set in that case. */ /* * Jump to coerce the NULL using coercion_expr if present. Coercing NULL * is only interesting when the RETURNING type is a domain whose * constraints must be checked. jsexpr->coercion_expr containing a * CoerceToDomain node must have been set in that case. */
On Fri, Jun 28, 2024 at 3:14 PM jian he <jian.universality@gmail.com> wrote: > On Thu, Jun 27, 2024 at 7:48 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > > > I've attempted that in the attached 0001, which removes > > > JsonExpr.coercion_expr and a bunch of code around it. > > > > > > 0002 is now the original patch minus the changes to make > > > JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like, > > > because the changes in 0001 covers them. The changes for JsonBehavior > > > expression coercion as they were in the last version of the patch are > > > still needed, but I decided to move those into 0001 so that the > > > changes for query functions are all in 0001 and those for constructors > > > in 0002. It would be nice to get rid of that coerce_to_target_type() > > > call to coerce the "behavior expression" to RETURNING type, but I'm > > > leaving that as a task for another day. > > > > Updated 0001 to remove outdated references, remove some more unnecessary code. > > > > i found some remaining references of "coercion_expr" should be removed. > > src/include/nodes/primnodes.h > /* JsonExpr's collation, if coercion_expr is NULL. */ > > > src/include/nodes/execnodes.h > /* > * Address of the step to coerce the result value of jsonpath evaluation > * to the RETURNING type using JsonExpr.coercion_expr. -1 if no coercion > * is necessary or if either JsonExpr.use_io_coercion or > * JsonExpr.use_json_coercion is true. > */ > int jump_eval_coercion; > > src/backend/jit/llvm/llvmjit_expr.c > /* coercion_expr code */ > LLVMPositionBuilderAtEnd(b, b_coercion); > if (jsestate->jump_eval_coercion >= 0) > LLVMBuildBr(b, opblocks[jsestate->jump_eval_coercion]); > else > LLVMBuildUnreachable(b); > > > src/backend/executor/execExprInterp.c > /* > * Checks if an error occurred either when evaluating JsonExpr.coercion_expr or > * in ExecEvalJsonCoercion(). If so, this sets JsonExprState.error to trigger > * the ON ERROR handling steps. > */ > void > ExecEvalJsonCoercionFinish(ExprState *state, ExprEvalStep *op) > { > } > > if (jbv == NULL) > { > /* Will be coerced with coercion_expr, if any. */ > *op->resvalue = (Datum) 0; > *op->resnull = true; > } > > > src/backend/executor/execExpr.c > /* > * Jump to coerce the NULL using coercion_expr if present. Coercing NULL > * is only interesting when the RETURNING type is a domain whose > * constraints must be checked. jsexpr->coercion_expr containing a > * CoerceToDomain node must have been set in that case. > */ > > /* > * Jump to coerce the NULL using coercion_expr if present. Coercing NULL > * is only interesting when the RETURNING type is a domain whose > * constraints must be checked. jsexpr->coercion_expr containing a > * CoerceToDomain node must have been set in that case. > */ Thanks for checking. Will push the attached 0001 and 0002 shortly. -- Thanks, Amit Langote
Attachment
> diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c > index 233b7b1cc9..df766cdec1 100644 > --- a/src/backend/parser/parse_expr.c > +++ b/src/backend/parser/parse_expr.c > @@ -3583,6 +3583,7 @@ coerceJsonFuncExpr(ParseState *pstate, Node *expr, > Node *res; > int location; > Oid exprtype = exprType(expr); > + int32 baseTypmod = returning->typmod; > > /* if output type is not specified or equals to function type, return */ > if (!OidIsValid(returning->typid) || returning->typid == exprtype) > @@ -3611,10 +3612,19 @@ coerceJsonFuncExpr(ParseState *pstate, Node *expr, > return (Node *) fexpr; > } > > + /* > + * For domains, consider the base type's typmod to decide whether to setup > + * an implicit or explicit cast. > + */ > + if (get_typtype(returning->typid) == TYPTYPE_DOMAIN) > + (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); I didn't review this patch in detail, but I just noticed this tiny bit and wanted to say that I don't like this coding style where you initialize a variable to a certain value, and much later you override it with a completely different value. It seems much clearer to leave it uninitialized at first, and have both places that determine the value together, if (get_typtype(returning->typid) == TYPTYPE_DOMAIN) (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); else baseTypmod = returning->typmod; Not only because in the domain case the initializer value is a downright lie, but also because of considerations such as if you later add code that uses the variable in between those two places, you'd be introducing a bug in the domain case because it hasn't been set. With the coding I propose, the compiler immediately tells you that the initialization is missing. TBH I'm not super clear on why we decide on explicit or implicit cast based on presence of a typmod. Why isn't it better to always use an implicit one? -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Small aircraft do not crash frequently ... usually only once!" (ponder, http://thedailywtf.com/)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: >> + /* >> + * For domains, consider the base type's typmod to decide whether to setup >> + * an implicit or explicit cast. >> + */ >> + if (get_typtype(returning->typid) == TYPTYPE_DOMAIN) >> + (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); > TBH I'm not super clear on why we decide on explicit or implicit cast > based on presence of a typmod. Why isn't it better to always use an > implicit one? Hmm ... there are a bunch of existing places that seem to have similar logic, but they are all in new-ish SQL/JSON functionality, and I would not be surprised if they are all wrong. parse_coerce.c is quite opinionated about what a domain's typtypmod means (see comments in coerce_type() for instance); see also the logic in coerce_to_domain: * If the domain applies a typmod to its base type, build the appropriate * coercion step. Mark it implicit for display purposes, because we don't * want it shown separately by ruleutils.c; but the isExplicit flag passed * to the conversion function depends on the manner in which the domain * coercion is invoked, so that the semantics of implicit and explicit * coercion differ. (Is that really the behavior we want?) I don't think that this SQL/JSON behavior quite matches that. While I'm bitching ... this coding style is bogus anyway: >> + if (get_typtype(returning->typid) == TYPTYPE_DOMAIN) >> + (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); because it results in two syscache lookups not one. You are supposed to apply getBaseTypeAndTypmod unconditionally, as is done everywhere except in the SQL/JSON logic. I am also wondering how it can possibly be sensible to throw away the function result of getBaseTypeAndTypmod in this context. regards, tom lane
On Sun, Jun 30, 2024 at 2:24 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > TBH I'm not super clear on why we decide on explicit or implicit cast > based on presence of a typmod. Why isn't it better to always use an > implicit one? > I am using an example to explain it. SELECT JSON_SERIALIZE(JSON('{ "a" : 1 } ')); we cannot directly use implicit cast from json to text in {coerceJsonFuncExpr, coerce_to_target_type} because function calls: coerceJsonFuncExpr->coerce_to_target_type->can_coerce_type ->find_coercion_pathway will look up pg_cast entries. but we don't have text & json implicit cast entries, we will fail at: ```` if (!res && report_error) ereport(ERROR, errcode(ERRCODE_CANNOT_COERCE), errmsg("cannot cast type %s to %s", format_type_be(exprtype), format_type_be(returning->typid)), parser_coercion_errposition(pstate, location, expr)); ```` Most of the cast uses explicit cast, which is what we previously did, then in this thread, we found out for the returning type typmod( (varchar, or varchar's domain) We need to first cast the expression to text then text to varchar via implicit cast. To trap the error: for example: SELECT JSON_SERIALIZE('{ "a" : 1 } ' RETURNING varchar(2); also see the comment: https://git.postgresql.org/cgit/postgresql.git/commit/?id=c2d93c3802b205d135d1ae1d7ac167d74e08a274 + /* + * Convert the source expression to text, because coerceJsonFuncExpr() + * will create an implicit cast to the RETURNING types with typmod and + * there are no implicit casts from json(b) to such types. For domains, + * the base type's typmod will be considered, so do so here too. + */ In general, I think implicit cast here is an exception. overall I come up with following logic: ----------------- int32 baseTypmod = -1; if (returning->typmod < 0) (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); else baseTypmod = returning->typmod; res = coerce_to_target_type(pstate, expr, exprtype, returning->typid, baseTypmod, baseTypmod > 0 ? COERCION_IMPLICIT : COERCION_EXPLICIT, baseTypmod > 0 ? COERCE_IMPLICIT_CAST : COERCE_EXPLICIT_CAST, location); ----------------- By the same way we are dealing with varchar, I came up with a verbose patch for transformJsonBehavior, which can cope with all the corner cases of bit and varbit data type. I also attached a test sql file (scratch169.sql) for it. some examples: --fail SELECT JSON_VALUE(jsonb '"111a"', '$' RETURNING bit(3) default '1111' on error); --ok SELECT JSON_VALUE(jsonb '"111a"', '$' RETURNING bit(3) default '111' on error); --ok SELECT JSON_VALUE(jsonb '"111a"', '$' RETURNING bit(3) default 32 on error); makeJsonConstructorExpr we called (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); later in coerceJsonFuncExpr we may also call (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); maybe we can do some refactoring.
Attachment
On Sun, Jun 30, 2024 at 3:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > >> + /* > >> + * For domains, consider the base type's typmod to decide whether to setup > >> + * an implicit or explicit cast. > >> + */ > >> + if (get_typtype(returning->typid) == TYPTYPE_DOMAIN) > >> + (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); > > > TBH I'm not super clear on why we decide on explicit or implicit cast > > based on presence of a typmod. Why isn't it better to always use an > > implicit one? > > Hmm ... there are a bunch of existing places that seem to have similar > logic, but they are all in new-ish SQL/JSON functionality, and I would > not be surprised if they are all wrong. parse_coerce.c is quite > opinionated about what a domain's typtypmod means (see comments in > coerce_type() for instance); see also the logic in coerce_to_domain: > > * If the domain applies a typmod to its base type, build the appropriate > * coercion step. Mark it implicit for display purposes, because we don't > * want it shown separately by ruleutils.c; but the isExplicit flag passed > * to the conversion function depends on the manner in which the domain > * coercion is invoked, so that the semantics of implicit and explicit > * coercion differ. (Is that really the behavior we want?) > > I don't think that this SQL/JSON behavior quite matches that. The reason I decided to go for the implicit cast only when there is a typmod is that the behavior with COERCION_EXPLICIT is only problematic when there's a typmod because of this code in build_coercion_expression: if (nargs == 3) { /* Pass it a boolean isExplicit parameter, too */ cons = makeConst(BOOLOID, -1, InvalidOid, sizeof(bool), BoolGetDatum(ccontext == COERCION_EXPLICIT), false, true); args = lappend(args, cons); } Yeah, we could have fixed that by always using COERCION_IMPLICIT for SQL/JSON but, as Jian said, we don't have a bunch of casts that these SQL/JSON functions need, which is why I guess we ended up with COERCION_EXPLICIT here in the first place. One option I hadn't tried was using COERCION_ASSIGNMENT instead, which seems to give coerceJsonFuncExpr() the casts it needs with the behavior it wants, so how about applying the attached? -- Thanks, Amit Langote
Attachment
On Mon, Jul 1, 2024 at 6:45 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Sun, Jun 30, 2024 at 3:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > >> + /* > > >> + * For domains, consider the base type's typmod to decide whether to setup > > >> + * an implicit or explicit cast. > > >> + */ > > >> + if (get_typtype(returning->typid) == TYPTYPE_DOMAIN) > > >> + (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); > > > > > TBH I'm not super clear on why we decide on explicit or implicit cast > > > based on presence of a typmod. Why isn't it better to always use an > > > implicit one? > > > > Hmm ... there are a bunch of existing places that seem to have similar > > logic, but they are all in new-ish SQL/JSON functionality, and I would > > not be surprised if they are all wrong. parse_coerce.c is quite > > opinionated about what a domain's typtypmod means (see comments in > > coerce_type() for instance); see also the logic in coerce_to_domain: > > > > * If the domain applies a typmod to its base type, build the appropriate > > * coercion step. Mark it implicit for display purposes, because we don't > > * want it shown separately by ruleutils.c; but the isExplicit flag passed > > * to the conversion function depends on the manner in which the domain > > * coercion is invoked, so that the semantics of implicit and explicit > > * coercion differ. (Is that really the behavior we want?) > > > > I don't think that this SQL/JSON behavior quite matches that. > > The reason I decided to go for the implicit cast only when there is a > typmod is that the behavior with COERCION_EXPLICIT is only problematic > when there's a typmod because of this code in > build_coercion_expression: > > if (nargs == 3) > { > /* Pass it a boolean isExplicit parameter, too */ > cons = makeConst(BOOLOID, > -1, > InvalidOid, > sizeof(bool), > BoolGetDatum(ccontext == COERCION_EXPLICIT), > false, > true); > > args = lappend(args, cons); > } > > Yeah, we could have fixed that by always using COERCION_IMPLICIT for > SQL/JSON but, as Jian said, we don't have a bunch of casts that these > SQL/JSON functions need, which is why I guess we ended up with > COERCION_EXPLICIT here in the first place. > > One option I hadn't tried was using COERCION_ASSIGNMENT instead, which > seems to give coerceJsonFuncExpr() the casts it needs with the > behavior it wants, so how about applying the attached? you patched works. i think it's because of you mentioned build_coercion_expression ` if (nargs == 3)` related code and find_coercion_pathway: if (result == COERCION_PATH_NONE) { if (ccontext >= COERCION_ASSIGNMENT && TypeCategory(targetTypeId) == TYPCATEGORY_STRING) result = COERCION_PATH_COERCEVIAIO; else if (ccontext >= COERCION_EXPLICIT && TypeCategory(sourceTypeId) == TYPCATEGORY_STRING) result = COERCION_PATH_COERCEVIAIO; } functions: JSON_OBJECT,JSON_ARRAY, JSON_ARRAYAGG,JSON_OBJECTAGG, JSON_SERIALIZE the returning type can only be string type or json. json type already being handled in other code. so the targetTypeId category will be only TYPCATEGORY_STRING.
On Tue, Jul 2, 2024 at 3:19 PM jian he <jian.universality@gmail.com> wrote: > On Mon, Jul 1, 2024 at 6:45 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > On Sun, Jun 30, 2024 at 3:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > > >> + /* > > > >> + * For domains, consider the base type's typmod to decide whether to setup > > > >> + * an implicit or explicit cast. > > > >> + */ > > > >> + if (get_typtype(returning->typid) == TYPTYPE_DOMAIN) > > > >> + (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); > > > > > > > TBH I'm not super clear on why we decide on explicit or implicit cast > > > > based on presence of a typmod. Why isn't it better to always use an > > > > implicit one? > > > > > > Hmm ... there are a bunch of existing places that seem to have similar > > > logic, but they are all in new-ish SQL/JSON functionality, and I would > > > not be surprised if they are all wrong. parse_coerce.c is quite > > > opinionated about what a domain's typtypmod means (see comments in > > > coerce_type() for instance); see also the logic in coerce_to_domain: > > > > > > * If the domain applies a typmod to its base type, build the appropriate > > > * coercion step. Mark it implicit for display purposes, because we don't > > > * want it shown separately by ruleutils.c; but the isExplicit flag passed > > > * to the conversion function depends on the manner in which the domain > > > * coercion is invoked, so that the semantics of implicit and explicit > > > * coercion differ. (Is that really the behavior we want?) > > > > > > I don't think that this SQL/JSON behavior quite matches that. > > > > The reason I decided to go for the implicit cast only when there is a > > typmod is that the behavior with COERCION_EXPLICIT is only problematic > > when there's a typmod because of this code in > > build_coercion_expression: > > > > if (nargs == 3) > > { > > /* Pass it a boolean isExplicit parameter, too */ > > cons = makeConst(BOOLOID, > > -1, > > InvalidOid, > > sizeof(bool), > > BoolGetDatum(ccontext == COERCION_EXPLICIT), > > false, > > true); > > > > args = lappend(args, cons); > > } > > > > Yeah, we could have fixed that by always using COERCION_IMPLICIT for > > SQL/JSON but, as Jian said, we don't have a bunch of casts that these > > SQL/JSON functions need, which is why I guess we ended up with > > COERCION_EXPLICIT here in the first place. > > > > One option I hadn't tried was using COERCION_ASSIGNMENT instead, which > > seems to give coerceJsonFuncExpr() the casts it needs with the > > behavior it wants, so how about applying the attached? > > you patched works. > i think it's because of you mentioned build_coercion_expression ` if > (nargs == 3)` related code > and > > find_coercion_pathway: > if (result == COERCION_PATH_NONE) > { > if (ccontext >= COERCION_ASSIGNMENT && > TypeCategory(targetTypeId) == TYPCATEGORY_STRING) > result = COERCION_PATH_COERCEVIAIO; > else if (ccontext >= COERCION_EXPLICIT && > TypeCategory(sourceTypeId) == TYPCATEGORY_STRING) > result = COERCION_PATH_COERCEVIAIO; > } > > functions: JSON_OBJECT,JSON_ARRAY, JSON_ARRAYAGG,JSON_OBJECTAGG, > JSON_SERIALIZE > the returning type can only be string type or json. json type already > being handled in other code. > so the targetTypeId category will be only TYPCATEGORY_STRING. Yes, thanks for confirming that. I checked other sites that use COERCION_ASSIGNMENT and I don't see a reason why it can't be used in this context. I'll push the patch tomorrow unless there are objections. -- Thanks, Amit Langote
On Tue, Jul 2, 2024 at 5:03 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Tue, Jul 2, 2024 at 3:19 PM jian he <jian.universality@gmail.com> wrote: > > On Mon, Jul 1, 2024 at 6:45 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > > > On Sun, Jun 30, 2024 at 3:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > > > >> + /* > > > > >> + * For domains, consider the base type's typmod to decide whether to setup > > > > >> + * an implicit or explicit cast. > > > > >> + */ > > > > >> + if (get_typtype(returning->typid) == TYPTYPE_DOMAIN) > > > > >> + (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod); > > > > > > > > > TBH I'm not super clear on why we decide on explicit or implicit cast > > > > > based on presence of a typmod. Why isn't it better to always use an > > > > > implicit one? > > > > > > > > Hmm ... there are a bunch of existing places that seem to have similar > > > > logic, but they are all in new-ish SQL/JSON functionality, and I would > > > > not be surprised if they are all wrong. parse_coerce.c is quite > > > > opinionated about what a domain's typtypmod means (see comments in > > > > coerce_type() for instance); see also the logic in coerce_to_domain: > > > > > > > > * If the domain applies a typmod to its base type, build the appropriate > > > > * coercion step. Mark it implicit for display purposes, because we don't > > > > * want it shown separately by ruleutils.c; but the isExplicit flag passed > > > > * to the conversion function depends on the manner in which the domain > > > > * coercion is invoked, so that the semantics of implicit and explicit > > > > * coercion differ. (Is that really the behavior we want?) > > > > > > > > I don't think that this SQL/JSON behavior quite matches that. > > > > > > The reason I decided to go for the implicit cast only when there is a > > > typmod is that the behavior with COERCION_EXPLICIT is only problematic > > > when there's a typmod because of this code in > > > build_coercion_expression: > > > > > > if (nargs == 3) > > > { > > > /* Pass it a boolean isExplicit parameter, too */ > > > cons = makeConst(BOOLOID, > > > -1, > > > InvalidOid, > > > sizeof(bool), > > > BoolGetDatum(ccontext == COERCION_EXPLICIT), > > > false, > > > true); > > > > > > args = lappend(args, cons); > > > } > > > > > > Yeah, we could have fixed that by always using COERCION_IMPLICIT for > > > SQL/JSON but, as Jian said, we don't have a bunch of casts that these > > > SQL/JSON functions need, which is why I guess we ended up with > > > COERCION_EXPLICIT here in the first place. > > > > > > One option I hadn't tried was using COERCION_ASSIGNMENT instead, which > > > seems to give coerceJsonFuncExpr() the casts it needs with the > > > behavior it wants, so how about applying the attached? > > > > you patched works. > > i think it's because of you mentioned build_coercion_expression ` if > > (nargs == 3)` related code > > and > > > > find_coercion_pathway: > > if (result == COERCION_PATH_NONE) > > { > > if (ccontext >= COERCION_ASSIGNMENT && > > TypeCategory(targetTypeId) == TYPCATEGORY_STRING) > > result = COERCION_PATH_COERCEVIAIO; > > else if (ccontext >= COERCION_EXPLICIT && > > TypeCategory(sourceTypeId) == TYPCATEGORY_STRING) > > result = COERCION_PATH_COERCEVIAIO; > > } > > > > functions: JSON_OBJECT,JSON_ARRAY, JSON_ARRAYAGG,JSON_OBJECTAGG, > > JSON_SERIALIZE > > the returning type can only be string type or json. json type already > > being handled in other code. > > so the targetTypeId category will be only TYPCATEGORY_STRING. > > Yes, thanks for confirming that. > > I checked other sites that use COERCION_ASSIGNMENT and I don't see a > reason why it can't be used in this context. > > I'll push the patch tomorrow unless there are objections. Sorry, I dropped the ball on this one. I've pushed the patch now. -- Thanks, Amit Langote
we still have problem in transformJsonBehavior currently transformJsonBehavior: SELECT JSON_VALUE(jsonb '1234', '$' RETURNING bit(3) DEFAULT 010111 ON ERROR); ERROR: cannot cast behavior expression of type text to bit LINE 1: ...VALUE(jsonb '1234', '$' RETURNING bit(3) DEFAULT 010111 ON ... here, 010111 will default to int4, so "cannot cast behavior expression of type text to bit" is wrong? also int4/int8 can be explicitly cast to bit(3), in this case, it should return 111. Also, do we want to deal with bit data type's typmod like we did for string type in transformJsonBehavior? like: SELECT JSON_VALUE(jsonb '"111"', '$' RETURNING bit(3) default '1111' on error); should return error: ERROR: bit string length 2 does not match type bit(3) or success The attached patch makes it return an error, similar to what we did for the fixed length string type.
Attachment
On Thu, Jul 18, 2024 at 3:04 PM jian he <jian.universality@gmail.com> wrote: > we still have problem in transformJsonBehavior > > currently transformJsonBehavior: > SELECT JSON_VALUE(jsonb '1234', '$' RETURNING bit(3) DEFAULT 010111 ON ERROR); > ERROR: cannot cast behavior expression of type text to bit > LINE 1: ...VALUE(jsonb '1234', '$' RETURNING bit(3) DEFAULT 010111 ON ... > > here, 010111 will default to int4, so "cannot cast behavior expression > of type text to bit" > is wrong? > also int4/int8 can be explicitly cast to bit(3), in this case, it > should return 111. I think we shouldn't try too hard in the code to "automatically" cast the DEFAULT expression, especially if that means having to add special case code for all sorts of source-target-type combinations. I'm inclined to just give a HINT to the user to cast the DEFAULT expression by hand, because they *can* do that with the syntax that exists. On the other hand, transformJsonBehavior() should handle other "internal" expressions for which the cast cannot be specified by hand. > Also, do we want to deal with bit data type's typmod like we did for > string type in transformJsonBehavior? > like: > SELECT JSON_VALUE(jsonb '"111"', '$' RETURNING bit(3) default '1111' on error); > should return error: > ERROR: bit string length 2 does not match type bit(3) > or success > > The attached patch makes it return an error, similar to what we did > for the fixed length string type. Yeah, that makes sense. I'm planning to push the attached 2 patches. 0001 is to fix transformJsonBehavior() for these cases and 0002 to adjust the behavior of casting the result of JSON_EXISTS() and EXISTS columns to integer type. I've included the tests in your patch in 0001. I noticed using cast expression to coerce the boolean constants to fixed-length types would produce unexpected errors when the planner's const-simplification calls the cast functions. So in 0001, I've made that case also use runtime coercion using json_populate_type(). -- Thanks, Amit Langote
Attachment
On Mon, Jul 22, 2024 at 4:46 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Thu, Jul 18, 2024 at 3:04 PM jian he <jian.universality@gmail.com> wrote: > > we still have problem in transformJsonBehavior > > > > currently transformJsonBehavior: > > SELECT JSON_VALUE(jsonb '1234', '$' RETURNING bit(3) DEFAULT 010111 ON ERROR); > > ERROR: cannot cast behavior expression of type text to bit > > LINE 1: ...VALUE(jsonb '1234', '$' RETURNING bit(3) DEFAULT 010111 ON ... > > > > here, 010111 will default to int4, so "cannot cast behavior expression > > of type text to bit" > > is wrong? > > also int4/int8 can be explicitly cast to bit(3), in this case, it > > should return 111. > > I think we shouldn't try too hard in the code to "automatically" cast > the DEFAULT expression, especially if that means having to add special > case code for all sorts of source-target-type combinations. > > I'm inclined to just give a HINT to the user to cast the DEFAULT > expression by hand, because they *can* do that with the syntax that > exists. select typname, typinput, pg_get_function_identity_arguments(typinput) from pg_type pt join pg_proc proc on proc.oid = pt.typinput where typtype = 'b' and typarray <> 0 and proc.pronargs > 1; As you can see from the query result, we only need to deal with bit and character type in this context. SELECT JSON_VALUE(jsonb '1234', '$.a' RETURNING bit(3) DEFAULT 10111 ON empty); SELECT JSON_VALUE(jsonb '1234', '$.a' RETURNING char(3) DEFAULT 10111 ON empty) ; the single quote literal ', no explicit cast, resolve to text type. no single quote like 11, no explicit cast, resolve to int type. we actually can cast int to bit, also have pg_cast entry. so the above these 2 examples should behave the same, given there is no pg_cast entry for int to text. select castsource::regtype ,casttarget::regtype ,castfunc,castcontext, castmethod from pg_cast where 'int'::regtype in (castsource::regtype ,casttarget::regtype); but i won't insist on it, since bit/varbit don't use that much. > > I'm planning to push the attached 2 patches. 0001 is to fix > transformJsonBehavior() for these cases and 0002 to adjust the > behavior of casting the result of JSON_EXISTS() and EXISTS columns to > integer type. I've included the tests in your patch in 0001. I > noticed using cast expression to coerce the boolean constants to > fixed-length types would produce unexpected errors when the planner's > const-simplification calls the cast functions. So in 0001, I've made > that case also use runtime coercion using json_populate_type(). > + <note> + <para> + If an <literal>ON ERROR</literal> or <literal>ON EMPTY</literal> + expression can't be coerced to the <literal>RETURNING</literal> type + successfully, an SQL NULL value will be returned. + </para> + </note> I think this change will have some controversy. the following are counterexamples SELECT JSON_value(jsonb '"aaa"', '$.a' RETURNING bool DEFAULT '"2022-01-01"' ON empty); return error, based on your change, should return NULL? SELECT JSON_QUERY(jsonb '"[3,4]"', '$.a' RETURNING bigint[] EMPTY array ON empty); SELECT JSON_QUERY(jsonb '"[3,4]"', '$.a' RETURNING bigint[] EMPTY object ON empty); Now things get more confusing. empty array, empty object refers to jsonb '[]' and jsonb '{}', both cannot explicitly be cast to bigint[], so both should return NULL based on your new implementation? omit/keep quotes applied when casting '"[1,2]"' to int4range. SELECT JSON_query(jsonb '"aaa"', '$.a' RETURNING int4range omit quotes DEFAULT '"[1,2]"'::jsonb ON empty); SELECT JSON_query(jsonb '"aaa"', '$.a' RETURNING int4range keep quotes DEFAULT '"[1,2]"'::jsonb ON empty); SELECT JSON_value(jsonb '"aaa"', '$.a' RETURNING date DEFAULT '"2022-01-01"'::jsonb ON empty); but jsonb cannot coerce to date., the example "select ('"2022-01-01"'::jsonb)::date; " yields an error. but why does this query still return a date?
While reviewing the patch, I found some inconsistency on json_table EXISTS. --tested based on your patch and master. src4=# SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a jsonb EXISTS PATH '$')); ERROR: cannot cast behavior expression of type boolean to jsonb src4=# SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a jsonb EXISTS PATH '$' error on error)); a ------ true (1 row) Why explicitly "error on error" not report error while not explicitly mentioning it yields an error? "(a jsonb EXISTS PATH '$' error on error)" returns jsonb 'true' imply that no errors happened. so "(a jsonb EXISTS PATH '$')" should not have any errors. but boolean cannot cast to jsonb so for JSON_TABLE, we should reject COLUMNS (a jsonb EXISTS PATH '$' error on error )); COLUMNS (a jsonb EXISTS PATH '$' unknown on error )); at an earlier stage. because json_populate_type will use literal 'true'/'false' cast to jsonb, which will not fail. but JsonPathExists returns are *not* quoted true/false. so rejecting it earlier is better than handling it at ExecEvalJsonExprPath. attached patch trying to solve the problem, changes applied based on your 0001, 0002. after apply attached patch: create domain djsonb as jsonb check(value = 'true'); SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a djsonb EXISTS PATH '$' error on error)); ERROR: cannot cast type boolean to djsonb SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a djsonb EXISTS PATH '$' unknown on error)); ERROR: cannot cast type boolean to djsonb SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a jsonb EXISTS PATH '$')); ERROR: cannot cast type boolean to jsonb i found out a typo in src/test/regress/expected/sqljson_queryfuncs.out, src/test/regress/sql/sqljson_queryfuncs.sql "fixed-legth" should be "fixed-length"
Attachment
On Tue, Jul 23, 2024 at 11:45 AM jian he <jian.universality@gmail.com> wrote: > On Mon, Jul 22, 2024 at 4:46 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > On Thu, Jul 18, 2024 at 3:04 PM jian he <jian.universality@gmail.com> wrote: > > > we still have problem in transformJsonBehavior > > > > > > currently transformJsonBehavior: > > > SELECT JSON_VALUE(jsonb '1234', '$' RETURNING bit(3) DEFAULT 010111 ON ERROR); > > > ERROR: cannot cast behavior expression of type text to bit > > > LINE 1: ...VALUE(jsonb '1234', '$' RETURNING bit(3) DEFAULT 010111 ON ... > > > > > > here, 010111 will default to int4, so "cannot cast behavior expression > > > of type text to bit" > > > is wrong? > > > also int4/int8 can be explicitly cast to bit(3), in this case, it > > > should return 111. > > > > I think we shouldn't try too hard in the code to "automatically" cast > > the DEFAULT expression, especially if that means having to add special > > case code for all sorts of source-target-type combinations. > > > > I'm inclined to just give a HINT to the user to cast the DEFAULT > > expression by hand, because they *can* do that with the syntax that > > exists. > > select typname, typinput, pg_get_function_identity_arguments(typinput) > from pg_type pt join pg_proc proc on proc.oid = pt.typinput > where typtype = 'b' and typarray <> 0 and proc.pronargs > 1; > > As you can see from the query result, we only need to deal with bit > and character type > in this context. > > SELECT JSON_VALUE(jsonb '1234', '$.a' RETURNING bit(3) DEFAULT 10111 ON empty); > SELECT JSON_VALUE(jsonb '1234', '$.a' RETURNING char(3) DEFAULT 10111 > ON empty) ; > > the single quote literal ', no explicit cast, resolve to text type. > no single quote like 11, no explicit cast, resolve to int type. > we actually can cast int to bit, also have pg_cast entry. > so the above these 2 examples should behave the same, given there is > no pg_cast entry for int to text. > > select castsource::regtype ,casttarget::regtype ,castfunc,castcontext, > castmethod > from pg_cast where 'int'::regtype in (castsource::regtype ,casttarget::regtype); > > but i won't insist on it, since bit/varbit don't use that much. The cast from int to bit that exists in pg_cast is only good for explicit casts, so would truncate user's value instead of flagging it as invalid input, and this whole discussion is about not doing that. With the DEFAULT expression specified or interpreted as a text string, we don't have that problem because we can then use CoerceViaIO as an assignment-level cast, whereby the invalid input *is* flagged as it should, like this: SELECT JSON_VALUE(jsonb '1234', '$' RETURNING bit(3) DEFAULT '11111' ON ERROR); ERROR: bit string length 5 does not match type bit(3) So it seems fair to me to flag it when the user specifies an integer in DEFAULT we can't create a cast expression that does not truncate a value to fit the RETURNING type. > > I'm planning to push the attached 2 patches. 0001 is to fix > > transformJsonBehavior() for these cases and 0002 to adjust the > > behavior of casting the result of JSON_EXISTS() and EXISTS columns to > > integer type. I've included the tests in your patch in 0001. I > > noticed using cast expression to coerce the boolean constants to > > fixed-length types would produce unexpected errors when the planner's > > const-simplification calls the cast functions. So in 0001, I've made > > that case also use runtime coercion using json_populate_type(). > > > > + <note> > + <para> > + If an <literal>ON ERROR</literal> or <literal>ON EMPTY</literal> > + expression can't be coerced to the <literal>RETURNING</literal> type > + successfully, an SQL NULL value will be returned. > + </para> > + </note> > > I think this change will have some controversy. On second thought, I agree. I've made some changes to *throw* the error when the JsonBehavior values fail being coerced to the RETURNING type. Please check the attached. In the attached patch, I've also taken care of the problem mentioned in your latest email -- the solution I've chosen is not to produce the error when ERROR ON ERROR is specified but to use runtime coercion also for the jsonb type or any type that is not integer. Also fixed the typos. Thanks for your attention! -- Thanks, Amit Langote
Attachment
On Tue, Jul 23, 2024 at 8:52 PM Amit Langote <amitlangote09@gmail.com> wrote: > > In the attached patch, I've also taken care of the problem mentioned > in your latest email -- the solution I've chosen is not to produce the > error when ERROR ON ERROR is specified but to use runtime coercion > also for the jsonb type or any type that is not integer. Also fixed > the typos. > > Thanks for your attention! > COLUMNS (col_name jsonb EXISTS PATH 'pah_expression') inconsistency seems resolved. I also tested the domain over jsonb, it works. transformJsonFuncExpr we have: case JSON_QUERY_OP: if (jsexpr->returning->typid != JSONBOID || jsexpr->omit_quotes) jsexpr->use_json_coercion = true; case JSON_VALUE_OP: if (jsexpr->returning->typid != TEXTOID) { if (get_typtype(jsexpr->returning->typid) == TYPTYPE_DOMAIN && DomainHasConstraints(jsexpr->returning->typid)) jsexpr->use_json_coercion = true; else jsexpr->use_io_coercion = true; } JSONBOID won't be a domain. for domain type, json_value, json_query will use jsexpr->use_json_coercion. jsexpr->use_json_coercion can handle whether the domain has constraints or not. so i don't know the purpose of following code in ExecInitJsonExpr if (get_typtype(jsexpr->returning->typid) == TYPTYPE_DOMAIN && DomainHasConstraints(jsexpr->returning->typid)) { Assert(jsexpr->use_json_coercion); scratch->opcode = EEOP_JUMP; scratch->d.jump.jumpdone = state->steps_len + 1; ExprEvalPushStep(state, scratch); } json_table exits works fine with int4, not domain over int4. The following are test suites. drop domain if exists dint4, dint4_1,dint4_0; create domain dint4 as int; create domain dint4_1 as int check ( value <> 1 ); create domain dint4_0 as int check ( value <> 0 ); SELECT a, a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4 EXISTS PATH '$.a' )); SELECT a, a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4 EXISTS PATH '$.a' false ON ERROR)); SELECT a, a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4 EXISTS PATH '$.a' ERROR ON ERROR)); SELECT a, a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4_0 EXISTS PATH '$.a')); SELECT a, a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4_0 EXISTS PATH '$')); SELECT a,a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4_1 EXISTS PATH '$')); SELECT a,a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4_1 EXISTS PATH '$.a')); SELECT a,a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4_1 EXISTS PATH '$.a' ERROR ON ERROR));
drop domain if exists djs; create domain djs as jsonb check ( value <> '"11"' ); SELECT JSON_QUERY(jsonb '"aaa"', '$.a' RETURNING djs keep quotes DEFAULT '"11"' ON empty); SELECT JSON_QUERY(jsonb '"aaa"', '$.a' RETURNING djs omit quotes DEFAULT '"11"' ON empty); SELECT JSON_QUERY(jsonb '"11"', '$' RETURNING djs omit quotes DEFAULT '"11"' ON empty); SELECT JSON_QUERY(jsonb '"aaa"', '$.a' RETURNING jsonb keep quotes DEFAULT '"11"' ON empty); SELECT JSON_QUERY(jsonb '"aaa"', '$.a' RETURNING jsonb omit quotes DEFAULT '"11"' ON empty); SELECT JSON_QUERY(jsonb '"aaa"', '$.a' RETURNING int4range omit quotes DEFAULT '"[1,2]"'::jsonb ON empty); SELECT JSON_QUERY(jsonb '"aaa"', '$.a' RETURNING int4range keep quotes DEFAULT '"[1,2]"'::jsonb ON empty); SELECT JSON_value(jsonb '"aaa"', '$.a' RETURNING int4range DEFAULT '"[1,2]"'::jsonb ON empty); ---------------------------- I found out 2 issues for the above tests. 1. RETURNING types is jsonb/domain over jsonb, default expression does not respect omit/keep quotes, but other RETURNING types do. Maybe this will be fine. 2. domain over jsonb should fail just like domain over other types? RETURNING djs keep quotes DEFAULT '"11"' ON empty should fail as ERROR: could not coerce ON EMPTY expression (DEFAULT) to the RETURNING type DETAIL: value for domain djs violates check constraint "djs_check"" errcode(ERRCODE_CANNOT_COERCE), errmsg("cannot cast behavior expression of type %s to %s", format_type_be(exprType(expr)), format_type_be(returning->typid)), errhint("You will need to cast the expression."), parser_errposition(pstate, exprLocation(expr))); maybe errhint("You will need to explicitly cast the expression to type %s", format_type_be(returning->typid))
On Wed, Jul 24, 2024 at 3:25 PM jian he <jian.universality@gmail.com> wrote: > transformJsonFuncExpr we have: > case JSON_QUERY_OP: > if (jsexpr->returning->typid != JSONBOID || jsexpr->omit_quotes) > jsexpr->use_json_coercion = true; > > case JSON_VALUE_OP: > if (jsexpr->returning->typid != TEXTOID) > { > if (get_typtype(jsexpr->returning->typid) == TYPTYPE_DOMAIN && > DomainHasConstraints(jsexpr->returning->typid)) > jsexpr->use_json_coercion = true; > else > jsexpr->use_io_coercion = true; > } > > JSONBOID won't be a domain. for domain type, json_value, json_query > will use jsexpr->use_json_coercion. > jsexpr->use_json_coercion can handle whether the domain has constraints or not. > > so i don't know the purpose of following code in ExecInitJsonExpr > if (get_typtype(jsexpr->returning->typid) == TYPTYPE_DOMAIN && > DomainHasConstraints(jsexpr->returning->typid)) > { > Assert(jsexpr->use_json_coercion); > scratch->opcode = EEOP_JUMP; > scratch->d.jump.jumpdone = state->steps_len + 1; > ExprEvalPushStep(state, scratch); > } Yeah, it's a useless JUMP. I forget why it's there. I have attached a patch (0005) to remove it. > json_table exits works fine with int4, not domain over int4. The > following are test suites. > > drop domain if exists dint4, dint4_1,dint4_0; > create domain dint4 as int; > create domain dint4_1 as int check ( value <> 1 ); > create domain dint4_0 as int check ( value <> 0 ); > SELECT a, a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4 > EXISTS PATH '$.a' )); > SELECT a, a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4 > EXISTS PATH '$.a' false ON ERROR)); > SELECT a, a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4 > EXISTS PATH '$.a' ERROR ON ERROR)); > SELECT a, a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4_0 > EXISTS PATH '$.a')); > SELECT a, a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4_0 > EXISTS PATH '$')); > SELECT a,a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4_1 > EXISTS PATH '$')); > SELECT a,a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4_1 > EXISTS PATH '$.a')); > SELECT a,a::bool FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a dint4_1 > EXISTS PATH '$.a' ERROR ON ERROR)); Domain-over-integer case should be fixed with the attached updated 0002. > I found out 2 issues for the above tests. > 1. RETURNING types is jsonb/domain over jsonb, default expression does > not respect omit/keep quotes, > but other RETURNING types do. Maybe this will be fine. Yeah, I am not sure whether and how we could implement OMIT/KEEP QUOTES for the DEFAULT expression. I might try later or simply document that OMIT/KEEP QUOTE is only applied to the query result but not the DEFAULT expression. > 2. domain over jsonb should fail just like domain over other types? > RETURNING djs keep quotes DEFAULT '"11"' ON empty > should fail as > ERROR: could not coerce ON EMPTY expression (DEFAULT) to the RETURNING type > DETAIL: value for domain djs violates check constraint "djs_check"" I think this should be fixed with the attached patch 0004. > errcode(ERRCODE_CANNOT_COERCE), > errmsg("cannot cast behavior expression of > type %s to %s", > format_type_be(exprType(expr)), > format_type_be(returning->typid)), > errhint("You will need to cast the expression."), > parser_errposition(pstate, exprLocation(expr))); > > maybe > errhint("You will need to explicitly cast the expression to type %s", > format_type_be(returning->typid)) OK, done. Please check. -- Thanks, Amit Langote
Attachment
- 0002-SQL-JSON-Fix-casting-for-integer-EXISTS-columns-in-J.patch
- 0001-SQL-JSON-Some-fixes-to-JsonBehavior-expression-casti.patch
- 0005-SQL-JSON-Remove-useless-code-in-ExecInitJsonExpr.patch
- 0004-SQL-JSON-Respect-OMIT-QUOTES-when-RETURNING-domain_t.patch
- 0003-SQL-JSON-Fix-handling-of-errors-coercing-JsonBehavio.patch
On Thu, Jul 25, 2024 at 11:16 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Wed, Jul 24, 2024 at 3:25 PM jian he <jian.universality@gmail.com> wrote: > > 2. domain over jsonb should fail just like domain over other types? > > RETURNING djs keep quotes DEFAULT '"11"' ON empty > > should fail as > > ERROR: could not coerce ON EMPTY expression (DEFAULT) to the RETURNING type > > DETAIL: value for domain djs violates check constraint "djs_check"" > > I think this should be fixed with the attached patch 0004. It is fixed but with the patch 0003, not 0004. Also, the test cases in 0004, which is a patch to fix a problem with OMIT QUOTES being disregarded when RETURNING domain-over-jsonb, didn't test that problem. So I have updated the test case to use a domain over jsonb. -- Thanks, Amit Langote
Attachment
- 0004-SQL-JSON-Respect-OMIT-QUOTES-when-RETURNING-domains-.patch
- 0005-SQL-JSON-Remove-useless-code-in-ExecInitJsonExpr.patch
- 0001-SQL-JSON-Some-fixes-to-JsonBehavior-expression-casti.patch
- 0003-SQL-JSON-Fix-handling-of-errors-coercing-JsonBehavio.patch
- 0002-SQL-JSON-Fix-casting-for-integer-EXISTS-columns-in-J.patch
On Fri, Jul 26, 2024 at 11:12 AM Amit Langote <amitlangote09@gmail.com> wrote: > On Thu, Jul 25, 2024 at 11:16 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Wed, Jul 24, 2024 at 3:25 PM jian he <jian.universality@gmail.com> wrote: > > > 2. domain over jsonb should fail just like domain over other types? > > > RETURNING djs keep quotes DEFAULT '"11"' ON empty > > > should fail as > > > ERROR: could not coerce ON EMPTY expression (DEFAULT) to the RETURNING type > > > DETAIL: value for domain djs violates check constraint "djs_check"" > > > > I think this should be fixed with the attached patch 0004. > > It is fixed but with the patch 0003, not 0004. > > Also, the test cases in 0004, which is a patch to fix a problem with > OMIT QUOTES being disregarded when RETURNING domain-over-jsonb, didn't > test that problem. So I have updated the test case to use a domain > over jsonb. Pushed 0003-0005 ahead of 0001-0002. Will try to push them over the weekend. Rebased for now. -- Thanks, Amit Langote
Attachment
On Fri, Jul 26, 2024 at 4:53 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > Pushed 0003-0005 ahead of 0001-0002. Will try to push them over the > weekend. Rebased for now. { ... /* * For expression nodes that support soft errors. Should be set to NULL * before calling ExecInitExprRec() if the caller wants errors thrown. */ ErrorSaveContext *escontext; } ExprState; i believe by default makeNode will set escontext to NULL. So the comment should be, if you want to catch the soft errors, make sure the escontext pointing to an allocated ErrorSaveContext. or maybe just say, default is NULL. Otherwise, the original comment's meaning feels like: we need to explicitly set it to NULL for certain operations, which I believe is false? struct { Oid targettype; int32 targettypmod; bool omit_quotes; bool exists_coerce; bool exists_cast_to_int; bool check_domain; void *json_coercion_cache; ErrorSaveContext *escontext; } jsonexpr_coercion; do we need to comment that "check_domain" is only used for JSON_EXISTS_OP? While reviewing it, I found another minor issue. json_behavior_type: ERROR_P { $$ = JSON_BEHAVIOR_ERROR; } | NULL_P { $$ = JSON_BEHAVIOR_NULL; } | TRUE_P { $$ = JSON_BEHAVIOR_TRUE; } | FALSE_P { $$ = JSON_BEHAVIOR_FALSE; } | UNKNOWN { $$ = JSON_BEHAVIOR_UNKNOWN; } | EMPTY_P ARRAY { $$ = JSON_BEHAVIOR_EMPTY_ARRAY; } | EMPTY_P OBJECT_P { $$ = JSON_BEHAVIOR_EMPTY_OBJECT; } /* non-standard, for Oracle compatibility only */ | EMPTY_P { $$ = JSON_BEHAVIOR_EMPTY_ARRAY; } ; EMPTY_P behaves the same as EMPTY_P ARRAY so for function GetJsonBehaviorConst, the following "case JSON_BEHAVIOR_EMPTY:" is wrong? case JSON_BEHAVIOR_NULL: case JSON_BEHAVIOR_UNKNOWN: case JSON_BEHAVIOR_EMPTY: val = (Datum) 0; isnull = true; typid = INT4OID; len = sizeof(int32); isbyval = true; break; also src/backend/utils/adt/ruleutils.c if (jexpr->on_error->btype != JSON_BEHAVIOR_EMPTY) get_json_behavior(jexpr->on_error, context, "ERROR"); for json_value, json_query, i believe we can save some circles in ExecInitJsonExpr if you don't specify on error, on empty can you please check the attached, based on your latest attachment.
Attachment
Hi, On Fri, Jul 26, 2024 at 11:19 PM jian he <jian.universality@gmail.com> wrote: > On Fri, Jul 26, 2024 at 4:53 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > > > Pushed 0003-0005 ahead of 0001-0002. Will try to push them over the > > weekend. Rebased for now. Pushed them now. > { > ... > /* > * For expression nodes that support soft errors. Should be set to NULL > * before calling ExecInitExprRec() if the caller wants errors thrown. > */ > ErrorSaveContext *escontext; > } ExprState; > > i believe by default makeNode will set escontext to NULL. > So the comment should be, if you want to catch the soft errors, make > sure the escontext pointing to an allocated ErrorSaveContext. > or maybe just say, default is NULL. > > Otherwise, the original comment's meaning feels like: we need to > explicitly set it to NULL > for certain operations, which I believe is false? OK, I'll look into updating this. > struct > { > Oid targettype; > int32 targettypmod; > bool omit_quotes; > bool exists_coerce; > bool exists_cast_to_int; > bool check_domain; > void *json_coercion_cache; > ErrorSaveContext *escontext; > } jsonexpr_coercion; > do we need to comment that "check_domain" is only used for JSON_EXISTS_OP? I've renamed it to exists_check_domain and added a comment that exists_* fields are relevant only for JSON_EXISTS_OP. > json_behavior_type: > ERROR_P { $$ = JSON_BEHAVIOR_ERROR; } > | NULL_P { $$ = JSON_BEHAVIOR_NULL; } > | TRUE_P { $$ = JSON_BEHAVIOR_TRUE; } > | FALSE_P { $$ = JSON_BEHAVIOR_FALSE; } > | UNKNOWN { $$ = JSON_BEHAVIOR_UNKNOWN; } > | EMPTY_P ARRAY { $$ = JSON_BEHAVIOR_EMPTY_ARRAY; } > | EMPTY_P OBJECT_P { $$ = JSON_BEHAVIOR_EMPTY_OBJECT; } > /* non-standard, for Oracle compatibility only */ > | EMPTY_P { $$ = JSON_BEHAVIOR_EMPTY_ARRAY; } > ; > > EMPTY_P behaves the same as EMPTY_P ARRAY > so for function GetJsonBehaviorConst, the following "case > JSON_BEHAVIOR_EMPTY:" is wrong? > > case JSON_BEHAVIOR_NULL: > case JSON_BEHAVIOR_UNKNOWN: > case JSON_BEHAVIOR_EMPTY: > val = (Datum) 0; > isnull = true; > typid = INT4OID; > len = sizeof(int32); > isbyval = true; > break; > > also src/backend/utils/adt/ruleutils.c > if (jexpr->on_error->btype != JSON_BEHAVIOR_EMPTY) > get_json_behavior(jexpr->on_error, context, "ERROR"); Something like the attached makes sense? While this meaningfully changes the deparsing output, there is no behavior change for JsonTable top-level path execution. That's because the behavior when there's an error in the execution of the top-level path is to throw it or return an empty set, which is handled in jsonpath_exec.c, not execExprInterp.c. > for json_value, json_query, i believe we can save some circles in > ExecInitJsonExpr > if you don't specify on error, on empty > > can you please check the attached, based on your latest attachment. Perhaps makes sense, though I haven't checked closely. I'll take a look next week. -- Thanks, Amit Langote
On Tue, Jul 30, 2024 at 12:59 PM Amit Langote <amitlangote09@gmail.com> wrote: > > Hi, > > On Fri, Jul 26, 2024 at 11:19 PM jian he <jian.universality@gmail.com> wrote: > > On Fri, Jul 26, 2024 at 4:53 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > > > > > > Pushed 0003-0005 ahead of 0001-0002. Will try to push them over the > > > weekend. Rebased for now. > > Pushed them now. > > > { > > ... > > /* > > * For expression nodes that support soft errors. Should be set to NULL > > * before calling ExecInitExprRec() if the caller wants errors thrown. > > */ > > ErrorSaveContext *escontext; > > } ExprState; > > > > i believe by default makeNode will set escontext to NULL. > > So the comment should be, if you want to catch the soft errors, make > > sure the escontext pointing to an allocated ErrorSaveContext. > > or maybe just say, default is NULL. > > > > Otherwise, the original comment's meaning feels like: we need to > > explicitly set it to NULL > > for certain operations, which I believe is false? > > OK, I'll look into updating this. > > > struct > > { > > Oid targettype; > > int32 targettypmod; > > bool omit_quotes; > > bool exists_coerce; > > bool exists_cast_to_int; > > bool check_domain; > > void *json_coercion_cache; > > ErrorSaveContext *escontext; > > } jsonexpr_coercion; > > do we need to comment that "check_domain" is only used for JSON_EXISTS_OP? > > I've renamed it to exists_check_domain and added a comment that > exists_* fields are relevant only for JSON_EXISTS_OP. > > > json_behavior_type: > > ERROR_P { $$ = JSON_BEHAVIOR_ERROR; } > > | NULL_P { $$ = JSON_BEHAVIOR_NULL; } > > | TRUE_P { $$ = JSON_BEHAVIOR_TRUE; } > > | FALSE_P { $$ = JSON_BEHAVIOR_FALSE; } > > | UNKNOWN { $$ = JSON_BEHAVIOR_UNKNOWN; } > > | EMPTY_P ARRAY { $$ = JSON_BEHAVIOR_EMPTY_ARRAY; } > > | EMPTY_P OBJECT_P { $$ = JSON_BEHAVIOR_EMPTY_OBJECT; } > > /* non-standard, for Oracle compatibility only */ > > | EMPTY_P { $$ = JSON_BEHAVIOR_EMPTY_ARRAY; } > > ; > > > > EMPTY_P behaves the same as EMPTY_P ARRAY > > so for function GetJsonBehaviorConst, the following "case > > JSON_BEHAVIOR_EMPTY:" is wrong? > > > > case JSON_BEHAVIOR_NULL: > > case JSON_BEHAVIOR_UNKNOWN: > > case JSON_BEHAVIOR_EMPTY: > > val = (Datum) 0; > > isnull = true; > > typid = INT4OID; > > len = sizeof(int32); > > isbyval = true; > > break; > > > > also src/backend/utils/adt/ruleutils.c > > if (jexpr->on_error->btype != JSON_BEHAVIOR_EMPTY) > > get_json_behavior(jexpr->on_error, context, "ERROR"); > > Something like the attached makes sense? While this meaningfully > changes the deparsing output, there is no behavior change for > JsonTable top-level path execution. That's because the behavior when > there's an error in the execution of the top-level path is to throw it > or return an empty set, which is handled in jsonpath_exec.c, not > execExprInterp.c. > hi amit. seems you forgot to attach the patch?
On Thu, Aug 22, 2024 at 11:02 jian he <jian.universality@gmail.com> wrote:
On Tue, Jul 30, 2024 at 12:59 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> Hi,
>
> On Fri, Jul 26, 2024 at 11:19 PM jian he <jian.universality@gmail.com> wrote:
> > On Fri, Jul 26, 2024 at 4:53 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > >
> > >
> > > Pushed 0003-0005 ahead of 0001-0002. Will try to push them over the
> > > weekend. Rebased for now.
>
> Pushed them now.
>
> > {
> > ...
> > /*
> > * For expression nodes that support soft errors. Should be set to NULL
> > * before calling ExecInitExprRec() if the caller wants errors thrown.
> > */
> > ErrorSaveContext *escontext;
> > } ExprState;
> >
> > i believe by default makeNode will set escontext to NULL.
> > So the comment should be, if you want to catch the soft errors, make
> > sure the escontext pointing to an allocated ErrorSaveContext.
> > or maybe just say, default is NULL.
> >
> > Otherwise, the original comment's meaning feels like: we need to
> > explicitly set it to NULL
> > for certain operations, which I believe is false?
>
> OK, I'll look into updating this.
>
> > struct
> > {
> > Oid targettype;
> > int32 targettypmod;
> > bool omit_quotes;
> > bool exists_coerce;
> > bool exists_cast_to_int;
> > bool check_domain;
> > void *json_coercion_cache;
> > ErrorSaveContext *escontext;
> > } jsonexpr_coercion;
> > do we need to comment that "check_domain" is only used for JSON_EXISTS_OP?
>
> I've renamed it to exists_check_domain and added a comment that
> exists_* fields are relevant only for JSON_EXISTS_OP.
>
> > json_behavior_type:
> > ERROR_P { $$ = JSON_BEHAVIOR_ERROR; }
> > | NULL_P { $$ = JSON_BEHAVIOR_NULL; }
> > | TRUE_P { $$ = JSON_BEHAVIOR_TRUE; }
> > | FALSE_P { $$ = JSON_BEHAVIOR_FALSE; }
> > | UNKNOWN { $$ = JSON_BEHAVIOR_UNKNOWN; }
> > | EMPTY_P ARRAY { $$ = JSON_BEHAVIOR_EMPTY_ARRAY; }
> > | EMPTY_P OBJECT_P { $$ = JSON_BEHAVIOR_EMPTY_OBJECT; }
> > /* non-standard, for Oracle compatibility only */
> > | EMPTY_P { $$ = JSON_BEHAVIOR_EMPTY_ARRAY; }
> > ;
> >
> > EMPTY_P behaves the same as EMPTY_P ARRAY
> > so for function GetJsonBehaviorConst, the following "case
> > JSON_BEHAVIOR_EMPTY:" is wrong?
> >
> > case JSON_BEHAVIOR_NULL:
> > case JSON_BEHAVIOR_UNKNOWN:
> > case JSON_BEHAVIOR_EMPTY:
> > val = (Datum) 0;
> > isnull = true;
> > typid = INT4OID;
> > len = sizeof(int32);
> > isbyval = true;
> > break;
> >
> > also src/backend/utils/adt/ruleutils.c
> > if (jexpr->on_error->btype != JSON_BEHAVIOR_EMPTY)
> > get_json_behavior(jexpr->on_error, context, "ERROR");
>
> Something like the attached makes sense? While this meaningfully
> changes the deparsing output, there is no behavior change for
> JsonTable top-level path execution. That's because the behavior when
> there's an error in the execution of the top-level path is to throw it
> or return an empty set, which is handled in jsonpath_exec.c, not
> execExprInterp.c.
>
hi amit.
seems you forgot to attach the patch?
Yeah, I had planned to look at this after my vacation earlier this month, but I context switched into working on another project and lost track of this. I’ll make some time next week to fix whatever remains go be fixed here. Thanks for the reminder.
On Mon, Sep 2, 2024 at 4:18 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > See 0001. > > > > > > See 0002. > > > > I'm also attaching 0003 to fix a minor annoyance that JSON_TABLE() > > columns' default ON ERROR, ON EMPTY behaviors are unnecessarily > > emitted in the deparsed output when the top-level ON ERROR behavior is > > ERROR. > > > > Will push these on Monday. v2-0001 looks good to me. +-- Test JSON_TABLE() column deparsing -- don't emit default ON ERROR / EMPTY +-- behavior +EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text PATH '$')); +EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text PATH '$') ERROR ON ERROR); Are these tests duplicated? appears both in v2-0002 and v2-0003. 0002 output is: +EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text PATH '$') ERROR ON ERROR); + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------------------------ + Table Function Scan on "json_table" (cost=0.01..1.00 rows=100 width=32) + Output: a + Table Function Call: JSON_TABLE('"a"'::jsonb, '$' AS json_table_path_0 COLUMNS (a text PATH '$' NULL ON EMPTY NULL ON ERROR) ERROR ON ERROR) +(3 rows) 0003 output is: EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text PATH '$') ERROR ON ERROR); QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Table Function Scan on "json_table" (cost=0.01..1.00 rows=100 width=32) Output: a Table Function Call: JSON_TABLE('"a"'::jsonb, '$' AS json_table_path_0 COLUMNS (a text PATH '$') ERROR ON ERROR) (3 rows) two patches with different output, overall we should merge 0002 and 0003? if (jsexpr->on_error && jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR && (jsexpr->on_error->btype != JSON_BEHAVIOR_NULL || returning_domain)) we can be simplified as if ( jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR && (jsexpr->on_error->btype != JSON_BEHAVIOR_NULL || returning_domain)) since if jsexpr->on_error is NULL, then segfault will appear at the beginning of ExecInitJsonExpr + * + * Only add the extra steps for a NULL-valued expression when RETURNING a + * domain type to check the constraints, if any. */ + jsestate->jump_error = state->steps_len; if (jsexpr->on_error && - jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR) + jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR && + (jsexpr->on_error->btype != JSON_BEHAVIOR_NULL || returning_domain)) + * + * Only add the extra steps for a NULL-valued expression when RETURNING a + * domain type to check the constraints, if any. */ + jsestate->jump_empty = state->steps_len; if (jsexpr->on_empty != NULL && - jsexpr->on_empty->btype != JSON_BEHAVIOR_ERROR) + jsexpr->on_empty->btype != JSON_BEHAVIOR_ERROR && + (jsexpr->on_empty->btype != JSON_BEHAVIOR_NULL || returning_domain)) I am a little bit confused with the comments. not sure the "NULL-valued expression" refers to. i think it is: implicitly default for ON EMPTY | ERROR clause is NULL (JSON_BEHAVIOR_NULL) for that situation, we can skip the json coercion process, but this only applies when the returning type of JsonExpr is not domain,
On Thu, Sep 5, 2024 at 9:58 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Tue, Sep 3, 2024 at 6:05 PM jian he <jian.universality@gmail.com> wrote: > > On Mon, Sep 2, 2024 at 4:18 PM Amit Langote <amitlangote09@gmail.com> wrote: > > v2-0001 looks good to me. > > +-- Test JSON_TABLE() column deparsing -- don't emit default ON ERROR / EMPTY > > +-- behavior > > +EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text PATH '$')); > > +EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text > > PATH '$') ERROR ON ERROR); > > > > Are these tests duplicated? appears both in v2-0002 and v2-0003. > > > > 0002 output is: > > +EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text > > PATH '$') ERROR ON ERROR); > > + > > QUERY PLAN > > +------------------------------------------------------------------------------------------------------------------------------------------------ > > + Table Function Scan on "json_table" (cost=0.01..1.00 rows=100 width=32) > > + Output: a > > + Table Function Call: JSON_TABLE('"a"'::jsonb, '$' AS > > json_table_path_0 COLUMNS (a text PATH '$' NULL ON EMPTY NULL ON > > ERROR) ERROR ON ERROR) > > +(3 rows) > > > > 0003 output is: > > EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text > > PATH '$') ERROR ON ERROR); > > QUERY PLAN > > -------------------------------------------------------------------------------------------------------------------- > > Table Function Scan on "json_table" (cost=0.01..1.00 rows=100 width=32) > > Output: a > > Table Function Call: JSON_TABLE('"a"'::jsonb, '$' AS > > json_table_path_0 COLUMNS (a text PATH '$') ERROR ON ERROR) > > (3 rows) > > > > two patches with different output, > > overall we should merge 0002 and 0003? > > Looks like I ordered the patches wrong. I'd like to commit the two separately. > > > if (jsexpr->on_error && > > jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR && > > (jsexpr->on_error->btype != JSON_BEHAVIOR_NULL || returning_domain)) > > > > we can be simplified as > > if ( jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR && > > (jsexpr->on_error->btype != JSON_BEHAVIOR_NULL || returning_domain)) > > > > since if jsexpr->on_error is NULL, then segfault will appear at the beginning of > > ExecInitJsonExpr > > Ok, done. > > > + * > > + * Only add the extra steps for a NULL-valued expression when RETURNING a > > + * domain type to check the constraints, if any. > > */ > > + jsestate->jump_error = state->steps_len; > > if (jsexpr->on_error && > > - jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR) > > + jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR && > > + (jsexpr->on_error->btype != JSON_BEHAVIOR_NULL || returning_domain)) > > > > + * > > + * Only add the extra steps for a NULL-valued expression when RETURNING a > > + * domain type to check the constraints, if any. > > */ > > + jsestate->jump_empty = state->steps_len; > > if (jsexpr->on_empty != NULL && > > - jsexpr->on_empty->btype != JSON_BEHAVIOR_ERROR) > > + jsexpr->on_empty->btype != JSON_BEHAVIOR_ERROR && > > + (jsexpr->on_empty->btype != JSON_BEHAVIOR_NULL || returning_domain)) > > > > I am a little bit confused with the comments. > > not sure the "NULL-valued expression" refers to. > > It refers to on_error/on_empty->expr, which is a Const node encoding > the NULL (constisnull is true) for that behavior. > > That's actually the case also for behaviors UNKNOWN and EMPTY, so the > condition should actually be checking whether the expr is a > NULL-valued expression. > > I've updated the patch. > > > i think it is: > > implicitly default for ON EMPTY | ERROR clause is NULL (JSON_BEHAVIOR_NULL) > > for that situation, we can skip the json coercion process, > > but this only applies when the returning type of JsonExpr is not domain, > > I've reworded the comment to mention that this speeds up the default > ON ERROR / EMPTY handling for JSON_QUERY() and JSON_VALUE(). > > Plan to push these tomorrow. Pushed. -- Thanks, Amit Langote
On Fri, Sep 6, 2024 at 12:07 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Thu, Sep 5, 2024 at 9:58 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Tue, Sep 3, 2024 at 6:05 PM jian he <jian.universality@gmail.com> wrote: > > > On Mon, Sep 2, 2024 at 4:18 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > v2-0001 looks good to me. > > > +-- Test JSON_TABLE() column deparsing -- don't emit default ON ERROR / EMPTY > > > +-- behavior > > > +EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text PATH '$')); > > > +EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text > > > PATH '$') ERROR ON ERROR); > > > > > > Are these tests duplicated? appears both in v2-0002 and v2-0003. > > > > > > 0002 output is: > > > +EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text > > > PATH '$') ERROR ON ERROR); > > > + > > > QUERY PLAN > > > +------------------------------------------------------------------------------------------------------------------------------------------------ > > > + Table Function Scan on "json_table" (cost=0.01..1.00 rows=100 width=32) > > > + Output: a > > > + Table Function Call: JSON_TABLE('"a"'::jsonb, '$' AS > > > json_table_path_0 COLUMNS (a text PATH '$' NULL ON EMPTY NULL ON > > > ERROR) ERROR ON ERROR) > > > +(3 rows) > > > > > > 0003 output is: > > > EXPLAIN VERBOSE SELECT * from JSON_TABLE('"a"', '$' COLUMNS (a text > > > PATH '$') ERROR ON ERROR); > > > QUERY PLAN > > > -------------------------------------------------------------------------------------------------------------------- > > > Table Function Scan on "json_table" (cost=0.01..1.00 rows=100 width=32) > > > Output: a > > > Table Function Call: JSON_TABLE('"a"'::jsonb, '$' AS > > > json_table_path_0 COLUMNS (a text PATH '$') ERROR ON ERROR) > > > (3 rows) > > > > > > two patches with different output, > > > overall we should merge 0002 and 0003? > > > > Looks like I ordered the patches wrong. I'd like to commit the two separately. > > > > > if (jsexpr->on_error && > > > jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR && > > > (jsexpr->on_error->btype != JSON_BEHAVIOR_NULL || returning_domain)) > > > > > > we can be simplified as > > > if ( jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR && > > > (jsexpr->on_error->btype != JSON_BEHAVIOR_NULL || returning_domain)) > > > > > > since if jsexpr->on_error is NULL, then segfault will appear at the beginning of > > > ExecInitJsonExpr > > > > Ok, done. > > > > > + * > > > + * Only add the extra steps for a NULL-valued expression when RETURNING a > > > + * domain type to check the constraints, if any. > > > */ > > > + jsestate->jump_error = state->steps_len; > > > if (jsexpr->on_error && > > > - jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR) > > > + jsexpr->on_error->btype != JSON_BEHAVIOR_ERROR && > > > + (jsexpr->on_error->btype != JSON_BEHAVIOR_NULL || returning_domain)) > > > > > > + * > > > + * Only add the extra steps for a NULL-valued expression when RETURNING a > > > + * domain type to check the constraints, if any. > > > */ > > > + jsestate->jump_empty = state->steps_len; > > > if (jsexpr->on_empty != NULL && > > > - jsexpr->on_empty->btype != JSON_BEHAVIOR_ERROR) > > > + jsexpr->on_empty->btype != JSON_BEHAVIOR_ERROR && > > > + (jsexpr->on_empty->btype != JSON_BEHAVIOR_NULL || returning_domain)) > > > > > > I am a little bit confused with the comments. > > > not sure the "NULL-valued expression" refers to. > > > > It refers to on_error/on_empty->expr, which is a Const node encoding > > the NULL (constisnull is true) for that behavior. > > > > That's actually the case also for behaviors UNKNOWN and EMPTY, so the > > condition should actually be checking whether the expr is a > > NULL-valued expression. > > > > I've updated the patch. > > > > > i think it is: > > > implicitly default for ON EMPTY | ERROR clause is NULL (JSON_BEHAVIOR_NULL) > > > for that situation, we can skip the json coercion process, > > > but this only applies when the returning type of JsonExpr is not domain, > > > > I've reworded the comment to mention that this speeds up the default > > ON ERROR / EMPTY handling for JSON_QUERY() and JSON_VALUE(). > > > > Plan to push these tomorrow. > > Pushed. Reverted 0002-0004 from both master and REL_17_STABLE due to BF failures. 0002-0003 are easily fixed by changing the newly added tests to not use EXPLAIN VERBOSE to test deparsing related changes, so will re-push those shortly. 0004 perhaps doesn't play nicely with LLVM compilation but I don't yet understand why. -- Thanks, Amit Langote