Re: Casts from jsonb to other types should cope with json null - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Casts from jsonb to other types should cope with json null
Date
Msg-id 224048.1722715813@sss.pgh.pa.us
Whole thread Raw
In response to Casts from jsonb to other types should cope with json null  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-hackers
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Thursday, August 1, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think what is happening there is you're getting the fallback
>> "cast via I/O" behavior.  There's no jsonb->text cast function
>> in the catalogs.
>> Perhaps it's worth adding one, so that it can be made to behave
>> similarly to the casts to other types.

> I’m not too keen on opening Pandora’s box here even if I do regret our
> current choices.  Semantic casting of json scalar strings only, and doing
> document serialization as a function, would have been better in hindsight.
> ...
> I read the discussion thread [1] that added this and while one person
> mentioned json null no one replied to that point and seemingly no explicit
> consideration for treating json null semantically was ever done - i.e. this
> fails only because in json null has its own type, and the test were type,
> not value, oriented.  As SQL null is a value only, whose type is whatever
> holds it, I’d argue our lack of doing this even constitutes a bug but
> wouldn’t - and turning errors into non-errors has a lower “bug acceptance
> threshold”.

Yeah, it's clear that this wasn't thought about too hard, because
I discovered that changing this behavior affects exactly zero existing
regression test cases :-(.  But I still think we should probably
change it.  Aside from ->>, we have other operators/functions that
convert jsonb values to SQL types, such as #>>,
jsonb_array_elements_text, jsonb_each_text, and AFAICS every one
of those translates JSON null to SQL NULL.

Attached are some actual patches for this.  0001 just changes the
existing json-to-scalar casts; since those previously threw
an error, I think that change is not too controversial.  0002 is
POC for changing the behavior of jsonb::text.  I don't think
it's committable as-is, because we have the same behavior for
jsonb::varchar and other string-category target types.  That's
not all that hard to fix, but I've not done so pending a decision
on whether we want 0002.

It strikes me that there's also a question of whether json::text
should translate 'null' like this.  I'm inclined to think not,
since json is in some sense a domain over text, but it's certainly
debatable.

            regards, tom lane

From e3c78ab3e9f1f7cbfec9b04933fd774ae2008780 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat, 3 Aug 2024 14:43:46 -0400
Subject: [PATCH v1 1/2] Make jsonb casts to scalar types translate JSON null
 to SQL NULL.

Formerly, these cases threw an error "cannot cast jsonb null to type
<whatever>".  That seems less than helpful though.  It's also
inconsistent with the behavior of the ->> operator, which translates
JSON null to SQL NULL rather than 'null'.

Discussion: https://postgr.es/m/3851203.1722552717@sss.pgh.pa.us
---
 src/backend/utils/adt/jsonb.c       | 77 ++++++++++++++++++++++++++---
 src/test/regress/expected/jsonb.out | 66 +++++++++++++++++++++++++
 src/test/regress/sql/jsonb.sql      | 13 +++++
 3 files changed, 149 insertions(+), 7 deletions(-)

diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
index 928552d551..ed054d5d42 100644
--- a/src/backend/utils/adt/jsonb.c
+++ b/src/backend/utils/adt/jsonb.c
@@ -2040,7 +2040,16 @@ jsonb_bool(PG_FUNCTION_ARGS)
     Jsonb       *in = PG_GETARG_JSONB_P(0);
     JsonbValue    v;

-    if (!JsonbExtractScalar(&in->root, &v) || v.type != jbvBool)
+    if (!JsonbExtractScalar(&in->root, &v))
+        cannotCastJsonbValue(v.type, "boolean");
+
+    if (v.type == jbvNull)
+    {
+        PG_FREE_IF_COPY(in, 0);
+        PG_RETURN_NULL();
+    }
+
+    if (v.type != jbvBool)
         cannotCastJsonbValue(v.type, "boolean");

     PG_FREE_IF_COPY(in, 0);
@@ -2055,7 +2064,16 @@ jsonb_numeric(PG_FUNCTION_ARGS)
     JsonbValue    v;
     Numeric        retValue;

-    if (!JsonbExtractScalar(&in->root, &v) || v.type != jbvNumeric)
+    if (!JsonbExtractScalar(&in->root, &v))
+        cannotCastJsonbValue(v.type, "numeric");
+
+    if (v.type == jbvNull)
+    {
+        PG_FREE_IF_COPY(in, 0);
+        PG_RETURN_NULL();
+    }
+
+    if (v.type != jbvNumeric)
         cannotCastJsonbValue(v.type, "numeric");

     /*
@@ -2076,7 +2094,16 @@ jsonb_int2(PG_FUNCTION_ARGS)
     JsonbValue    v;
     Datum        retValue;

-    if (!JsonbExtractScalar(&in->root, &v) || v.type != jbvNumeric)
+    if (!JsonbExtractScalar(&in->root, &v))
+        cannotCastJsonbValue(v.type, "smallint");
+
+    if (v.type == jbvNull)
+    {
+        PG_FREE_IF_COPY(in, 0);
+        PG_RETURN_NULL();
+    }
+
+    if (v.type != jbvNumeric)
         cannotCastJsonbValue(v.type, "smallint");

     retValue = DirectFunctionCall1(numeric_int2,
@@ -2094,7 +2121,16 @@ jsonb_int4(PG_FUNCTION_ARGS)
     JsonbValue    v;
     Datum        retValue;

-    if (!JsonbExtractScalar(&in->root, &v) || v.type != jbvNumeric)
+    if (!JsonbExtractScalar(&in->root, &v))
+        cannotCastJsonbValue(v.type, "integer");
+
+    if (v.type == jbvNull)
+    {
+        PG_FREE_IF_COPY(in, 0);
+        PG_RETURN_NULL();
+    }
+
+    if (v.type != jbvNumeric)
         cannotCastJsonbValue(v.type, "integer");

     retValue = DirectFunctionCall1(numeric_int4,
@@ -2112,7 +2148,16 @@ jsonb_int8(PG_FUNCTION_ARGS)
     JsonbValue    v;
     Datum        retValue;

-    if (!JsonbExtractScalar(&in->root, &v) || v.type != jbvNumeric)
+    if (!JsonbExtractScalar(&in->root, &v))
+        cannotCastJsonbValue(v.type, "bigint");
+
+    if (v.type == jbvNull)
+    {
+        PG_FREE_IF_COPY(in, 0);
+        PG_RETURN_NULL();
+    }
+
+    if (v.type != jbvNumeric)
         cannotCastJsonbValue(v.type, "bigint");

     retValue = DirectFunctionCall1(numeric_int8,
@@ -2130,7 +2175,16 @@ jsonb_float4(PG_FUNCTION_ARGS)
     JsonbValue    v;
     Datum        retValue;

-    if (!JsonbExtractScalar(&in->root, &v) || v.type != jbvNumeric)
+    if (!JsonbExtractScalar(&in->root, &v))
+        cannotCastJsonbValue(v.type, "real");
+
+    if (v.type == jbvNull)
+    {
+        PG_FREE_IF_COPY(in, 0);
+        PG_RETURN_NULL();
+    }
+
+    if (v.type != jbvNumeric)
         cannotCastJsonbValue(v.type, "real");

     retValue = DirectFunctionCall1(numeric_float4,
@@ -2148,7 +2202,16 @@ jsonb_float8(PG_FUNCTION_ARGS)
     JsonbValue    v;
     Datum        retValue;

-    if (!JsonbExtractScalar(&in->root, &v) || v.type != jbvNumeric)
+    if (!JsonbExtractScalar(&in->root, &v))
+        cannotCastJsonbValue(v.type, "double precision");
+
+    if (v.type == jbvNull)
+    {
+        PG_FREE_IF_COPY(in, 0);
+        PG_RETURN_NULL();
+    }
+
+    if (v.type != jbvNumeric)
         cannotCastJsonbValue(v.type, "double precision");

     retValue = DirectFunctionCall1(numeric_float8,
diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out
index e66d760189..c0bd0e76ae 100644
--- a/src/test/regress/expected/jsonb.out
+++ b/src/test/regress/expected/jsonb.out
@@ -5599,6 +5599,12 @@ select 'true'::jsonb::bool;
  t
 (1 row)

+select 'null'::jsonb::bool;
+ bool
+------
+
+(1 row)
+
 select '[]'::jsonb::bool;
 ERROR:  cannot cast jsonb array to type boolean
 select '1.0'::jsonb::float;
@@ -5607,22 +5613,82 @@ select '1.0'::jsonb::float;
       1
 (1 row)

+select 'null'::jsonb::float;
+ float8
+--------
+
+(1 row)
+
 select '[1.0]'::jsonb::float;
 ERROR:  cannot cast jsonb array to type double precision
+select '1.0'::jsonb::float4;
+ float4
+--------
+      1
+(1 row)
+
+select 'null'::jsonb::float4;
+ float4
+--------
+
+(1 row)
+
+select '[1.0]'::jsonb::float4;
+ERROR:  cannot cast jsonb array to type real
+select '12345'::jsonb::int2;
+ int2
+-------
+ 12345
+(1 row)
+
+select 'null'::jsonb::int2;
+ int2
+------
+
+(1 row)
+
+select '"hello"'::jsonb::int2;
+ERROR:  cannot cast jsonb string to type smallint
 select '12345'::jsonb::int4;
  int4
 -------
  12345
 (1 row)

+select 'null'::jsonb::int4;
+ int4
+------
+
+(1 row)
+
 select '"hello"'::jsonb::int4;
 ERROR:  cannot cast jsonb string to type integer
+select '12345'::jsonb::int8;
+ int8
+-------
+ 12345
+(1 row)
+
+select 'null'::jsonb::int8;
+ int8
+------
+
+(1 row)
+
+select '"hello"'::jsonb::int8;
+ERROR:  cannot cast jsonb string to type bigint
 select '12345'::jsonb::numeric;
  numeric
 ---------
    12345
 (1 row)

+select 'null'::jsonb::numeric;
+ numeric
+---------
+
+(1 row)
+
 select '{}'::jsonb::numeric;
 ERROR:  cannot cast jsonb object to type numeric
 select '12345.05'::jsonb::numeric;
diff --git a/src/test/regress/sql/jsonb.sql b/src/test/regress/sql/jsonb.sql
index 97bc2242a1..1bcafe8cfb 100644
--- a/src/test/regress/sql/jsonb.sql
+++ b/src/test/regress/sql/jsonb.sql
@@ -1537,12 +1537,25 @@ select ts_headline('[]'::jsonb, tsquery('aaa & bbb'));

 -- casts
 select 'true'::jsonb::bool;
+select 'null'::jsonb::bool;
 select '[]'::jsonb::bool;
 select '1.0'::jsonb::float;
+select 'null'::jsonb::float;
 select '[1.0]'::jsonb::float;
+select '1.0'::jsonb::float4;
+select 'null'::jsonb::float4;
+select '[1.0]'::jsonb::float4;
+select '12345'::jsonb::int2;
+select 'null'::jsonb::int2;
+select '"hello"'::jsonb::int2;
 select '12345'::jsonb::int4;
+select 'null'::jsonb::int4;
 select '"hello"'::jsonb::int4;
+select '12345'::jsonb::int8;
+select 'null'::jsonb::int8;
+select '"hello"'::jsonb::int8;
 select '12345'::jsonb::numeric;
+select 'null'::jsonb::numeric;
 select '{}'::jsonb::numeric;
 select '12345.05'::jsonb::numeric;
 select '12345.05'::jsonb::float4;
--
2.43.5

From 8fca52bafc5b6b8dfc8d757c80a359202a56226c Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat, 3 Aug 2024 15:53:11 -0400
Subject: [PATCH v1 2/2] Make jsonb cast to text translate JSON null to SQL
 NULL.

Previously, if the input was a JSON scalar null, you got the
string 'null'.  While that's not unreasonable in isolation, it's
inconsistent with the behavior of the ->> operator.  I think
that (jsonb->'fld')::text should produce results identical to
jsonb->>'fld', but up to now they disagree on this point.
Some other jsonb processing functions such as jsonb_array_elements
also translate JSON null to SQL NULL, and the preceding patch
in this series applies that rule to other jsonb-to-scalar casts.

This patch is incomplete: it only touches the behavior for casting
to text.  The pre-existing cast-via-I/O behavior applies to any
string-category result type, so if we want consistency we'd better
create explicit cast rules for varchar, bpchar, and name as well
as text.  (The regression tests added here demonstrate this
inconsistency.)  I've not done that, pending agreement on whether
we want this behavioral change at all.

Discussion: https://postgr.es/m/3851203.1722552717@sss.pgh.pa.us
---
 src/backend/utils/adt/jsonb.c       | 23 +++++++++
 src/include/catalog/pg_cast.dat     |  5 +-
 src/include/catalog/pg_proc.dat     |  3 ++
 src/test/regress/expected/jsonb.out | 72 +++++++++++++++++++++++++++++
 src/test/regress/sql/jsonb.sql      | 12 +++++
 5 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
index ed054d5d42..c4e8db0d24 100644
--- a/src/backend/utils/adt/jsonb.c
+++ b/src/backend/utils/adt/jsonb.c
@@ -2222,6 +2222,29 @@ jsonb_float8(PG_FUNCTION_ARGS)
     PG_RETURN_DATUM(retValue);
 }

+Datum
+jsonb_text(PG_FUNCTION_ARGS)
+{
+    Jsonb       *in = PG_GETARG_JSONB_P(0);
+    JsonbValue    v;
+    StringInfoData jtext;
+
+    /* Convert scalar null to SQL null */
+    if (JsonbExtractScalar(&in->root, &v) && v.type == jbvNull)
+    {
+        PG_FREE_IF_COPY(in, 0);
+        PG_RETURN_NULL();
+    }
+
+    /* Every other case acts like jsonb_out() */
+    initStringInfo(&jtext);
+    (void) JsonbToCString(&jtext, &in->root, VARSIZE(in));
+
+    PG_FREE_IF_COPY(in, 0);
+
+    PG_RETURN_TEXT_P(cstring_to_text_with_len(jtext.data, jtext.len));
+}
+
 /*
  * Convert jsonb to a C-string stripping quotes from scalar strings.
  */
diff --git a/src/include/catalog/pg_cast.dat b/src/include/catalog/pg_cast.dat
index ca7b6d7191..bffe17ead0 100644
--- a/src/include/catalog/pg_cast.dat
+++ b/src/include/catalog/pg_cast.dat
@@ -512,7 +512,7 @@
 { castsource => 'jsonb', casttarget => 'json', castfunc => '0',
   castcontext => 'a', castmethod => 'i' },

-# jsonb to numeric and bool types
+# jsonb to various scalar types
 { castsource => 'jsonb', casttarget => 'bool', castfunc => 'bool(jsonb)',
   castcontext => 'e', castmethod => 'f' },
 { castsource => 'jsonb', casttarget => 'numeric', castfunc => 'numeric(jsonb)',
@@ -527,6 +527,9 @@
   castcontext => 'e', castmethod => 'f' },
 { castsource => 'jsonb', casttarget => 'float8', castfunc => 'float8(jsonb)',
   castcontext => 'e', castmethod => 'f' },
+# this cast replaces an implicit COERCEVIAIO cast, so must be assignment-level:
+{ castsource => 'jsonb', casttarget => 'text', castfunc => 'text(jsonb)',
+  castcontext => 'a', castmethod => 'f' },

 # range to multirange
 { castsource => 'int4range', casttarget => 'int4multirange',
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d36f6001bb..c91824724c 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -4659,6 +4659,9 @@
 { oid => '2580', descr => 'convert jsonb to float8',
   proname => 'float8', prorettype => 'float8', proargtypes => 'jsonb',
   prosrc => 'jsonb_float8' },
+{ oid => '8079', descr => 'convert jsonb to text',
+  proname => 'text', prorettype => 'text', proargtypes => 'jsonb',
+  prosrc => 'jsonb_text' },

 # formatting
 { oid => '1770', descr => 'format timestamp with time zone to text',
diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out
index c0bd0e76ae..dda3ee5b95 100644
--- a/src/test/regress/expected/jsonb.out
+++ b/src/test/regress/expected/jsonb.out
@@ -5763,3 +5763,75 @@ select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8;
  12345
 (1 row)

+select 'true'::jsonb::text;
+ text
+------
+ true
+(1 row)
+
+select '1.0'::jsonb::text;
+ text
+------
+ 1.0
+(1 row)
+
+select '"hello"'::jsonb::text;
+  text
+---------
+ "hello"
+(1 row)
+
+select 'null'::jsonb::text;
+ text
+------
+
+(1 row)
+
+select '[1.0]'::jsonb::text;
+ text
+-------
+ [1.0]
+(1 row)
+
+select '{"a": "b"}'::jsonb::text;
+    text
+------------
+ {"a": "b"}
+(1 row)
+
+select 'true'::jsonb::varchar;
+ varchar
+---------
+ true
+(1 row)
+
+select '1.0'::jsonb::varchar;
+ varchar
+---------
+ 1.0
+(1 row)
+
+select '"hello"'::jsonb::varchar;
+ varchar
+---------
+ "hello"
+(1 row)
+
+select 'null'::jsonb::varchar;  -- not the desired behavior
+ varchar
+---------
+ null
+(1 row)
+
+select '[1.0]'::jsonb::varchar;
+ varchar
+---------
+ [1.0]
+(1 row)
+
+select '{"a": "b"}'::jsonb::varchar;
+  varchar
+------------
+ {"a": "b"}
+(1 row)
+
diff --git a/src/test/regress/sql/jsonb.sql b/src/test/regress/sql/jsonb.sql
index 1bcafe8cfb..57240c4146 100644
--- a/src/test/regress/sql/jsonb.sql
+++ b/src/test/regress/sql/jsonb.sql
@@ -1569,3 +1569,15 @@ select '12345.0000000000000000000000000000000000000000000005'::jsonb::float8;
 select '12345.0000000000000000000000000000000000000000000005'::jsonb::int2;
 select '12345.0000000000000000000000000000000000000000000005'::jsonb::int4;
 select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8;
+select 'true'::jsonb::text;
+select '1.0'::jsonb::text;
+select '"hello"'::jsonb::text;
+select 'null'::jsonb::text;
+select '[1.0]'::jsonb::text;
+select '{"a": "b"}'::jsonb::text;
+select 'true'::jsonb::varchar;
+select '1.0'::jsonb::varchar;
+select '"hello"'::jsonb::varchar;
+select 'null'::jsonb::varchar;  -- not the desired behavior
+select '[1.0]'::jsonb::varchar;
+select '{"a": "b"}'::jsonb::varchar;
--
2.43.5


pgsql-hackers by date:

Previous
From: Dmitry Dolgov
Date:
Subject: Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Next
From: Tom Lane
Date:
Subject: Re: consider -Wmissing-variable-declarations