Thread: Weird special case in jsonb_concat()
The discussion in [1] pointed out that the existing documentation for the "jsonb || jsonb" concatenation operator is far short of reality: it fails to acknowledge that the operator will accept any cases other than two jsonb array inputs or two jsonb object inputs. I'd about concluded that other cases were handled as if by wrapping non-array inputs in one-element arrays and then proceeding as for two arrays. That works for most scenarios, eg regression=# select '[3]'::jsonb || '{}'::jsonb; ?column? ---------- [3, {}] (1 row) regression=# select '3'::jsonb || '[]'::jsonb; ?column? ---------- [3] (1 row) regression=# select '3'::jsonb || '4'::jsonb; ?column? ---------- [3, 4] (1 row) However, further experimentation found a case that fails: regression=# select '3'::jsonb || '{}'::jsonb; ERROR: invalid concatenation of jsonb objects I wonder what is the point of this weird exception, and whether whoever devised it can provide a concise explanation of what they think the full behavior of "jsonb || jsonb" is. Why isn't '[3, {}]' a reasonable result here, if the cases above are OK? regards, tom lane [1] https://www.postgresql.org/message-id/flat/0d72b76d-ca2b-4263-8888-d6dfca861c51%40www.fastmail.com
I wrote: > However, further experimentation found a case that fails: > regression=# select '3'::jsonb || '{}'::jsonb; > ERROR: invalid concatenation of jsonb objects > I wonder what is the point of this weird exception, and whether > whoever devised it can provide a concise explanation of what > they think the full behavior of "jsonb || jsonb" is. Why isn't > '[3, {}]' a reasonable result here, if the cases above are OK? Here is a proposed patch for that. It turns out that the third else-branch in IteratorConcat() already does the right thing, if we just remove its restrictive else-condition and let it handle everything except the two-objects and two-arrays cases. But it seemed to me that trying to handle both the object || array and array || object cases in that one else-branch was poorly thought out: only one line of code can actually be shared, and it took several extra lines of infrastructure to support the sharing. So I split those cases into separate else-branches. This also addresses the inadequate documentation that was the original complaint. Thoughts? Should we back-patch this? The existing behavior seems to me to be inconsistent enough to be arguably a bug, but we've not had field complaints saying "this should work". regards, tom lane diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index df29af6371..5f3c393a25 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -14715,8 +14715,12 @@ table2-mapping </para> <para> Concatenates two <type>jsonb</type> values. - Concatenating two objects generates an object with the union of their + Concatenating two arrays generates an array containing all the + elements of each input. Concatenating two objects generates an + object containing the union of their keys, taking the second object's value when there are duplicate keys. + All other cases are treated by converting a non-array input into a + single-element array, and then proceeding as for two arrays. Does not operate recursively: only the top-level array or object structure is merged. </para> @@ -14727,6 +14731,14 @@ table2-mapping <para> <literal>'{"a": "b"}'::jsonb || '{"c": "d"}'::jsonb</literal> <returnvalue>{"a": "b", "c": "d"}</returnvalue> + </para> + <para> + <literal>'[1, 2]'::jsonb || '3'::jsonb</literal> + <returnvalue>[1, 2, 3]</returnvalue> + </para> + <para> + <literal>'{"a": "b"}'::jsonb || '42'::jsonb</literal> + <returnvalue>[{"a": "b"}, 42]</returnvalue> </para></entry> </row> diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c index 7a25415078..5beade00f4 100644 --- a/src/backend/utils/adt/jsonfuncs.c +++ b/src/backend/utils/adt/jsonfuncs.c @@ -4690,11 +4690,14 @@ IteratorConcat(JsonbIterator **it1, JsonbIterator **it2, rk2 = JsonbIteratorNext(it2, &v2, false); /* - * Both elements are objects. + * JsonbIteratorNext reports raw scalars as if they were single-element + * arrays; hence we only need consider "object" and "array" cases here. */ if (rk1 == WJB_BEGIN_OBJECT && rk2 == WJB_BEGIN_OBJECT) { /* + * Both elements are objects. + * * Append all the tokens from v1 to res, except last WJB_END_OBJECT * (because res will not be finished yet). */ @@ -4703,18 +4706,18 @@ IteratorConcat(JsonbIterator **it1, JsonbIterator **it2, pushJsonbValue(state, r1, &v1); /* - * Append all the tokens from v2 to res, include last WJB_END_OBJECT - * (the concatenation will be completed). + * Append all the tokens from v2 to res, including last WJB_END_OBJECT + * (the concatenation will be completed). Any duplicate keys will + * automatically override the value from the first object. */ while ((r2 = JsonbIteratorNext(it2, &v2, true)) != WJB_DONE) res = pushJsonbValue(state, r2, r2 != WJB_END_OBJECT ? &v2 : NULL); } - - /* - * Both elements are arrays (either can be scalar). - */ else if (rk1 == WJB_BEGIN_ARRAY && rk2 == WJB_BEGIN_ARRAY) { + /* + * Both elements are arrays. + */ pushJsonbValue(state, rk1, NULL); while ((r1 = JsonbIteratorNext(it1, &v1, true)) != WJB_END_ARRAY) @@ -4731,46 +4734,40 @@ IteratorConcat(JsonbIterator **it1, JsonbIterator **it2, res = pushJsonbValue(state, WJB_END_ARRAY, NULL /* signal to sort */ ); } - /* have we got array || object or object || array? */ - else if (((rk1 == WJB_BEGIN_ARRAY && !(*it1)->isScalar) && rk2 == WJB_BEGIN_OBJECT) || - (rk1 == WJB_BEGIN_OBJECT && (rk2 == WJB_BEGIN_ARRAY && !(*it2)->isScalar))) + else if (rk1 == WJB_BEGIN_OBJECT) { - JsonbIterator **it_array = rk1 == WJB_BEGIN_ARRAY ? it1 : it2; - JsonbIterator **it_object = rk1 == WJB_BEGIN_OBJECT ? it1 : it2; - bool prepend = (rk1 == WJB_BEGIN_OBJECT); + /* + * We have object || array. + */ + Assert(rk2 == WJB_BEGIN_ARRAY); pushJsonbValue(state, WJB_BEGIN_ARRAY, NULL); - if (prepend) - { - pushJsonbValue(state, WJB_BEGIN_OBJECT, NULL); - while ((r1 = JsonbIteratorNext(it_object, &v1, true)) != WJB_DONE) - pushJsonbValue(state, r1, r1 != WJB_END_OBJECT ? &v1 : NULL); - - while ((r2 = JsonbIteratorNext(it_array, &v2, true)) != WJB_DONE) - res = pushJsonbValue(state, r2, r2 != WJB_END_ARRAY ? &v2 : NULL); - } - else - { - while ((r1 = JsonbIteratorNext(it_array, &v1, true)) != WJB_END_ARRAY) - pushJsonbValue(state, r1, &v1); - - pushJsonbValue(state, WJB_BEGIN_OBJECT, NULL); - while ((r2 = JsonbIteratorNext(it_object, &v2, true)) != WJB_DONE) - pushJsonbValue(state, r2, r2 != WJB_END_OBJECT ? &v2 : NULL); + pushJsonbValue(state, WJB_BEGIN_OBJECT, NULL); + while ((r1 = JsonbIteratorNext(it1, &v1, true)) != WJB_DONE) + pushJsonbValue(state, r1, r1 != WJB_END_OBJECT ? &v1 : NULL); - res = pushJsonbValue(state, WJB_END_ARRAY, NULL); - } + while ((r2 = JsonbIteratorNext(it2, &v2, true)) != WJB_DONE) + res = pushJsonbValue(state, r2, r2 != WJB_END_ARRAY ? &v2 : NULL); } else { /* - * This must be scalar || object or object || scalar, as that's all - * that's left. Both of these make no sense, so error out. + * We have array || object. */ - ereport(ERROR, - (errcode(ERRCODE_INVALID_PARAMETER_VALUE), - errmsg("invalid concatenation of jsonb objects"))); + Assert(rk1 == WJB_BEGIN_ARRAY); + Assert(rk2 == WJB_BEGIN_OBJECT); + + pushJsonbValue(state, WJB_BEGIN_ARRAY, NULL); + + while ((r1 = JsonbIteratorNext(it1, &v1, true)) != WJB_END_ARRAY) + pushJsonbValue(state, r1, &v1); + + pushJsonbValue(state, WJB_BEGIN_OBJECT, NULL); + while ((r2 = JsonbIteratorNext(it2, &v2, true)) != WJB_DONE) + pushJsonbValue(state, r2, r2 != WJB_END_OBJECT ? &v2 : NULL); + + res = pushJsonbValue(state, WJB_END_ARRAY, NULL); } return res; diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out index a70cd0b7c1..1e6c6ef200 100644 --- a/src/test/regress/expected/jsonb.out +++ b/src/test/regress/expected/jsonb.out @@ -4111,9 +4111,41 @@ select '{"a":"b"}'::jsonb || '[]'::jsonb; (1 row) select '"a"'::jsonb || '{"a":1}'; -ERROR: invalid concatenation of jsonb objects + ?column? +----------------- + ["a", {"a": 1}] +(1 row) + select '{"a":1}' || '"a"'::jsonb; -ERROR: invalid concatenation of jsonb objects + ?column? +----------------- + [{"a": 1}, "a"] +(1 row) + +select '[3]'::jsonb || '{}'::jsonb; + ?column? +---------- + [3, {}] +(1 row) + +select '3'::jsonb || '[]'::jsonb; + ?column? +---------- + [3] +(1 row) + +select '3'::jsonb || '4'::jsonb; + ?column? +---------- + [3, 4] +(1 row) + +select '3'::jsonb || '{}'::jsonb; + ?column? +---------- + [3, {}] +(1 row) + select '["a", "b"]'::jsonb || '{"c":1}'; ?column? ---------------------- diff --git a/src/test/regress/sql/jsonb.sql b/src/test/regress/sql/jsonb.sql index 3e2b8f66df..b6409767f6 100644 --- a/src/test/regress/sql/jsonb.sql +++ b/src/test/regress/sql/jsonb.sql @@ -1056,6 +1056,11 @@ select '{"a":"b"}'::jsonb || '[]'::jsonb; select '"a"'::jsonb || '{"a":1}'; select '{"a":1}' || '"a"'::jsonb; +select '[3]'::jsonb || '{}'::jsonb; +select '3'::jsonb || '[]'::jsonb; +select '3'::jsonb || '4'::jsonb; +select '3'::jsonb || '{}'::jsonb; + select '["a", "b"]'::jsonb || '{"c":1}'; select '{"c": 1}'::jsonb || '["a", "b"]';
so 19. 12. 2020 v 21:35 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I wrote:
> However, further experimentation found a case that fails:
> regression=# select '3'::jsonb || '{}'::jsonb;
> ERROR: invalid concatenation of jsonb objects
> I wonder what is the point of this weird exception, and whether
> whoever devised it can provide a concise explanation of what
> they think the full behavior of "jsonb || jsonb" is. Why isn't
> '[3, {}]' a reasonable result here, if the cases above are OK?
Here is a proposed patch for that. It turns out that the third
else-branch in IteratorConcat() already does the right thing, if
we just remove its restrictive else-condition and let it handle
everything except the two-objects and two-arrays cases. But it
seemed to me that trying to handle both the object || array
and array || object cases in that one else-branch was poorly
thought out: only one line of code can actually be shared, and it
took several extra lines of infrastructure to support the sharing.
So I split those cases into separate else-branches.
This also addresses the inadequate documentation that was the
original complaint.
Thoughts? Should we back-patch this? The existing behavior
seems to me to be inconsistent enough to be arguably a bug,
but we've not had field complaints saying "this should work".
+1
Pavel
regards, tom lane
On Sat, Dec 19, 2020, at 21:35, Tom Lane wrote:
>Here is a proposed patch for that.
I've tested the patch and "All 202 tests passed".
In addition, I've tested it on a json intensive project,
which passes all its own tests.
I haven't studied the jsonfuncs.c code in detail,
but the new code looks much cleaner, nice.
>This also addresses the inadequate documentation that was the
>original complaint.
Looks good.
In addition, to the user wondering how to append a json array-value "as is",
I think it would be useful to provide an example on how to do this
in the documentation.
I think there is a risk users will attempt much more fragile
hacks to achieve this, if we don't provide guidance
in the documentation.
Suggestion:
<literal>'["a", "b"]'::jsonb || '["a", "d"]'::jsonb</literal>
<returnvalue>["a", "b", "a", "d"]</returnvalue>
</para>
+ <para>
+ <literal>'["a", "b"]'::jsonb || jsonb_build_array('["a", "d"]'::jsonb)</literal>
+ <returnvalue>["a", "b", ["a", "d"]]</returnvalue>
+ </para>
<para>
<literal>'{"a": "b"}'::jsonb || '{"c": "d"}'::jsonb</literal>
<returnvalue>{"a": "b", "c": "d"}</returnvalue>
> Thoughts? Should we back-patch this? The existing behavior
> seems to me to be inconsistent enough to be arguably a bug,
> but we've not had field complaints saying "this should work".
+1 back-patch, I think it's a bug.
Best regards,
Joel
Hi,
w.r.t. the patch,
+select '[3]'::jsonb || '{}'::jsonb;
+ ?column?
+----------
+ [3, {}]
+(1 row)
+
+select '3'::jsonb || '[]'::jsonb;
+ ?column?
+----------
+ [3, {}]
+(1 row)
+
+select '3'::jsonb || '[]'::jsonb;
Should cases where the empty array precedes non-empty jsonb be added ?
select '[]'::jsonb || '3'::jsonb;
select '{}'::jsonb || '[3]'::jsonb;
Cheers
"Joel Jacobson" <joel@compiler.org> writes: > On Sat, Dec 19, 2020, at 21:35, Tom Lane wrote: >> Here is a proposed patch for that. > In addition, to the user wondering how to append a json array-value "as is", > I think it would be useful to provide an example on how to do this > in the documentation. Done in v13 and HEAD; the older table format doesn't really have room for more examples. > +1 back-patch, I think it's a bug. I'm not quite sure it's a bug, but it does seem like fairly unhelpful behavior to throw an error instead of doing something useful, so back-patched. regards, tom lane