Making jsonb_agg() faster - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Making jsonb_agg() faster |
Date | |
Msg-id | 1060917.1753202222@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: Making jsonb_agg() faster
|
List | pgsql-hackers |
There have been some complaints recently about how jsonb_agg() is a lot slower than json_agg() [1]. That's annoying considering that the whole selling point of JSONB is to have faster processing than the original JSON type, so I poked into that. What I found is that jsonb_agg() and its variants are just really inefficiently implemented. Basically, for each aggregate input value, they will: 1. Build a JsonbValue tree representation of the input value. 2. Flatten the JsonbValue tree into a Jsonb in on-disk format. 3. Iterate through the Jsonb, building a JsonbValue that is part of the aggregate's state stored in aggcontext, but is otherwise identical to what phase 1 built. The motivation for this seems to have been to make sure that any memory leakage during phase 1 does not happen in the long-lived aggcontext. But it's hard not to call it a Rube Goldberg contraption. The attached patch series gets rid of phases 2 and 3 by refactoring pushJsonbValue() and related functions so that the JsonbValue tree they construct can be constructed in a context that's not CurrentMemoryContext. With that and some run-of-the-mill optimization work, I'm getting 2.5X speedup for jsonb_agg on a text column (as measured by the attached test script) and a bit over 2X on an int8 column. It's still a little slower than json_agg, but no longer slower by integer multiples. 0001 is a somewhat invasive refactoring of the API for pushJsonbValue and friends. It doesn't in itself have any measurable speed consequences as far as I can tell, but I think it makes the code nicer in any case. (I really do not like the existing coding setup where sometimes it's important to capture the result of pushJsonbValue and sometimes it's not; that seems awfully confusing and bug-prone.) The real point though is to have a way of commanding pushJsonbValue to build the JsonbValue tree somewhere other than CurrentMemoryContext. Having laid the groundwork with 0001, 0002 simply amounts to telling pushJsonbValue to put its handiwork in the aggcontext and then ripping out phases 2 and 3 of the aggregate transfns. 0003 is some simple micro-optimization of the datatype conversion code in datum_to_jsonb_internal. Thoughts? regards, tom lane [1] https://www.postgresql.org/message-id/CAHyXU0xQGXFBZ10GtqTkXL3_b8FbB79qP+XS2XCfxp+6WuH1Cg@mail.gmail.com From 4a48c95204812d5cea0e34bbaca03e254a6b98c5 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Mon, 21 Jul 2025 20:38:11 -0400 Subject: [PATCH v1 1/3] Revise APIs for pushJsonbValue() and associated routines. Instead of passing "JsonbParseState **" to pushJsonbValue(), pass a pointer to a JsonbInState, which will contain the parseState stack pointer as well as other useful fields. Also, instead of returning a JsonbValue pointer that is often meaningless/ignored, return the top-level JsonbValue pointer in the "result" field of the JsonbInState. This involves a lot of (mostly mechanical) edits, but I think the results are notationally cleaner and easier to understand. Certainly the business with sometimes capturing the result of pushJsonbValue() and sometimes not was bug-prone and incapable of mechanical verification. In the new arrangement, JsonbInState.result remains null until we've completed a valid sequence of pushes, so that an incorrect sequence will result in a null-pointer dereference, not mistaken use of a partial result. However, this isn't simply an exercise in prettier notation. The real reason for doing it is to provide a mechanism whereby pushJsonbValue() can be told to construct the JsonbValue tree in a context that is not CurrentMemoryContext. That happens when a non-null "outcontext" is specified in the JsonbInState. No callers exercise that option in this patch, but the next patch in the series will make use of it. I tried to improve the comments in this area too. --- contrib/hstore/hstore_io.c | 26 +- contrib/jsonb_plperl/jsonb_plperl.c | 47 ++-- contrib/jsonb_plpython/jsonb_plpython.c | 61 +++-- src/backend/utils/adt/jsonb.c | 165 +++++------- src/backend/utils/adt/jsonb_util.c | 325 ++++++++++++++++-------- src/backend/utils/adt/jsonfuncs.c | 243 ++++++++---------- src/backend/utils/adt/jsonpath_exec.c | 16 +- src/include/utils/jsonb.h | 43 +++- 8 files changed, 515 insertions(+), 411 deletions(-) diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c index 4f867e4bd1f..1667c840f30 100644 --- a/contrib/hstore/hstore_io.c +++ b/contrib/hstore/hstore_io.c @@ -1439,10 +1439,9 @@ hstore_to_jsonb(PG_FUNCTION_ARGS) int count = HS_COUNT(in); char *base = STRPTR(in); HEntry *entries = ARRPTR(in); - JsonbParseState *state = NULL; - JsonbValue *res; + JsonbInState state = {0}; - (void) pushJsonbValue(&state, WJB_BEGIN_OBJECT, NULL); + pushJsonbValue(&state, WJB_BEGIN_OBJECT, NULL); for (i = 0; i < count; i++) { @@ -1453,7 +1452,7 @@ hstore_to_jsonb(PG_FUNCTION_ARGS) key.val.string.len = HSTORE_KEYLEN(entries, i); key.val.string.val = HSTORE_KEY(entries, base, i); - (void) pushJsonbValue(&state, WJB_KEY, &key); + pushJsonbValue(&state, WJB_KEY, &key); if (HSTORE_VALISNULL(entries, i)) { @@ -1465,12 +1464,12 @@ hstore_to_jsonb(PG_FUNCTION_ARGS) val.val.string.len = HSTORE_VALLEN(entries, i); val.val.string.val = HSTORE_VAL(entries, base, i); } - (void) pushJsonbValue(&state, WJB_VALUE, &val); + pushJsonbValue(&state, WJB_VALUE, &val); } - res = pushJsonbValue(&state, WJB_END_OBJECT, NULL); + pushJsonbValue(&state, WJB_END_OBJECT, NULL); - PG_RETURN_POINTER(JsonbValueToJsonb(res)); + PG_RETURN_POINTER(JsonbValueToJsonb(state.result)); } PG_FUNCTION_INFO_V1(hstore_to_jsonb_loose); @@ -1482,13 +1481,12 @@ hstore_to_jsonb_loose(PG_FUNCTION_ARGS) int count = HS_COUNT(in); char *base = STRPTR(in); HEntry *entries = ARRPTR(in); - JsonbParseState *state = NULL; - JsonbValue *res; + JsonbInState state = {0}; StringInfoData tmp; initStringInfo(&tmp); - (void) pushJsonbValue(&state, WJB_BEGIN_OBJECT, NULL); + pushJsonbValue(&state, WJB_BEGIN_OBJECT, NULL); for (i = 0; i < count; i++) { @@ -1499,7 +1497,7 @@ hstore_to_jsonb_loose(PG_FUNCTION_ARGS) key.val.string.len = HSTORE_KEYLEN(entries, i); key.val.string.val = HSTORE_KEY(entries, base, i); - (void) pushJsonbValue(&state, WJB_KEY, &key); + pushJsonbValue(&state, WJB_KEY, &key); if (HSTORE_VALISNULL(entries, i)) { @@ -1541,10 +1539,10 @@ hstore_to_jsonb_loose(PG_FUNCTION_ARGS) val.val.string.val = HSTORE_VAL(entries, base, i); } } - (void) pushJsonbValue(&state, WJB_VALUE, &val); + pushJsonbValue(&state, WJB_VALUE, &val); } - res = pushJsonbValue(&state, WJB_END_OBJECT, NULL); + pushJsonbValue(&state, WJB_END_OBJECT, NULL); - PG_RETURN_POINTER(JsonbValueToJsonb(res)); + PG_RETURN_POINTER(JsonbValueToJsonb(state.result)); } diff --git a/contrib/jsonb_plperl/jsonb_plperl.c b/contrib/jsonb_plperl/jsonb_plperl.c index c02e2d41af1..08d4991e27e 100644 --- a/contrib/jsonb_plperl/jsonb_plperl.c +++ b/contrib/jsonb_plperl/jsonb_plperl.c @@ -13,7 +13,7 @@ PG_MODULE_MAGIC_EXT( ); static SV *Jsonb_to_SV(JsonbContainer *jsonb); -static JsonbValue *SV_to_JsonbValue(SV *obj, JsonbParseState **ps, bool is_elem); +static void SV_to_JsonbValue(SV *obj, JsonbInState *ps, bool is_elem); static SV * @@ -127,8 +127,8 @@ Jsonb_to_SV(JsonbContainer *jsonb) } } -static JsonbValue * -AV_to_JsonbValue(AV *in, JsonbParseState **jsonb_state) +static void +AV_to_JsonbValue(AV *in, JsonbInState *jsonb_state) { dTHX; SSize_t pcount = av_len(in) + 1; @@ -141,14 +141,14 @@ AV_to_JsonbValue(AV *in, JsonbParseState **jsonb_state) SV **value = av_fetch(in, i, FALSE); if (value) - (void) SV_to_JsonbValue(*value, jsonb_state, true); + SV_to_JsonbValue(*value, jsonb_state, true); } - return pushJsonbValue(jsonb_state, WJB_END_ARRAY, NULL); + pushJsonbValue(jsonb_state, WJB_END_ARRAY, NULL); } -static JsonbValue * -HV_to_JsonbValue(HV *obj, JsonbParseState **jsonb_state) +static void +HV_to_JsonbValue(HV *obj, JsonbInState *jsonb_state) { dTHX; JsonbValue key; @@ -167,14 +167,14 @@ HV_to_JsonbValue(HV *obj, JsonbParseState **jsonb_state) key.val.string.val = pnstrdup(kstr, klen); key.val.string.len = klen; pushJsonbValue(jsonb_state, WJB_KEY, &key); - (void) SV_to_JsonbValue(val, jsonb_state, false); + SV_to_JsonbValue(val, jsonb_state, false); } - return pushJsonbValue(jsonb_state, WJB_END_OBJECT, NULL); + pushJsonbValue(jsonb_state, WJB_END_OBJECT, NULL); } -static JsonbValue * -SV_to_JsonbValue(SV *in, JsonbParseState **jsonb_state, bool is_elem) +static void +SV_to_JsonbValue(SV *in, JsonbInState *jsonb_state, bool is_elem) { dTHX; JsonbValue out; /* result */ @@ -186,10 +186,12 @@ SV_to_JsonbValue(SV *in, JsonbParseState **jsonb_state, bool is_elem) switch (SvTYPE(in)) { case SVt_PVAV: - return AV_to_JsonbValue((AV *) in, jsonb_state); + AV_to_JsonbValue((AV *) in, jsonb_state); + return; case SVt_PVHV: - return HV_to_JsonbValue((HV *) in, jsonb_state); + HV_to_JsonbValue((HV *) in, jsonb_state); + return; default: if (!SvOK(in)) @@ -259,14 +261,18 @@ SV_to_JsonbValue(SV *in, JsonbParseState **jsonb_state, bool is_elem) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot transform this Perl type to jsonb"))); - return NULL; } } /* Push result into 'jsonb_state' unless it is a raw scalar. */ - return *jsonb_state - ? pushJsonbValue(jsonb_state, is_elem ? WJB_ELEM : WJB_VALUE, &out) - : memcpy(palloc(sizeof(JsonbValue)), &out, sizeof(JsonbValue)); + if (jsonb_state->parseState) + pushJsonbValue(jsonb_state, is_elem ? WJB_ELEM : WJB_VALUE, &out); + else + { + /* XXX Ugly hack */ + jsonb_state->result = palloc(sizeof(JsonbValue)); + memcpy(jsonb_state->result, &out, sizeof(JsonbValue)); + } } @@ -289,10 +295,9 @@ Datum plperl_to_jsonb(PG_FUNCTION_ARGS) { dTHX; - JsonbParseState *jsonb_state = NULL; SV *in = (SV *) PG_GETARG_POINTER(0); - JsonbValue *out = SV_to_JsonbValue(in, &jsonb_state, true); - Jsonb *result = JsonbValueToJsonb(out); + JsonbInState jsonb_state = {0}; - PG_RETURN_JSONB_P(result); + SV_to_JsonbValue(in, &jsonb_state, true); + PG_RETURN_JSONB_P(JsonbValueToJsonb(jsonb_state.result)); } diff --git a/contrib/jsonb_plpython/jsonb_plpython.c b/contrib/jsonb_plpython/jsonb_plpython.c index 9383615abbf..c16b7405dc7 100644 --- a/contrib/jsonb_plpython/jsonb_plpython.c +++ b/contrib/jsonb_plpython/jsonb_plpython.c @@ -26,8 +26,8 @@ static PLy_elog_impl_t PLy_elog_impl_p; static PyObject *decimal_constructor; static PyObject *PLyObject_FromJsonbContainer(JsonbContainer *jsonb); -static JsonbValue *PLyObject_ToJsonbValue(PyObject *obj, - JsonbParseState **jsonb_state, bool is_elem); +static void PLyObject_ToJsonbValue(PyObject *obj, + JsonbInState *jsonb_state, bool is_elem); typedef PyObject *(*PLyUnicode_FromStringAndSize_t) (const char *s, Py_ssize_t size); @@ -261,12 +261,11 @@ PLyObject_FromJsonbContainer(JsonbContainer *jsonb) * * Transform Python dict to JsonbValue. */ -static JsonbValue * -PLyMapping_ToJsonbValue(PyObject *obj, JsonbParseState **jsonb_state) +static void +PLyMapping_ToJsonbValue(PyObject *obj, JsonbInState *jsonb_state) { Py_ssize_t pcount; PyObject *volatile items; - JsonbValue *volatile out; pcount = PyMapping_Size(obj); items = PyMapping_Items(obj); @@ -297,19 +296,17 @@ PLyMapping_ToJsonbValue(PyObject *obj, JsonbParseState **jsonb_state) PLyUnicode_ToJsonbValue(key, &jbvKey); } - (void) pushJsonbValue(jsonb_state, WJB_KEY, &jbvKey); - (void) PLyObject_ToJsonbValue(value, jsonb_state, false); + pushJsonbValue(jsonb_state, WJB_KEY, &jbvKey); + PLyObject_ToJsonbValue(value, jsonb_state, false); } - out = pushJsonbValue(jsonb_state, WJB_END_OBJECT, NULL); + pushJsonbValue(jsonb_state, WJB_END_OBJECT, NULL); } PG_FINALLY(); { Py_DECREF(items); } PG_END_TRY(); - - return out; } /* @@ -318,8 +315,8 @@ PLyMapping_ToJsonbValue(PyObject *obj, JsonbParseState **jsonb_state) * Transform python list to JsonbValue. Expects transformed PyObject and * a state required for jsonb construction. */ -static JsonbValue * -PLySequence_ToJsonbValue(PyObject *obj, JsonbParseState **jsonb_state) +static void +PLySequence_ToJsonbValue(PyObject *obj, JsonbInState *jsonb_state) { Py_ssize_t i; Py_ssize_t pcount; @@ -336,7 +333,7 @@ PLySequence_ToJsonbValue(PyObject *obj, JsonbParseState **jsonb_state) value = PySequence_GetItem(obj, i); Assert(value); - (void) PLyObject_ToJsonbValue(value, jsonb_state, true); + PLyObject_ToJsonbValue(value, jsonb_state, true); Py_XDECREF(value); value = NULL; } @@ -348,7 +345,7 @@ PLySequence_ToJsonbValue(PyObject *obj, JsonbParseState **jsonb_state) } PG_END_TRY(); - return pushJsonbValue(jsonb_state, WJB_END_ARRAY, NULL); + pushJsonbValue(jsonb_state, WJB_END_ARRAY, NULL); } /* @@ -406,17 +403,23 @@ PLyNumber_ToJsonbValue(PyObject *obj, JsonbValue *jbvNum) * * Transform python object to JsonbValue. */ -static JsonbValue * -PLyObject_ToJsonbValue(PyObject *obj, JsonbParseState **jsonb_state, bool is_elem) +static void +PLyObject_ToJsonbValue(PyObject *obj, JsonbInState *jsonb_state, bool is_elem) { JsonbValue *out; if (!PyUnicode_Check(obj)) { if (PySequence_Check(obj)) - return PLySequence_ToJsonbValue(obj, jsonb_state); + { + PLySequence_ToJsonbValue(obj, jsonb_state); + return; + } else if (PyMapping_Check(obj)) - return PLyMapping_ToJsonbValue(obj, jsonb_state); + { + PLyMapping_ToJsonbValue(obj, jsonb_state); + return; + } } out = palloc(sizeof(JsonbValue)); @@ -443,10 +446,14 @@ PLyObject_ToJsonbValue(PyObject *obj, JsonbParseState **jsonb_state, bool is_ele errmsg("Python type \"%s\" cannot be transformed to jsonb", PLyObject_AsString((PyObject *) obj->ob_type)))); - /* Push result into 'jsonb_state' unless it is raw scalar value. */ - return (*jsonb_state ? - pushJsonbValue(jsonb_state, is_elem ? WJB_ELEM : WJB_VALUE, out) : - out); + /* Push result into 'jsonb_state' unless it is a raw scalar. */ + if (jsonb_state->parseState) + pushJsonbValue(jsonb_state, is_elem ? WJB_ELEM : WJB_VALUE, out); + else + { + /* XXX Ugly hack */ + jsonb_state->result = out; + } } /* @@ -458,13 +465,11 @@ PG_FUNCTION_INFO_V1(plpython_to_jsonb); Datum plpython_to_jsonb(PG_FUNCTION_ARGS) { - PyObject *obj; - JsonbValue *out; - JsonbParseState *jsonb_state = NULL; + PyObject *obj = (PyObject *) PG_GETARG_POINTER(0); + JsonbInState jsonb_state = {0}; - obj = (PyObject *) PG_GETARG_POINTER(0); - out = PLyObject_ToJsonbValue(obj, &jsonb_state, true); - PG_RETURN_POINTER(JsonbValueToJsonb(out)); + PLyObject_ToJsonbValue(obj, &jsonb_state, true); + PG_RETURN_POINTER(JsonbValueToJsonb(jsonb_state.result)); } /* diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c index da94d424d61..d15d184ef8b 100644 --- a/src/backend/utils/adt/jsonb.c +++ b/src/backend/utils/adt/jsonb.c @@ -25,14 +25,6 @@ #include "utils/lsyscache.h" #include "utils/typcache.h" -typedef struct JsonbInState -{ - JsonbParseState *parseState; - JsonbValue *res; - bool unique_keys; - Node *escontext; -} JsonbInState; - typedef struct JsonbAggState { JsonbInState *res; @@ -269,8 +261,8 @@ jsonb_from_cstring(char *json, int len, bool unique_keys, Node *escontext) if (!pg_parse_json_or_errsave(&lex, &sem, escontext)) return (Datum) 0; - /* after parsing, the item member has the composed jsonb structure */ - PG_RETURN_POINTER(JsonbValueToJsonb(state.res)); + /* after parsing, the result field has the composed jsonb structure */ + PG_RETURN_POINTER(JsonbValueToJsonb(state.result)); } static bool @@ -291,7 +283,7 @@ jsonb_in_object_start(void *pstate) { JsonbInState *_state = (JsonbInState *) pstate; - _state->res = pushJsonbValue(&_state->parseState, WJB_BEGIN_OBJECT, NULL); + pushJsonbValue(_state, WJB_BEGIN_OBJECT, NULL); _state->parseState->unique_keys = _state->unique_keys; return JSON_SUCCESS; @@ -302,7 +294,7 @@ jsonb_in_object_end(void *pstate) { JsonbInState *_state = (JsonbInState *) pstate; - _state->res = pushJsonbValue(&_state->parseState, WJB_END_OBJECT, NULL); + pushJsonbValue(_state, WJB_END_OBJECT, NULL); return JSON_SUCCESS; } @@ -312,7 +304,7 @@ jsonb_in_array_start(void *pstate) { JsonbInState *_state = (JsonbInState *) pstate; - _state->res = pushJsonbValue(&_state->parseState, WJB_BEGIN_ARRAY, NULL); + pushJsonbValue(_state, WJB_BEGIN_ARRAY, NULL); return JSON_SUCCESS; } @@ -322,7 +314,7 @@ jsonb_in_array_end(void *pstate) { JsonbInState *_state = (JsonbInState *) pstate; - _state->res = pushJsonbValue(&_state->parseState, WJB_END_ARRAY, NULL); + pushJsonbValue(_state, WJB_END_ARRAY, NULL); return JSON_SUCCESS; } @@ -340,7 +332,7 @@ jsonb_in_object_field_start(void *pstate, char *fname, bool isnull) return JSON_SEM_ACTION_FAILED; v.val.string.val = fname; - _state->res = pushJsonbValue(&_state->parseState, WJB_KEY, &v); + pushJsonbValue(_state, WJB_KEY, &v); return JSON_SUCCESS; } @@ -434,9 +426,9 @@ jsonb_in_scalar(void *pstate, char *token, JsonTokenType tokentype) va.val.array.rawScalar = true; va.val.array.nElems = 1; - _state->res = pushJsonbValue(&_state->parseState, WJB_BEGIN_ARRAY, &va); - _state->res = pushJsonbValue(&_state->parseState, WJB_ELEM, &v); - _state->res = pushJsonbValue(&_state->parseState, WJB_END_ARRAY, NULL); + pushJsonbValue(_state, WJB_BEGIN_ARRAY, &va); + pushJsonbValue(_state, WJB_ELEM, &v); + pushJsonbValue(_state, WJB_END_ARRAY, NULL); } else { @@ -445,10 +437,10 @@ jsonb_in_scalar(void *pstate, char *token, JsonTokenType tokentype) switch (o->type) { case jbvArray: - _state->res = pushJsonbValue(&_state->parseState, WJB_ELEM, &v); + pushJsonbValue(_state, WJB_ELEM, &v); break; case jbvObject: - _state->res = pushJsonbValue(&_state->parseState, WJB_VALUE, &v); + pushJsonbValue(_state, WJB_VALUE, &v); break; default: elog(ERROR, "unexpected parent of nested structure"); @@ -794,11 +786,9 @@ datum_to_jsonb_internal(Datum val, bool is_null, JsonbInState *result, { if (type == WJB_END_ARRAY || type == WJB_END_OBJECT || type == WJB_BEGIN_ARRAY || type == WJB_BEGIN_OBJECT) - result->res = pushJsonbValue(&result->parseState, - type, NULL); + pushJsonbValue(result, type, NULL); else - result->res = pushJsonbValue(&result->parseState, - type, &jb); + pushJsonbValue(result, type, &jb); } } } @@ -829,9 +819,9 @@ datum_to_jsonb_internal(Datum val, bool is_null, JsonbInState *result, va.val.array.rawScalar = true; va.val.array.nElems = 1; - result->res = pushJsonbValue(&result->parseState, WJB_BEGIN_ARRAY, &va); - result->res = pushJsonbValue(&result->parseState, WJB_ELEM, &jb); - result->res = pushJsonbValue(&result->parseState, WJB_END_ARRAY, NULL); + pushJsonbValue(result, WJB_BEGIN_ARRAY, &va); + pushJsonbValue(result, WJB_ELEM, &jb); + pushJsonbValue(result, WJB_END_ARRAY, NULL); } else { @@ -840,12 +830,12 @@ datum_to_jsonb_internal(Datum val, bool is_null, JsonbInState *result, switch (o->type) { case jbvArray: - result->res = pushJsonbValue(&result->parseState, WJB_ELEM, &jb); + pushJsonbValue(result, WJB_ELEM, &jb); break; case jbvObject: - result->res = pushJsonbValue(&result->parseState, - key_scalar ? WJB_KEY : WJB_VALUE, - &jb); + pushJsonbValue(result, + key_scalar ? WJB_KEY : WJB_VALUE, + &jb); break; default: elog(ERROR, "unexpected parent of nested structure"); @@ -867,7 +857,7 @@ array_dim_to_jsonb(JsonbInState *result, int dim, int ndims, int *dims, const Da Assert(dim < ndims); - result->res = pushJsonbValue(&result->parseState, WJB_BEGIN_ARRAY, NULL); + pushJsonbValue(result, WJB_BEGIN_ARRAY, NULL); for (i = 1; i <= dims[dim]; i++) { @@ -884,7 +874,7 @@ array_dim_to_jsonb(JsonbInState *result, int dim, int ndims, int *dims, const Da } } - result->res = pushJsonbValue(&result->parseState, WJB_END_ARRAY, NULL); + pushJsonbValue(result, WJB_END_ARRAY, NULL); } /* @@ -913,8 +903,8 @@ array_to_jsonb_internal(Datum array, JsonbInState *result) if (nitems <= 0) { - result->res = pushJsonbValue(&result->parseState, WJB_BEGIN_ARRAY, NULL); - result->res = pushJsonbValue(&result->parseState, WJB_END_ARRAY, NULL); + pushJsonbValue(result, WJB_BEGIN_ARRAY, NULL); + pushJsonbValue(result, WJB_END_ARRAY, NULL); return; } @@ -961,7 +951,7 @@ composite_to_jsonb(Datum composite, JsonbInState *result) tmptup.t_data = td; tuple = &tmptup; - result->res = pushJsonbValue(&result->parseState, WJB_BEGIN_OBJECT, NULL); + pushJsonbValue(result, WJB_BEGIN_OBJECT, NULL); for (i = 0; i < tupdesc->natts; i++) { @@ -983,7 +973,7 @@ composite_to_jsonb(Datum composite, JsonbInState *result) v.val.string.len = strlen(attname); v.val.string.val = attname; - result->res = pushJsonbValue(&result->parseState, WJB_KEY, &v); + pushJsonbValue(result, WJB_KEY, &v); val = heap_getattr(tuple, i + 1, tupdesc, &isnull); @@ -1000,7 +990,7 @@ composite_to_jsonb(Datum composite, JsonbInState *result) false); } - result->res = pushJsonbValue(&result->parseState, WJB_END_OBJECT, NULL); + pushJsonbValue(result, WJB_END_OBJECT, NULL); ReleaseTupleDesc(tupdesc); } @@ -1118,7 +1108,7 @@ datum_to_jsonb(Datum val, JsonTypeCategory tcategory, Oid outfuncoid) datum_to_jsonb_internal(val, false, &result, tcategory, outfuncoid, false); - return JsonbPGetDatum(JsonbValueToJsonb(result.res)); + return JsonbPGetDatum(JsonbValueToJsonb(result.result)); } Datum @@ -1138,7 +1128,7 @@ jsonb_build_object_worker(int nargs, const Datum *args, const bool *nulls, const memset(&result, 0, sizeof(JsonbInState)); - result.res = pushJsonbValue(&result.parseState, WJB_BEGIN_OBJECT, NULL); + pushJsonbValue(&result, WJB_BEGIN_OBJECT, NULL); result.parseState->unique_keys = unique_keys; result.parseState->skip_nulls = absent_on_null; @@ -1165,9 +1155,9 @@ jsonb_build_object_worker(int nargs, const Datum *args, const bool *nulls, const add_jsonb(args[i + 1], nulls[i + 1], &result, types[i + 1], false); } - result.res = pushJsonbValue(&result.parseState, WJB_END_OBJECT, NULL); + pushJsonbValue(&result, WJB_END_OBJECT, NULL); - return JsonbPGetDatum(JsonbValueToJsonb(result.res)); + return JsonbPGetDatum(JsonbValueToJsonb(result.result)); } /* @@ -1200,10 +1190,10 @@ jsonb_build_object_noargs(PG_FUNCTION_ARGS) memset(&result, 0, sizeof(JsonbInState)); - (void) pushJsonbValue(&result.parseState, WJB_BEGIN_OBJECT, NULL); - result.res = pushJsonbValue(&result.parseState, WJB_END_OBJECT, NULL); + pushJsonbValue(&result, WJB_BEGIN_OBJECT, NULL); + pushJsonbValue(&result, WJB_END_OBJECT, NULL); - PG_RETURN_POINTER(JsonbValueToJsonb(result.res)); + PG_RETURN_POINTER(JsonbValueToJsonb(result.result)); } Datum @@ -1215,7 +1205,7 @@ jsonb_build_array_worker(int nargs, const Datum *args, const bool *nulls, const memset(&result, 0, sizeof(JsonbInState)); - result.res = pushJsonbValue(&result.parseState, WJB_BEGIN_ARRAY, NULL); + pushJsonbValue(&result, WJB_BEGIN_ARRAY, NULL); for (i = 0; i < nargs; i++) { @@ -1225,9 +1215,9 @@ jsonb_build_array_worker(int nargs, const Datum *args, const bool *nulls, const add_jsonb(args[i], nulls[i], &result, types[i], false); } - result.res = pushJsonbValue(&result.parseState, WJB_END_ARRAY, NULL); + pushJsonbValue(&result, WJB_END_ARRAY, NULL); - return JsonbPGetDatum(JsonbValueToJsonb(result.res)); + return JsonbPGetDatum(JsonbValueToJsonb(result.result)); } /* @@ -1261,10 +1251,10 @@ jsonb_build_array_noargs(PG_FUNCTION_ARGS) memset(&result, 0, sizeof(JsonbInState)); - (void) pushJsonbValue(&result.parseState, WJB_BEGIN_ARRAY, NULL); - result.res = pushJsonbValue(&result.parseState, WJB_END_ARRAY, NULL); + pushJsonbValue(&result, WJB_BEGIN_ARRAY, NULL); + pushJsonbValue(&result, WJB_END_ARRAY, NULL); - PG_RETURN_POINTER(JsonbValueToJsonb(result.res)); + PG_RETURN_POINTER(JsonbValueToJsonb(result.result)); } @@ -1289,7 +1279,7 @@ jsonb_object(PG_FUNCTION_ARGS) memset(&result, 0, sizeof(JsonbInState)); - (void) pushJsonbValue(&result.parseState, WJB_BEGIN_OBJECT, NULL); + pushJsonbValue(&result, WJB_BEGIN_OBJECT, NULL); switch (ndims) { @@ -1340,7 +1330,7 @@ jsonb_object(PG_FUNCTION_ARGS) v.val.string.len = len; v.val.string.val = str; - (void) pushJsonbValue(&result.parseState, WJB_KEY, &v); + pushJsonbValue(&result, WJB_KEY, &v); if (in_nulls[i * 2 + 1]) { @@ -1357,16 +1347,16 @@ jsonb_object(PG_FUNCTION_ARGS) v.val.string.val = str; } - (void) pushJsonbValue(&result.parseState, WJB_VALUE, &v); + pushJsonbValue(&result, WJB_VALUE, &v); } pfree(in_datums); pfree(in_nulls); close_object: - result.res = pushJsonbValue(&result.parseState, WJB_END_OBJECT, NULL); + pushJsonbValue(&result, WJB_END_OBJECT, NULL); - PG_RETURN_POINTER(JsonbValueToJsonb(result.res)); + PG_RETURN_POINTER(JsonbValueToJsonb(result.result)); } /* @@ -1393,7 +1383,7 @@ jsonb_object_two_arg(PG_FUNCTION_ARGS) memset(&result, 0, sizeof(JsonbInState)); - (void) pushJsonbValue(&result.parseState, WJB_BEGIN_OBJECT, NULL); + pushJsonbValue(&result, WJB_BEGIN_OBJECT, NULL); if (nkdims > 1 || nkdims != nvdims) ereport(ERROR, @@ -1430,7 +1420,7 @@ jsonb_object_two_arg(PG_FUNCTION_ARGS) v.val.string.len = len; v.val.string.val = str; - (void) pushJsonbValue(&result.parseState, WJB_KEY, &v); + pushJsonbValue(&result, WJB_KEY, &v); if (val_nulls[i]) { @@ -1447,7 +1437,7 @@ jsonb_object_two_arg(PG_FUNCTION_ARGS) v.val.string.val = str; } - (void) pushJsonbValue(&result.parseState, WJB_VALUE, &v); + pushJsonbValue(&result, WJB_VALUE, &v); } pfree(key_datums); @@ -1456,9 +1446,9 @@ jsonb_object_two_arg(PG_FUNCTION_ARGS) pfree(val_nulls); close_object: - result.res = pushJsonbValue(&result.parseState, WJB_END_OBJECT, NULL); + pushJsonbValue(&result, WJB_END_OBJECT, NULL); - PG_RETURN_POINTER(JsonbValueToJsonb(result.res)); + PG_RETURN_POINTER(JsonbValueToJsonb(result.result)); } @@ -1533,8 +1523,7 @@ jsonb_agg_transfn_worker(FunctionCallInfo fcinfo, bool absent_on_null) state = palloc(sizeof(JsonbAggState)); result = palloc0(sizeof(JsonbInState)); state->res = result; - result->res = pushJsonbValue(&result->parseState, - WJB_BEGIN_ARRAY, NULL); + pushJsonbValue(result, WJB_BEGIN_ARRAY, NULL); MemoryContextSwitchTo(oldcontext); json_categorize_type(arg_type, true, &state->val_category, @@ -1558,7 +1547,7 @@ jsonb_agg_transfn_worker(FunctionCallInfo fcinfo, bool absent_on_null) datum_to_jsonb_internal(val, PG_ARGISNULL(1), &elem, state->val_category, state->val_output_func, false); - jbelem = JsonbValueToJsonb(elem.res); + jbelem = JsonbValueToJsonb(elem.result); /* switch to the aggregate context for accumulation operations */ @@ -1574,18 +1563,15 @@ jsonb_agg_transfn_worker(FunctionCallInfo fcinfo, bool absent_on_null) if (v.val.array.rawScalar) single_scalar = true; else - result->res = pushJsonbValue(&result->parseState, - type, NULL); + pushJsonbValue(result, type, NULL); break; case WJB_END_ARRAY: if (!single_scalar) - result->res = pushJsonbValue(&result->parseState, - type, NULL); + pushJsonbValue(result, type, NULL); break; case WJB_BEGIN_OBJECT: case WJB_END_OBJECT: - result->res = pushJsonbValue(&result->parseState, - type, NULL); + pushJsonbValue(result, type, NULL); break; case WJB_ELEM: case WJB_KEY: @@ -1605,8 +1591,7 @@ jsonb_agg_transfn_worker(FunctionCallInfo fcinfo, bool absent_on_null) DatumGetNumeric(DirectFunctionCall1(numeric_uplus, NumericGetDatum(v.val.numeric))); } - result->res = pushJsonbValue(&result->parseState, - type, &v); + pushJsonbValue(result, type, &v); break; default: elog(ERROR, "unknown jsonb iterator token type"); @@ -1661,10 +1646,9 @@ jsonb_agg_finalfn(PG_FUNCTION_ARGS) result.parseState = clone_parse_state(arg->res->parseState); - result.res = pushJsonbValue(&result.parseState, - WJB_END_ARRAY, NULL); + pushJsonbValue(&result, WJB_END_ARRAY, NULL); - out = JsonbValueToJsonb(result.res); + out = JsonbValueToJsonb(result.result); PG_RETURN_POINTER(out); } @@ -1703,8 +1687,7 @@ jsonb_object_agg_transfn_worker(FunctionCallInfo fcinfo, state = palloc(sizeof(JsonbAggState)); result = palloc0(sizeof(JsonbInState)); state->res = result; - result->res = pushJsonbValue(&result->parseState, - WJB_BEGIN_OBJECT, NULL); + pushJsonbValue(result, WJB_BEGIN_OBJECT, NULL); result->parseState->unique_keys = unique_keys; result->parseState->skip_nulls = absent_on_null; @@ -1759,7 +1742,7 @@ jsonb_object_agg_transfn_worker(FunctionCallInfo fcinfo, datum_to_jsonb_internal(val, false, &elem, state->key_category, state->key_output_func, true); - jbkey = JsonbValueToJsonb(elem.res); + jbkey = JsonbValueToJsonb(elem.result); val = PG_ARGISNULL(2) ? (Datum) 0 : PG_GETARG_DATUM(2); @@ -1768,7 +1751,7 @@ jsonb_object_agg_transfn_worker(FunctionCallInfo fcinfo, datum_to_jsonb_internal(val, PG_ARGISNULL(2), &elem, state->val_category, state->val_output_func, false); - jbval = JsonbValueToJsonb(elem.res); + jbval = JsonbValueToJsonb(elem.result); it = JsonbIteratorInit(&jbkey->root); @@ -1805,14 +1788,12 @@ jsonb_object_agg_transfn_worker(FunctionCallInfo fcinfo, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("object keys must be strings"))); } - result->res = pushJsonbValue(&result->parseState, - WJB_KEY, &v); + pushJsonbValue(result, WJB_KEY, &v); if (skip) { v.type = jbvNull; - result->res = pushJsonbValue(&result->parseState, - WJB_VALUE, &v); + pushJsonbValue(result, WJB_VALUE, &v); MemoryContextSwitchTo(oldcontext); PG_RETURN_POINTER(state); } @@ -1844,18 +1825,15 @@ jsonb_object_agg_transfn_worker(FunctionCallInfo fcinfo, if (v.val.array.rawScalar) single_scalar = true; else - result->res = pushJsonbValue(&result->parseState, - type, NULL); + pushJsonbValue(result, type, NULL); break; case WJB_END_ARRAY: if (!single_scalar) - result->res = pushJsonbValue(&result->parseState, - type, NULL); + pushJsonbValue(result, type, NULL); break; case WJB_BEGIN_OBJECT: case WJB_END_OBJECT: - result->res = pushJsonbValue(&result->parseState, - type, NULL); + pushJsonbValue(result, type, NULL); break; case WJB_ELEM: case WJB_KEY: @@ -1875,9 +1853,7 @@ jsonb_object_agg_transfn_worker(FunctionCallInfo fcinfo, DatumGetNumeric(DirectFunctionCall1(numeric_uplus, NumericGetDatum(v.val.numeric))); } - result->res = pushJsonbValue(&result->parseState, - single_scalar ? WJB_VALUE : type, - &v); + pushJsonbValue(result, single_scalar ? WJB_VALUE : type, &v); break; default: elog(ERROR, "unknown jsonb iterator token type"); @@ -1952,10 +1928,9 @@ jsonb_object_agg_finalfn(PG_FUNCTION_ARGS) result.parseState = clone_parse_state(arg->res->parseState); - result.res = pushJsonbValue(&result.parseState, - WJB_END_OBJECT, NULL); + pushJsonbValue(&result, WJB_END_OBJECT, NULL); - out = JsonbValueToJsonb(result.res); + out = JsonbValueToJsonb(result.result); PG_RETURN_POINTER(out); } diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c index 82b807d067a..b6189141b80 100644 --- a/src/backend/utils/adt/jsonb_util.c +++ b/src/backend/utils/adt/jsonb_util.c @@ -14,10 +14,12 @@ #include "postgres.h" #include "catalog/pg_collation.h" +#include "catalog/pg_type.h" #include "common/hashfn.h" #include "miscadmin.h" #include "port/pg_bitutils.h" #include "utils/datetime.h" +#include "utils/datum.h" #include "utils/fmgrprotos.h" #include "utils/json.h" #include "utils/jsonb.h" @@ -54,19 +56,20 @@ static short padBufferToInt(StringInfo buffer); static JsonbIterator *iteratorFromContainer(JsonbContainer *container, JsonbIterator *parent); static JsonbIterator *freeAndGetParent(JsonbIterator *it); -static JsonbParseState *pushState(JsonbParseState **pstate); -static void appendKey(JsonbParseState *pstate, JsonbValue *string); -static void appendValue(JsonbParseState *pstate, JsonbValue *scalarVal); -static void appendElement(JsonbParseState *pstate, JsonbValue *scalarVal); +static JsonbParseState *pushState(JsonbInState *pstate); +static void appendKey(JsonbInState *pstate, JsonbValue *string, bool needCopy); +static void appendValue(JsonbInState *pstate, JsonbValue *scalarVal, bool needCopy); +static void appendElement(JsonbInState *pstate, JsonbValue *scalarVal, bool needCopy); +static void copyScalarSubstructure(JsonbValue *v, MemoryContext outcontext); static int lengthCompareJsonbStringValue(const void *a, const void *b); static int lengthCompareJsonbString(const char *val1, int len1, const char *val2, int len2); static int lengthCompareJsonbPair(const void *a, const void *b, void *binequal); static void uniqueifyJsonbObject(JsonbValue *object, bool unique_keys, bool skip_nulls); -static JsonbValue *pushJsonbValueScalar(JsonbParseState **pstate, - JsonbIteratorToken seq, - JsonbValue *scalarVal); +static void pushJsonbValueScalar(JsonbInState *pstate, + JsonbIteratorToken seq, + JsonbValue *scalarVal); void JsonbToJsonbValue(Jsonb *jsonb, JsonbValue *val) @@ -95,9 +98,8 @@ JsonbValueToJsonb(JsonbValue *val) if (IsAJsonbScalar(val)) { - /* Scalar value */ - JsonbParseState *pstate = NULL; - JsonbValue *res; + /* Scalar value, so wrap it in an array */ + JsonbInState pstate = {0}; JsonbValue scalarArray; scalarArray.type = jbvArray; @@ -106,9 +108,9 @@ JsonbValueToJsonb(JsonbValue *val) pushJsonbValue(&pstate, WJB_BEGIN_ARRAY, &scalarArray); pushJsonbValue(&pstate, WJB_ELEM, val); - res = pushJsonbValue(&pstate, WJB_END_ARRAY, NULL); + pushJsonbValue(&pstate, WJB_END_ARRAY, NULL); - out = convertToJsonb(res); + out = convertToJsonb(pstate.result); } else if (val->type == jbvObject || val->type == jbvArray) { @@ -547,13 +549,23 @@ fillJsonbValue(JsonbContainer *container, int index, } /* - * Push JsonbValue into JsonbParseState. + * Push JsonbValue into JsonbInState. * - * Used when parsing JSON tokens to form Jsonb, or when converting an in-memory - * JsonbValue to a Jsonb. + * Used, for example, when parsing JSON input. * - * Initial state of *JsonbParseState is NULL, since it'll be allocated here - * originally (caller will get JsonbParseState back by reference). + * *pstate is typically initialized to all-zeroes, except that the caller + * may provide outcontext and/or escontext. (escontext is ignored by this + * function and its subroutines, however.) + * + * "seq" tells what is being pushed (start/end of array or object, key, + * value, etc). WJB_DONE is not used here, but the other values of + * JsonbIteratorToken are. We assume the caller passes a valid sequence + * of values. + * + * The passed "jbval" is typically transient storage, such as a local variable. + * We will copy it into the outcontext (CurrentMemoryContext by default). + * If outcontext isn't NULL, we will also make copies of any pass-by-reference + * scalar values. * * Only sequential tokens pertaining to non-container types should pass a * JsonbValue. There is one exception -- WJB_BEGIN_ARRAY callers may pass a @@ -562,18 +574,31 @@ fillJsonbValue(JsonbContainer *container, int index, * * Values of type jbvBinary, which are rolled up arrays and objects, * are unpacked before being added to the result. + * + * At the end of construction of a JsonbValue, pstate->result will reference + * the top-level JsonbValue object. */ -JsonbValue * -pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq, +void +pushJsonbValue(JsonbInState *pstate, JsonbIteratorToken seq, JsonbValue *jbval) { JsonbIterator *it; - JsonbValue *res = NULL; JsonbValue v; JsonbIteratorToken tok; int i; - if (jbval && (seq == WJB_ELEM || seq == WJB_VALUE) && jbval->type == jbvObject) + /* + * pushJsonbValueScalar handles all cases not involving pushing a + * container object as an ELEM or VALUE. + */ + if (!jbval || (seq != WJB_ELEM && seq != WJB_VALUE)) + { + pushJsonbValueScalar(pstate, seq, jbval); + return; + } + + /* If an object or array is pushed, recursively push its contents */ + if (jbval->type == jbvObject) { pushJsonbValue(pstate, WJB_BEGIN_OBJECT, NULL); for (i = 0; i < jbval->val.object.nPairs; i++) @@ -581,32 +606,34 @@ pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq, pushJsonbValue(pstate, WJB_KEY, &jbval->val.object.pairs[i].key); pushJsonbValue(pstate, WJB_VALUE, &jbval->val.object.pairs[i].value); } - - return pushJsonbValue(pstate, WJB_END_OBJECT, NULL); + pushJsonbValue(pstate, WJB_END_OBJECT, NULL); + return; } - if (jbval && (seq == WJB_ELEM || seq == WJB_VALUE) && jbval->type == jbvArray) + if (jbval->type == jbvArray) { pushJsonbValue(pstate, WJB_BEGIN_ARRAY, NULL); for (i = 0; i < jbval->val.array.nElems; i++) { pushJsonbValue(pstate, WJB_ELEM, &jbval->val.array.elems[i]); } - - return pushJsonbValue(pstate, WJB_END_ARRAY, NULL); + pushJsonbValue(pstate, WJB_END_ARRAY, NULL); + return; } - if (!jbval || (seq != WJB_ELEM && seq != WJB_VALUE) || - jbval->type != jbvBinary) + /* If it's not a jbvBinary value, again it goes to pushJsonbValueScalar */ + if (jbval->type != jbvBinary) { - /* drop through */ - return pushJsonbValueScalar(pstate, seq, jbval); + pushJsonbValueScalar(pstate, seq, jbval); + return; } - /* unpack the binary and add each piece to the pstate */ + /* Iterate through the jbvBinary value and push its contents */ it = JsonbIteratorInit(jbval->val.binary.data); - if ((jbval->val.binary.data->header & JB_FSCALAR) && *pstate) + /* ... with a special case for pushing a raw scalar */ + if ((jbval->val.binary.data->header & JB_FSCALAR) && + pstate->parseState != NULL) { tok = JsonbIteratorNext(&it, &v, true); Assert(tok == WJB_BEGIN_ARRAY); @@ -615,141 +642,160 @@ pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq, tok = JsonbIteratorNext(&it, &v, true); Assert(tok == WJB_ELEM); - res = pushJsonbValueScalar(pstate, seq, &v); + pushJsonbValueScalar(pstate, seq, &v); tok = JsonbIteratorNext(&it, &v, true); Assert(tok == WJB_END_ARRAY); Assert(it == NULL); - return res; + return; } while ((tok = JsonbIteratorNext(&it, &v, false)) != WJB_DONE) - res = pushJsonbValueScalar(pstate, tok, - tok < WJB_BEGIN_ARRAY || - (tok == WJB_BEGIN_ARRAY && - v.val.array.rawScalar) ? &v : NULL); - - return res; + pushJsonbValueScalar(pstate, tok, + tok < WJB_BEGIN_ARRAY || + (tok == WJB_BEGIN_ARRAY && + v.val.array.rawScalar) ? &v : NULL); } /* * Do the actual pushing, with only scalar or pseudo-scalar-array values * accepted. */ -static JsonbValue * -pushJsonbValueScalar(JsonbParseState **pstate, JsonbIteratorToken seq, +static void +pushJsonbValueScalar(JsonbInState *pstate, JsonbIteratorToken seq, JsonbValue *scalarVal) { - JsonbValue *result = NULL; + JsonbParseState *ppstate; + JsonbValue *val; + MemoryContext outcontext; switch (seq) { case WJB_BEGIN_ARRAY: Assert(!scalarVal || scalarVal->val.array.rawScalar); - *pstate = pushState(pstate); - result = &(*pstate)->contVal; - (*pstate)->contVal.type = jbvArray; - (*pstate)->contVal.val.array.nElems = 0; - (*pstate)->contVal.val.array.rawScalar = (scalarVal && - scalarVal->val.array.rawScalar); + ppstate = pushState(pstate); + val = &ppstate->contVal; + val->type = jbvArray; + val->val.array.nElems = 0; + val->val.array.rawScalar = (scalarVal && + scalarVal->val.array.rawScalar); if (scalarVal && scalarVal->val.array.nElems > 0) { /* Assume that this array is still really a scalar */ Assert(scalarVal->type == jbvArray); - (*pstate)->size = scalarVal->val.array.nElems; + ppstate->size = scalarVal->val.array.nElems; } else { - (*pstate)->size = 4; + ppstate->size = 4; /* initial guess at array size */ } - (*pstate)->contVal.val.array.elems = palloc(sizeof(JsonbValue) * - (*pstate)->size); + outcontext = pstate->outcontext ? pstate->outcontext : CurrentMemoryContext; + val->val.array.elems = MemoryContextAlloc(outcontext, + sizeof(JsonbValue) * + ppstate->size); break; case WJB_BEGIN_OBJECT: Assert(!scalarVal); - *pstate = pushState(pstate); - result = &(*pstate)->contVal; - (*pstate)->contVal.type = jbvObject; - (*pstate)->contVal.val.object.nPairs = 0; - (*pstate)->size = 4; - (*pstate)->contVal.val.object.pairs = palloc(sizeof(JsonbPair) * - (*pstate)->size); + ppstate = pushState(pstate); + val = &ppstate->contVal; + val->type = jbvObject; + val->val.object.nPairs = 0; + ppstate->size = 4; /* initial guess at object size */ + outcontext = pstate->outcontext ? pstate->outcontext : CurrentMemoryContext; + val->val.object.pairs = MemoryContextAlloc(outcontext, + sizeof(JsonbPair) * + ppstate->size); break; case WJB_KEY: Assert(scalarVal->type == jbvString); - appendKey(*pstate, scalarVal); + appendKey(pstate, scalarVal, true); break; case WJB_VALUE: Assert(IsAJsonbScalar(scalarVal)); - appendValue(*pstate, scalarVal); + appendValue(pstate, scalarVal, true); break; case WJB_ELEM: Assert(IsAJsonbScalar(scalarVal)); - appendElement(*pstate, scalarVal); + appendElement(pstate, scalarVal, true); break; case WJB_END_OBJECT: - uniqueifyJsonbObject(&(*pstate)->contVal, - (*pstate)->unique_keys, - (*pstate)->skip_nulls); + ppstate = pstate->parseState; + uniqueifyJsonbObject(&ppstate->contVal, + ppstate->unique_keys, + ppstate->skip_nulls); /* fall through! */ case WJB_END_ARRAY: /* Steps here common to WJB_END_OBJECT case */ Assert(!scalarVal); - result = &(*pstate)->contVal; + ppstate = pstate->parseState; + val = &ppstate->contVal; /* * Pop stack and push current array/object as value in parent - * array/object + * array/object, or return it as the final result. We don't need + * to re-copy any scalars that are in the data structure. */ - *pstate = (*pstate)->next; - if (*pstate) + pstate->parseState = ppstate = ppstate->next; + if (ppstate) { - switch ((*pstate)->contVal.type) + switch (ppstate->contVal.type) { case jbvArray: - appendElement(*pstate, result); + appendElement(pstate, val, false); break; case jbvObject: - appendValue(*pstate, result); + appendValue(pstate, val, false); break; default: elog(ERROR, "invalid jsonb container type"); } } + else + pstate->result = val; break; default: elog(ERROR, "unrecognized jsonb sequential processing token"); } - - return result; } /* - * pushJsonbValue() worker: Iteration-like forming of Jsonb + * Push a new JsonbParseState onto the JsonbInState's stack + * + * As a notational convenience, the new state's address is returned. + * The caller must initialize the new state's contVal and size fields. */ static JsonbParseState * -pushState(JsonbParseState **pstate) +pushState(JsonbInState *pstate) { - JsonbParseState *ns = palloc(sizeof(JsonbParseState)); + MemoryContext outcontext = pstate->outcontext ? pstate->outcontext : CurrentMemoryContext; + JsonbParseState *ns = MemoryContextAlloc(outcontext, + sizeof(JsonbParseState)); - ns->next = *pstate; + ns->next = pstate->parseState; + /* This module never changes these fields, but callers can: */ ns->unique_keys = false; ns->skip_nulls = false; + pstate->parseState = ns; return ns; } /* - * pushJsonbValue() worker: Append a pair key to state when generating a Jsonb + * pushJsonbValue() worker: Append a pair key to pstate */ static void -appendKey(JsonbParseState *pstate, JsonbValue *string) +appendKey(JsonbInState *pstate, JsonbValue *string, bool needCopy) { - JsonbValue *object = &pstate->contVal; + JsonbValue locstring = *string; + JsonbParseState *ppstate = pstate->parseState; + JsonbValue *object = &ppstate->contVal; Assert(object->type == jbvObject); - Assert(string->type == jbvString); + Assert(locstring.type == jbvString); + + if (needCopy) + copyScalarSubstructure(&locstring, pstate->outcontext); if (object->val.object.nPairs >= JSONB_MAX_PAIRS) ereport(ERROR, @@ -757,55 +803,136 @@ appendKey(JsonbParseState *pstate, JsonbValue *string) errmsg("number of jsonb object pairs exceeds the maximum allowed (%zu)", JSONB_MAX_PAIRS))); - if (object->val.object.nPairs >= pstate->size) + if (object->val.object.nPairs >= ppstate->size) { - pstate->size *= 2; + ppstate->size *= 2; object->val.object.pairs = repalloc(object->val.object.pairs, - sizeof(JsonbPair) * pstate->size); + sizeof(JsonbPair) * ppstate->size); } - object->val.object.pairs[object->val.object.nPairs].key = *string; + object->val.object.pairs[object->val.object.nPairs].key = locstring; object->val.object.pairs[object->val.object.nPairs].order = object->val.object.nPairs; } /* - * pushJsonbValue() worker: Append a pair value to state when generating a - * Jsonb + * pushJsonbValue() worker: Append a pair value to pstate */ static void -appendValue(JsonbParseState *pstate, JsonbValue *scalarVal) +appendValue(JsonbInState *pstate, JsonbValue *scalarVal, bool needCopy) { - JsonbValue *object = &pstate->contVal; + JsonbValue locscalar = *scalarVal; + JsonbValue *object = &pstate->parseState->contVal; Assert(object->type == jbvObject); - object->val.object.pairs[object->val.object.nPairs++].value = *scalarVal; + if (needCopy) + copyScalarSubstructure(&locscalar, pstate->outcontext); + + object->val.object.pairs[object->val.object.nPairs++].value = locscalar; } /* - * pushJsonbValue() worker: Append an element to state when generating a Jsonb + * pushJsonbValue() worker: Append an array element to pstate */ static void -appendElement(JsonbParseState *pstate, JsonbValue *scalarVal) +appendElement(JsonbInState *pstate, JsonbValue *scalarVal, bool needCopy) { - JsonbValue *array = &pstate->contVal; + JsonbValue locscalar = *scalarVal; + JsonbParseState *ppstate = pstate->parseState; + JsonbValue *array = &ppstate->contVal; Assert(array->type == jbvArray); + if (needCopy) + copyScalarSubstructure(&locscalar, pstate->outcontext); + if (array->val.array.nElems >= JSONB_MAX_ELEMS) ereport(ERROR, (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), errmsg("number of jsonb array elements exceeds the maximum allowed (%zu)", JSONB_MAX_ELEMS))); - if (array->val.array.nElems >= pstate->size) + if (array->val.array.nElems >= ppstate->size) { - pstate->size *= 2; + ppstate->size *= 2; array->val.array.elems = repalloc(array->val.array.elems, - sizeof(JsonbValue) * pstate->size); + sizeof(JsonbValue) * ppstate->size); } - array->val.array.elems[array->val.array.nElems++] = *scalarVal; + array->val.array.elems[array->val.array.nElems++] = locscalar; +} + +/* + * Copy any infrastructure of a scalar JsonbValue into the outcontext, + * adjusting the pointer(s) in *v. + * + * We need not deal with containers here, as the routines above ensure + * that they are built fresh. + */ +static void +copyScalarSubstructure(JsonbValue *v, MemoryContext outcontext) +{ + /* Nothing to do if caller did not specify an outcontext */ + if (outcontext == NULL) + return; + switch (v->type) + { + case jbvNull: + case jbvBool: + /* pass-by-value, nothing to do */ + break; + case jbvString: + { + char *buf = MemoryContextAlloc(outcontext, + v->val.string.len); + + memcpy(buf, v->val.string.val, v->val.string.len); + v->val.string.val = buf; + } + break; + case jbvNumeric: + { + MemoryContext oldcontext = MemoryContextSwitchTo(outcontext); + + v->val.numeric = + DatumGetNumeric(datumCopy(NumericGetDatum(v->val.numeric), + false, -1)); + MemoryContextSwitchTo(oldcontext); + } + break; + case jbvDatetime: + { + MemoryContext oldcontext = MemoryContextSwitchTo(outcontext); + + switch (v->val.datetime.typid) + { + case DATEOID: + /* always pass-by-value */ + break; + case TIMETZOID: + /* always pass-by-reference */ + v->val.datetime.value = datumCopy(v->val.datetime.value, + false, 12); + break; + case TIMEOID: + case TIMESTAMPOID: + case TIMESTAMPTZOID: + /* pass-by-value when float8 is */ +#ifndef USE_FLOAT8_BYVAL + v->val.datetime.value = datumCopy(v->val.datetime.value, + false, 8); +#endif + break; + default: + elog(ERROR, "unexpected jsonb datetime type oid %u", + v->val.datetime.typid); + } + MemoryContextSwitchTo(oldcontext); + } + break; + default: + elog(ERROR, "invalid jsonb scalar type"); + } } /* diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c index bcb1720b6cd..f04fdd6833b 100644 --- a/src/backend/utils/adt/jsonfuncs.c +++ b/src/backend/utils/adt/jsonfuncs.c @@ -475,18 +475,18 @@ static Datum populate_domain(DomainIOData *io, Oid typid, const char *colname, Node *escontext, bool omit_quotes); /* functions supporting jsonb_delete, jsonb_set and jsonb_concat */ -static JsonbValue *IteratorConcat(JsonbIterator **it1, JsonbIterator **it2, - JsonbParseState **state); -static JsonbValue *setPath(JsonbIterator **it, Datum *path_elems, - bool *path_nulls, int path_len, - JsonbParseState **st, int level, JsonbValue *newval, - int op_type); +static void IteratorConcat(JsonbIterator **it1, JsonbIterator **it2, + JsonbInState *state); +static void setPath(JsonbIterator **it, Datum *path_elems, + bool *path_nulls, int path_len, + JsonbInState *st, int level, JsonbValue *newval, + int op_type); static void setPathObject(JsonbIterator **it, Datum *path_elems, - bool *path_nulls, int path_len, JsonbParseState **st, + bool *path_nulls, int path_len, JsonbInState *st, int level, JsonbValue *newval, uint32 npairs, int op_type); static void setPathArray(JsonbIterator **it, Datum *path_elems, - bool *path_nulls, int path_len, JsonbParseState **st, + bool *path_nulls, int path_len, JsonbInState *st, int level, JsonbValue *newval, uint32 nelems, int op_type); @@ -1679,8 +1679,7 @@ Datum jsonb_set_element(Jsonb *jb, Datum *path, int path_len, JsonbValue *newval) { - JsonbValue *res; - JsonbParseState *state = NULL; + JsonbInState state = {0}; JsonbIterator *it; bool *path_nulls = palloc0(path_len * sizeof(bool)); @@ -1689,17 +1688,17 @@ jsonb_set_element(Jsonb *jb, Datum *path, int path_len, it = JsonbIteratorInit(&jb->root); - res = setPath(&it, path, path_nulls, path_len, &state, 0, newval, - JB_PATH_CREATE | JB_PATH_FILL_GAPS | - JB_PATH_CONSISTENT_POSITION); + setPath(&it, path, path_nulls, path_len, &state, 0, newval, + JB_PATH_CREATE | JB_PATH_FILL_GAPS | + JB_PATH_CONSISTENT_POSITION); pfree(path_nulls); - PG_RETURN_JSONB_P(JsonbValueToJsonb(res)); + PG_RETURN_JSONB_P(JsonbValueToJsonb(state.result)); } static void -push_null_elements(JsonbParseState **ps, int num) +push_null_elements(JsonbInState *ps, int num) { JsonbValue null; @@ -1718,7 +1717,7 @@ push_null_elements(JsonbParseState **ps, int num) * Caller is responsible to make sure such path does not exist yet. */ static void -push_path(JsonbParseState **st, int level, Datum *path_elems, +push_path(JsonbInState *st, int level, Datum *path_elems, bool *path_nulls, int path_len, JsonbValue *newval) { /* @@ -1758,15 +1757,15 @@ push_path(JsonbParseState **st, int level, Datum *path_elems, newkey.val.string.val = c; newkey.val.string.len = strlen(c); - (void) pushJsonbValue(st, WJB_BEGIN_OBJECT, NULL); - (void) pushJsonbValue(st, WJB_KEY, &newkey); + pushJsonbValue(st, WJB_BEGIN_OBJECT, NULL); + pushJsonbValue(st, WJB_KEY, &newkey); tpath[i - level] = jbvObject; } else { /* integer, an array is expected */ - (void) pushJsonbValue(st, WJB_BEGIN_ARRAY, NULL); + pushJsonbValue(st, WJB_BEGIN_ARRAY, NULL); push_null_elements(st, lindex); @@ -1776,11 +1775,9 @@ push_path(JsonbParseState **st, int level, Datum *path_elems, /* Insert an actual value for either an object or array */ if (tpath[(path_len - level) - 1] == jbvArray) - { - (void) pushJsonbValue(st, WJB_ELEM, newval); - } + pushJsonbValue(st, WJB_ELEM, newval); else - (void) pushJsonbValue(st, WJB_VALUE, newval); + pushJsonbValue(st, WJB_VALUE, newval); /* * Close everything up to the last but one level. The last one will be @@ -1792,9 +1789,9 @@ push_path(JsonbParseState **st, int level, Datum *path_elems, break; if (tpath[i - level] == jbvObject) - (void) pushJsonbValue(st, WJB_END_OBJECT, NULL); + pushJsonbValue(st, WJB_END_OBJECT, NULL); else - (void) pushJsonbValue(st, WJB_END_ARRAY, NULL); + pushJsonbValue(st, WJB_END_ARRAY, NULL); } } @@ -4542,8 +4539,7 @@ jsonb_strip_nulls(PG_FUNCTION_ARGS) Jsonb *jb = PG_GETARG_JSONB_P(0); bool strip_in_arrays = false; JsonbIterator *it; - JsonbParseState *parseState = NULL; - JsonbValue *res = NULL; + JsonbInState parseState = {0}; JsonbValue v, k; JsonbIteratorToken type; @@ -4579,7 +4575,7 @@ jsonb_strip_nulls(PG_FUNCTION_ARGS) continue; /* otherwise, do a delayed push of the key */ - (void) pushJsonbValue(&parseState, WJB_KEY, &k); + pushJsonbValue(&parseState, WJB_KEY, &k); } /* if strip_in_arrays is set, also skip null array elements */ @@ -4588,14 +4584,12 @@ jsonb_strip_nulls(PG_FUNCTION_ARGS) continue; if (type == WJB_VALUE || type == WJB_ELEM) - res = pushJsonbValue(&parseState, type, &v); + pushJsonbValue(&parseState, type, &v); else - res = pushJsonbValue(&parseState, type, NULL); + pushJsonbValue(&parseState, type, NULL); } - Assert(res != NULL); - - PG_RETURN_POINTER(JsonbValueToJsonb(res)); + PG_RETURN_POINTER(JsonbValueToJsonb(parseState.result)); } /* @@ -4624,8 +4618,7 @@ jsonb_concat(PG_FUNCTION_ARGS) { Jsonb *jb1 = PG_GETARG_JSONB_P(0); Jsonb *jb2 = PG_GETARG_JSONB_P(1); - JsonbParseState *state = NULL; - JsonbValue *res; + JsonbInState state = {0}; JsonbIterator *it1, *it2; @@ -4646,11 +4639,9 @@ jsonb_concat(PG_FUNCTION_ARGS) it1 = JsonbIteratorInit(&jb1->root); it2 = JsonbIteratorInit(&jb2->root); - res = IteratorConcat(&it1, &it2, &state); - - Assert(res != NULL); + IteratorConcat(&it1, &it2, &state); - PG_RETURN_JSONB_P(JsonbValueToJsonb(res)); + PG_RETURN_JSONB_P(JsonbValueToJsonb(state.result)); } @@ -4667,10 +4658,9 @@ jsonb_delete(PG_FUNCTION_ARGS) text *key = PG_GETARG_TEXT_PP(1); char *keyptr = VARDATA_ANY(key); int keylen = VARSIZE_ANY_EXHDR(key); - JsonbParseState *state = NULL; + JsonbInState pstate = {0}; JsonbIterator *it; - JsonbValue v, - *res = NULL; + JsonbValue v; bool skipNested = false; JsonbIteratorToken r; @@ -4699,12 +4689,10 @@ jsonb_delete(PG_FUNCTION_ARGS) continue; } - res = pushJsonbValue(&state, r, r < WJB_BEGIN_ARRAY ? &v : NULL); + pushJsonbValue(&pstate, r, r < WJB_BEGIN_ARRAY ? &v : NULL); } - Assert(res != NULL); - - PG_RETURN_JSONB_P(JsonbValueToJsonb(res)); + PG_RETURN_JSONB_P(JsonbValueToJsonb(pstate.result)); } /* @@ -4721,10 +4709,9 @@ jsonb_delete_array(PG_FUNCTION_ARGS) Datum *keys_elems; bool *keys_nulls; int keys_len; - JsonbParseState *state = NULL; + JsonbInState pstate = {0}; JsonbIterator *it; - JsonbValue v, - *res = NULL; + JsonbValue v; bool skipNested = false; JsonbIteratorToken r; @@ -4785,12 +4772,10 @@ jsonb_delete_array(PG_FUNCTION_ARGS) } } - res = pushJsonbValue(&state, r, r < WJB_BEGIN_ARRAY ? &v : NULL); + pushJsonbValue(&pstate, r, r < WJB_BEGIN_ARRAY ? &v : NULL); } - Assert(res != NULL); - - PG_RETURN_JSONB_P(JsonbValueToJsonb(res)); + PG_RETURN_JSONB_P(JsonbValueToJsonb(pstate.result)); } /* @@ -4805,12 +4790,11 @@ jsonb_delete_idx(PG_FUNCTION_ARGS) { Jsonb *in = PG_GETARG_JSONB_P(0); int idx = PG_GETARG_INT32(1); - JsonbParseState *state = NULL; + JsonbInState pstate = {0}; JsonbIterator *it; uint32 i = 0, n; - JsonbValue v, - *res = NULL; + JsonbValue v; JsonbIteratorToken r; if (JB_ROOT_IS_SCALAR(in)) @@ -4843,7 +4827,7 @@ jsonb_delete_idx(PG_FUNCTION_ARGS) if (idx >= n) PG_RETURN_JSONB_P(in); - pushJsonbValue(&state, r, NULL); + pushJsonbValue(&pstate, r, NULL); while ((r = JsonbIteratorNext(&it, &v, true)) != WJB_DONE) { @@ -4853,12 +4837,10 @@ jsonb_delete_idx(PG_FUNCTION_ARGS) continue; } - res = pushJsonbValue(&state, r, r < WJB_BEGIN_ARRAY ? &v : NULL); + pushJsonbValue(&pstate, r, r < WJB_BEGIN_ARRAY ? &v : NULL); } - Assert(res != NULL); - - PG_RETURN_JSONB_P(JsonbValueToJsonb(res)); + PG_RETURN_JSONB_P(JsonbValueToJsonb(pstate.result)); } /* @@ -4872,12 +4854,11 @@ jsonb_set(PG_FUNCTION_ARGS) Jsonb *newjsonb = PG_GETARG_JSONB_P(2); JsonbValue newval; bool create = PG_GETARG_BOOL(3); - JsonbValue *res = NULL; Datum *path_elems; bool *path_nulls; int path_len; JsonbIterator *it; - JsonbParseState *st = NULL; + JsonbInState st = {0}; JsonbToJsonbValue(newjsonb, &newval); @@ -4901,12 +4882,10 @@ jsonb_set(PG_FUNCTION_ARGS) it = JsonbIteratorInit(&in->root); - res = setPath(&it, path_elems, path_nulls, path_len, &st, - 0, &newval, create ? JB_PATH_CREATE : JB_PATH_REPLACE); - - Assert(res != NULL); + setPath(&it, path_elems, path_nulls, path_len, &st, + 0, &newval, create ? JB_PATH_CREATE : JB_PATH_REPLACE); - PG_RETURN_JSONB_P(JsonbValueToJsonb(res)); + PG_RETURN_JSONB_P(JsonbValueToJsonb(st.result)); } @@ -4985,12 +4964,11 @@ jsonb_delete_path(PG_FUNCTION_ARGS) { Jsonb *in = PG_GETARG_JSONB_P(0); ArrayType *path = PG_GETARG_ARRAYTYPE_P(1); - JsonbValue *res = NULL; Datum *path_elems; bool *path_nulls; int path_len; JsonbIterator *it; - JsonbParseState *st = NULL; + JsonbInState st = {0}; if (ARR_NDIM(path) > 1) ereport(ERROR, @@ -5012,12 +4990,10 @@ jsonb_delete_path(PG_FUNCTION_ARGS) it = JsonbIteratorInit(&in->root); - res = setPath(&it, path_elems, path_nulls, path_len, &st, - 0, NULL, JB_PATH_DELETE); + setPath(&it, path_elems, path_nulls, path_len, &st, + 0, NULL, JB_PATH_DELETE); - Assert(res != NULL); - - PG_RETURN_JSONB_P(JsonbValueToJsonb(res)); + PG_RETURN_JSONB_P(JsonbValueToJsonb(st.result)); } /* @@ -5031,12 +5007,11 @@ jsonb_insert(PG_FUNCTION_ARGS) Jsonb *newjsonb = PG_GETARG_JSONB_P(2); JsonbValue newval; bool after = PG_GETARG_BOOL(3); - JsonbValue *res = NULL; Datum *path_elems; bool *path_nulls; int path_len; JsonbIterator *it; - JsonbParseState *st = NULL; + JsonbInState st = {0}; JsonbToJsonbValue(newjsonb, &newval); @@ -5057,12 +5032,10 @@ jsonb_insert(PG_FUNCTION_ARGS) it = JsonbIteratorInit(&in->root); - res = setPath(&it, path_elems, path_nulls, path_len, &st, 0, &newval, - after ? JB_PATH_INSERT_AFTER : JB_PATH_INSERT_BEFORE); - - Assert(res != NULL); + setPath(&it, path_elems, path_nulls, path_len, &st, 0, &newval, + after ? JB_PATH_INSERT_AFTER : JB_PATH_INSERT_BEFORE); - PG_RETURN_JSONB_P(JsonbValueToJsonb(res)); + PG_RETURN_JSONB_P(JsonbValueToJsonb(st.result)); } /* @@ -5072,13 +5045,12 @@ jsonb_insert(PG_FUNCTION_ARGS) * In that case we just append the content of it2 to it1 without any * verifications. */ -static JsonbValue * +static void IteratorConcat(JsonbIterator **it1, JsonbIterator **it2, - JsonbParseState **state) + JsonbInState *state) { JsonbValue v1, - v2, - *res = NULL; + v2; JsonbIteratorToken r1, r2, rk1, @@ -5109,7 +5081,7 @@ IteratorConcat(JsonbIterator **it1, JsonbIterator **it2, * automatically override the value from the first object. */ while ((r2 = JsonbIteratorNext(it2, &v2, true)) != WJB_DONE) - res = pushJsonbValue(state, r2, r2 != WJB_END_OBJECT ? &v2 : NULL); + pushJsonbValue(state, r2, r2 != WJB_END_OBJECT ? &v2 : NULL); } else if (rk1 == WJB_BEGIN_ARRAY && rk2 == WJB_BEGIN_ARRAY) { @@ -5130,7 +5102,7 @@ IteratorConcat(JsonbIterator **it1, JsonbIterator **it2, pushJsonbValue(state, WJB_ELEM, &v2); } - res = pushJsonbValue(state, WJB_END_ARRAY, NULL /* signal to sort */ ); + pushJsonbValue(state, WJB_END_ARRAY, NULL /* signal to sort */ ); } else if (rk1 == WJB_BEGIN_OBJECT) { @@ -5146,7 +5118,7 @@ IteratorConcat(JsonbIterator **it1, JsonbIterator **it2, pushJsonbValue(state, r1, r1 != WJB_END_OBJECT ? &v1 : NULL); while ((r2 = JsonbIteratorNext(it2, &v2, true)) != WJB_DONE) - res = pushJsonbValue(state, r2, r2 != WJB_END_ARRAY ? &v2 : NULL); + pushJsonbValue(state, r2, r2 != WJB_END_ARRAY ? &v2 : NULL); } else { @@ -5165,10 +5137,8 @@ IteratorConcat(JsonbIterator **it1, JsonbIterator **it2, while ((r2 = JsonbIteratorNext(it2, &v2, true)) != WJB_DONE) pushJsonbValue(state, r2, r2 != WJB_END_OBJECT ? &v2 : NULL); - res = pushJsonbValue(state, WJB_END_ARRAY, NULL); + pushJsonbValue(state, WJB_END_ARRAY, NULL); } - - return res; } /* @@ -5200,14 +5170,13 @@ IteratorConcat(JsonbIterator **it1, JsonbIterator **it2, * All path elements before the last must already exist * whatever bits in op_type are set, or nothing is done. */ -static JsonbValue * +static void setPath(JsonbIterator **it, Datum *path_elems, bool *path_nulls, int path_len, - JsonbParseState **st, int level, JsonbValue *newval, int op_type) + JsonbInState *st, int level, JsonbValue *newval, int op_type) { JsonbValue v; JsonbIteratorToken r; - JsonbValue *res; check_stack_depth(); @@ -5237,20 +5206,20 @@ setPath(JsonbIterator **it, Datum *path_elems, errdetail("The path assumes key is a composite object, " "but it is a scalar value."))); - (void) pushJsonbValue(st, r, NULL); + pushJsonbValue(st, r, NULL); setPathArray(it, path_elems, path_nulls, path_len, st, level, newval, v.val.array.nElems, op_type); r = JsonbIteratorNext(it, &v, false); Assert(r == WJB_END_ARRAY); - res = pushJsonbValue(st, r, NULL); + pushJsonbValue(st, r, NULL); break; case WJB_BEGIN_OBJECT: - (void) pushJsonbValue(st, r, NULL); + pushJsonbValue(st, r, NULL); setPathObject(it, path_elems, path_nulls, path_len, st, level, newval, v.val.object.nPairs, op_type); r = JsonbIteratorNext(it, &v, true); Assert(r == WJB_END_OBJECT); - res = pushJsonbValue(st, r, NULL); + pushJsonbValue(st, r, NULL); break; case WJB_ELEM: case WJB_VALUE: @@ -5268,15 +5237,12 @@ setPath(JsonbIterator **it, Datum *path_elems, errdetail("The path assumes key is a composite object, " "but it is a scalar value."))); - res = pushJsonbValue(st, r, &v); + pushJsonbValue(st, r, &v); break; default: elog(ERROR, "unrecognized iterator result: %d", (int) r); - res = NULL; /* keep compiler quiet */ break; } - - return res; } /* @@ -5284,7 +5250,7 @@ setPath(JsonbIterator **it, Datum *path_elems, */ static void setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, - int path_len, JsonbParseState **st, int level, + int path_len, JsonbInState *st, int level, JsonbValue *newval, uint32 npairs, int op_type) { text *pathelem = NULL; @@ -5311,8 +5277,8 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, newkey.val.string.val = VARDATA_ANY(pathelem); newkey.val.string.len = VARSIZE_ANY_EXHDR(pathelem); - (void) pushJsonbValue(st, WJB_KEY, &newkey); - (void) pushJsonbValue(st, WJB_VALUE, newval); + pushJsonbValue(st, WJB_KEY, &newkey); + pushJsonbValue(st, WJB_VALUE, newval); } for (i = 0; i < npairs; i++) @@ -5344,13 +5310,13 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, r = JsonbIteratorNext(it, &v, true); /* skip value */ if (!(op_type & JB_PATH_DELETE)) { - (void) pushJsonbValue(st, WJB_KEY, &k); - (void) pushJsonbValue(st, WJB_VALUE, newval); + pushJsonbValue(st, WJB_KEY, &k); + pushJsonbValue(st, WJB_VALUE, newval); } } else { - (void) pushJsonbValue(st, r, &k); + pushJsonbValue(st, r, &k); setPath(it, path_elems, path_nulls, path_len, st, level + 1, newval, op_type); } @@ -5366,13 +5332,13 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, newkey.val.string.val = VARDATA_ANY(pathelem); newkey.val.string.len = VARSIZE_ANY_EXHDR(pathelem); - (void) pushJsonbValue(st, WJB_KEY, &newkey); - (void) pushJsonbValue(st, WJB_VALUE, newval); + pushJsonbValue(st, WJB_KEY, &newkey); + pushJsonbValue(st, WJB_VALUE, newval); } - (void) pushJsonbValue(st, r, &k); + pushJsonbValue(st, r, &k); r = JsonbIteratorNext(it, &v, false); - (void) pushJsonbValue(st, r, r < WJB_BEGIN_ARRAY ? &v : NULL); + pushJsonbValue(st, r, r < WJB_BEGIN_ARRAY ? &v : NULL); if (r == WJB_BEGIN_ARRAY || r == WJB_BEGIN_OBJECT) { int walking_level = 1; @@ -5386,7 +5352,7 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, if (r == WJB_END_ARRAY || r == WJB_END_OBJECT) --walking_level; - (void) pushJsonbValue(st, r, r < WJB_BEGIN_ARRAY ? &v : NULL); + pushJsonbValue(st, r, r < WJB_BEGIN_ARRAY ? &v : NULL); } } } @@ -5410,9 +5376,9 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, newkey.val.string.val = VARDATA_ANY(pathelem); newkey.val.string.len = VARSIZE_ANY_EXHDR(pathelem); - (void) pushJsonbValue(st, WJB_KEY, &newkey); - (void) push_path(st, level, path_elems, path_nulls, - path_len, newval); + pushJsonbValue(st, WJB_KEY, &newkey); + push_path(st, level, path_elems, path_nulls, + path_len, newval); /* Result is closed with WJB_END_OBJECT outside of this function */ } @@ -5423,7 +5389,7 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, */ static void setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls, - int path_len, JsonbParseState **st, int level, + int path_len, JsonbInState *st, int level, JsonbValue *newval, uint32 nelems, int op_type) { JsonbValue v; @@ -5491,7 +5457,7 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls, if (op_type & JB_PATH_FILL_GAPS && nelems == 0 && idx > 0) push_null_elements(st, idx); - (void) pushJsonbValue(st, WJB_ELEM, newval); + pushJsonbValue(st, WJB_ELEM, newval); done = true; } @@ -5510,7 +5476,7 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls, r = JsonbIteratorNext(it, &v, true); /* skip */ if (op_type & (JB_PATH_INSERT_BEFORE | JB_PATH_CREATE)) - (void) pushJsonbValue(st, WJB_ELEM, newval); + pushJsonbValue(st, WJB_ELEM, newval); /* * We should keep current value only in case of @@ -5518,20 +5484,20 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls, * otherwise it should be deleted or replaced */ if (op_type & (JB_PATH_INSERT_AFTER | JB_PATH_INSERT_BEFORE)) - (void) pushJsonbValue(st, r, &v); + pushJsonbValue(st, r, &v); if (op_type & (JB_PATH_INSERT_AFTER | JB_PATH_REPLACE)) - (void) pushJsonbValue(st, WJB_ELEM, newval); + pushJsonbValue(st, WJB_ELEM, newval); } else - (void) setPath(it, path_elems, path_nulls, path_len, - st, level + 1, newval, op_type); + setPath(it, path_elems, path_nulls, path_len, + st, level + 1, newval, op_type); } else { r = JsonbIteratorNext(it, &v, false); - (void) pushJsonbValue(st, r, r < WJB_BEGIN_ARRAY ? &v : NULL); + pushJsonbValue(st, r, r < WJB_BEGIN_ARRAY ? &v : NULL); if (r == WJB_BEGIN_ARRAY || r == WJB_BEGIN_OBJECT) { @@ -5546,7 +5512,7 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls, if (r == WJB_END_ARRAY || r == WJB_END_OBJECT) --walking_level; - (void) pushJsonbValue(st, r, r < WJB_BEGIN_ARRAY ? &v : NULL); + pushJsonbValue(st, r, r < WJB_BEGIN_ARRAY ? &v : NULL); } } } @@ -5561,7 +5527,7 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls, if (op_type & JB_PATH_FILL_GAPS && idx > nelems) push_null_elements(st, idx - nelems); - (void) pushJsonbValue(st, WJB_ELEM, newval); + pushJsonbValue(st, WJB_ELEM, newval); done = true; } @@ -5580,8 +5546,8 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls, if (idx > 0) push_null_elements(st, idx - nelems); - (void) push_path(st, level, path_elems, path_nulls, - path_len, newval); + push_path(st, level, path_elems, path_nulls, + path_len, newval); /* Result is closed with WJB_END_OBJECT outside of this function */ } @@ -5807,10 +5773,9 @@ transform_jsonb_string_values(Jsonb *jsonb, void *action_state, JsonTransformStringValuesAction transform_action) { JsonbIterator *it; - JsonbValue v, - *res = NULL; + JsonbValue v; JsonbIteratorToken type; - JsonbParseState *st = NULL; + JsonbInState st = {0}; text *out; bool is_scalar = false; @@ -5826,20 +5791,20 @@ transform_jsonb_string_values(Jsonb *jsonb, void *action_state, out = pg_detoast_datum_packed(out); v.val.string.val = VARDATA_ANY(out); v.val.string.len = VARSIZE_ANY_EXHDR(out); - res = pushJsonbValue(&st, type, type < WJB_BEGIN_ARRAY ? &v : NULL); + pushJsonbValue(&st, type, type < WJB_BEGIN_ARRAY ? &v : NULL); } else { - res = pushJsonbValue(&st, type, (type == WJB_KEY || - type == WJB_VALUE || - type == WJB_ELEM) ? &v : NULL); + pushJsonbValue(&st, type, (type == WJB_KEY || + type == WJB_VALUE || + type == WJB_ELEM) ? &v : NULL); } } - if (res->type == jbvArray) - res->val.array.rawScalar = is_scalar; + if (st.result->type == jbvArray) + st.result->val.array.rawScalar = is_scalar; - return JsonbValueToJsonb(res); + return JsonbValueToJsonb(st.result); } /* diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c index dbab24737ef..6f67d43020c 100644 --- a/src/backend/utils/adt/jsonpath_exec.c +++ b/src/backend/utils/adt/jsonpath_exec.c @@ -2872,8 +2872,7 @@ executeKeyValueMethod(JsonPathExecContext *cxt, JsonPathItem *jsp, { JsonBaseObjectInfo baseObject; JsonbValue obj; - JsonbParseState *ps; - JsonbValue *keyval; + JsonbInState ps; Jsonb *jsonb; if (tok != WJB_KEY) @@ -2887,7 +2886,8 @@ executeKeyValueMethod(JsonPathExecContext *cxt, JsonPathItem *jsp, tok = JsonbIteratorNext(&it, &val, true); Assert(tok == WJB_VALUE); - ps = NULL; + memset(&ps, 0, sizeof(ps)); + pushJsonbValue(&ps, WJB_BEGIN_OBJECT, NULL); pushJsonbValue(&ps, WJB_KEY, &keystr); @@ -2899,9 +2899,9 @@ executeKeyValueMethod(JsonPathExecContext *cxt, JsonPathItem *jsp, pushJsonbValue(&ps, WJB_KEY, &idstr); pushJsonbValue(&ps, WJB_VALUE, &idval); - keyval = pushJsonbValue(&ps, WJB_END_OBJECT, NULL); + pushJsonbValue(&ps, WJB_END_OBJECT, NULL); - jsonb = JsonbValueToJsonb(keyval); + jsonb = JsonbValueToJsonb(ps.result); JsonbInitBinary(&obj, jsonb); @@ -3647,7 +3647,7 @@ getScalar(JsonbValue *scalar, enum jbvType type) static JsonbValue * wrapItemsInArray(const JsonValueList *items) { - JsonbParseState *ps = NULL; + JsonbInState ps = {0}; JsonValueListIterator it; JsonbValue *jbv; @@ -3657,7 +3657,9 @@ wrapItemsInArray(const JsonValueList *items) while ((jbv = JsonValueListNext(items, &it))) pushJsonbValue(&ps, WJB_ELEM, jbv); - return pushJsonbValue(&ps, WJB_END_ARRAY, NULL); + pushJsonbValue(&ps, WJB_END_ARRAY, NULL); + + return ps.result; } /* Check if the timezone required for casting from type1 to type2 is used */ diff --git a/src/include/utils/jsonb.h b/src/include/utils/jsonb.h index fecb33b9c67..cddcbb63442 100644 --- a/src/include/utils/jsonb.h +++ b/src/include/utils/jsonb.h @@ -67,8 +67,10 @@ typedef enum #define JGINFLAG_HASHED 0x10 /* OR'd into flag if value was hashed */ #define JGIN_MAXLENGTH 125 /* max length of text part before hashing */ +/* Forward struct references */ typedef struct JsonbPair JsonbPair; typedef struct JsonbValue JsonbValue; +typedef struct JsonbParseState JsonbParseState; /* * Jsonbs are varlena objects, so must meet the varlena convention that the @@ -315,15 +317,40 @@ struct JsonbPair uint32 order; /* Pair's index in original sequence */ }; -/* Conversion state used when parsing Jsonb from text, or for type coercion */ -typedef struct JsonbParseState +/* + * State used while constructing or manipulating a JsonbValue. + * For example, when parsing Jsonb from text, we construct a JsonbValue + * data structure and then flatten that into the Jsonb on-disk format. + * JsonbValues are also useful in aggregation and type coercion. + * + * Callers providing a JsonbInState must initialize it to zeroes/nulls, + * except for optionally setting outcontext (if that's left NULL, + * CurrentMemoryContext is used) and escontext (if that's left NULL, + * parsing errors are thrown via ereport). + */ +typedef struct JsonbInState { - JsonbValue contVal; - Size size; - struct JsonbParseState *next; + JsonbValue *result; /* The completed value; NULL until complete */ + MemoryContext outcontext; /* The context to build it in, or NULL */ + struct Node *escontext; /* Optional soft-error-reporting context */ + /* Remaining fields should be treated as private to jsonb.c/jsonb_util.c */ + JsonbParseState *parseState; /* Stack of parsing contexts */ + bool unique_keys; /* Check object key uniqueness */ +} JsonbInState; + +/* + * Parsing context for one level of Jsonb array or object nesting. + * The contVal will be part of the constructed JsonbValue tree, + * but the other fields are just transient state. + */ +struct JsonbParseState +{ + JsonbValue contVal; /* An array or object JsonbValue */ + Size size; /* Allocated length of array or object */ + JsonbParseState *next; /* Link to next outer level, if any */ bool unique_keys; /* Check object key uniqueness */ bool skip_nulls; /* Skip null object fields */ -} JsonbParseState; +}; /* * JsonbIterator holds details of the type for each iteration. It also stores a @@ -404,8 +431,8 @@ extern JsonbValue *getKeyJsonValueFromContainer(JsonbContainer *container, JsonbValue *res); extern JsonbValue *getIthJsonbValueFromContainer(JsonbContainer *container, uint32 i); -extern JsonbValue *pushJsonbValue(JsonbParseState **pstate, - JsonbIteratorToken seq, JsonbValue *jbval); +extern void pushJsonbValue(JsonbInState *pstate, + JsonbIteratorToken seq, JsonbValue *jbval); extern JsonbIterator *JsonbIteratorInit(JsonbContainer *container); extern JsonbIteratorToken JsonbIteratorNext(JsonbIterator **it, JsonbValue *val, bool skipNested); -- 2.43.5 From a1bc3537af7658a6e684ca5440094c9e05218495 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Mon, 21 Jul 2025 22:00:11 -0400 Subject: [PATCH v1 2/3] Remove fundamentally-redundant processing in jsonb_agg() et al. The various variants of jsonb_agg() operate as follows, for each aggregate input value: 1. Build a JsonbValue tree representation of the input value. 2. Flatten the JsonbValue tree into a Jsonb in on-disk format. 3. Iterate through the Jsonb, building a JsonbValue that is part of the aggregate's state stored in aggcontext, but is otherwise identical to what phase 1 built. This is very slightly less silly than it sounds, because phase 1 involves calling non-JSONB code such as datatype output functions, which are likely to leak memory, and we don't want to leak into the aggcontext. Nonetheless, phases 2 and 3 are accomplishing exactly nothing that is useful if we can make phase 1 put the JsonbValue tree where we need it. We could probably do that with a bunch of MemoryContextSwitchTo's, but what seems more robust is to give pushJsonbValue the responsibility of building the JsonbValue tree in a specified non-current memory context. The previous patch created the infrastructure for that, and this patch simply makes the aggregate functions use it and then rips out phases 2 and 3. For me, this makes jsonb_agg() with a text column as input run about 2X faster than before. It's not yet on par with json_agg(), but this removes a whole lot of the difference. --- src/backend/utils/adt/jsonb.c | 303 +++++----------------------------- 1 file changed, 45 insertions(+), 258 deletions(-) diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c index d15d184ef8b..abe0d620566 100644 --- a/src/backend/utils/adt/jsonb.c +++ b/src/backend/utils/adt/jsonb.c @@ -27,7 +27,7 @@ typedef struct JsonbAggState { - JsonbInState *res; + JsonbInState pstate; JsonTypeCategory key_category; Oid key_output_func; JsonTypeCategory val_category; @@ -54,7 +54,6 @@ static void datum_to_jsonb_internal(Datum val, bool is_null, JsonbInState *resul bool key_scalar); static void add_jsonb(Datum val, bool is_null, JsonbInState *result, Oid val_type, bool key_scalar); -static JsonbParseState *clone_parse_state(JsonbParseState *state); static char *JsonbToCStringWorker(StringInfo out, JsonbContainer *in, int estimated_len, bool indent); static void add_indent(StringInfo out, bool indent, int level); @@ -1453,54 +1452,16 @@ close_object: /* - * shallow clone of a parse state, suitable for use in aggregate - * final functions that will only append to the values rather than - * change them. + * Functions for jsonb_agg, jsonb_object_agg, and variants */ -static JsonbParseState * -clone_parse_state(JsonbParseState *state) -{ - JsonbParseState *result, - *icursor, - *ocursor; - - if (state == NULL) - return NULL; - - result = palloc(sizeof(JsonbParseState)); - icursor = state; - ocursor = result; - for (;;) - { - ocursor->contVal = icursor->contVal; - ocursor->size = icursor->size; - ocursor->unique_keys = icursor->unique_keys; - ocursor->skip_nulls = icursor->skip_nulls; - icursor = icursor->next; - if (icursor == NULL) - break; - ocursor->next = palloc(sizeof(JsonbParseState)); - ocursor = ocursor->next; - } - ocursor->next = NULL; - - return result; -} static Datum jsonb_agg_transfn_worker(FunctionCallInfo fcinfo, bool absent_on_null) { - MemoryContext oldcontext, - aggcontext; + MemoryContext aggcontext; JsonbAggState *state; - JsonbInState elem; Datum val; JsonbInState *result; - bool single_scalar = false; - JsonbIterator *it; - Jsonb *jbelem; - JsonbValue v; - JsonbIteratorToken type; if (!AggCheckCallContext(fcinfo, &aggcontext)) { @@ -1519,12 +1480,10 @@ jsonb_agg_transfn_worker(FunctionCallInfo fcinfo, bool absent_on_null) (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("could not determine input data type"))); - oldcontext = MemoryContextSwitchTo(aggcontext); - state = palloc(sizeof(JsonbAggState)); - result = palloc0(sizeof(JsonbInState)); - state->res = result; + state = MemoryContextAllocZero(aggcontext, sizeof(JsonbAggState)); + result = &state->pstate; + result->outcontext = aggcontext; pushJsonbValue(result, WJB_BEGIN_ARRAY, NULL); - MemoryContextSwitchTo(oldcontext); json_categorize_type(arg_type, true, &state->val_category, &state->val_output_func); @@ -1532,74 +1491,23 @@ jsonb_agg_transfn_worker(FunctionCallInfo fcinfo, bool absent_on_null) else { state = (JsonbAggState *) PG_GETARG_POINTER(0); - result = state->res; + result = &state->pstate; } if (absent_on_null && PG_ARGISNULL(1)) PG_RETURN_POINTER(state); - /* turn the argument into jsonb in the normal function context */ - + /* + * We run this code in the normal function context, so that we don't leak + * any cruft from datatype output functions and such into the aggcontext. + * But the "result" JsonbValue will be constructed in aggcontext, so that + * it remains available across calls. + */ val = PG_ARGISNULL(1) ? (Datum) 0 : PG_GETARG_DATUM(1); - memset(&elem, 0, sizeof(JsonbInState)); - - datum_to_jsonb_internal(val, PG_ARGISNULL(1), &elem, state->val_category, + datum_to_jsonb_internal(val, PG_ARGISNULL(1), result, state->val_category, state->val_output_func, false); - jbelem = JsonbValueToJsonb(elem.result); - - /* switch to the aggregate context for accumulation operations */ - - oldcontext = MemoryContextSwitchTo(aggcontext); - - it = JsonbIteratorInit(&jbelem->root); - - while ((type = JsonbIteratorNext(&it, &v, false)) != WJB_DONE) - { - switch (type) - { - case WJB_BEGIN_ARRAY: - if (v.val.array.rawScalar) - single_scalar = true; - else - pushJsonbValue(result, type, NULL); - break; - case WJB_END_ARRAY: - if (!single_scalar) - pushJsonbValue(result, type, NULL); - break; - case WJB_BEGIN_OBJECT: - case WJB_END_OBJECT: - pushJsonbValue(result, type, NULL); - break; - case WJB_ELEM: - case WJB_KEY: - case WJB_VALUE: - if (v.type == jbvString) - { - /* copy string values in the aggregate context */ - char *buf = palloc(v.val.string.len + 1); - - snprintf(buf, v.val.string.len + 1, "%s", v.val.string.val); - v.val.string.val = buf; - } - else if (v.type == jbvNumeric) - { - /* same for numeric */ - v.val.numeric = - DatumGetNumeric(DirectFunctionCall1(numeric_uplus, - NumericGetDatum(v.val.numeric))); - } - pushJsonbValue(result, type, &v); - break; - default: - elog(ERROR, "unknown jsonb iterator token type"); - } - } - - MemoryContextSwitchTo(oldcontext); - PG_RETURN_POINTER(state); } @@ -1637,17 +1545,18 @@ jsonb_agg_finalfn(PG_FUNCTION_ARGS) arg = (JsonbAggState *) PG_GETARG_POINTER(0); /* - * We need to do a shallow clone of the argument in case the final - * function is called more than once, so we avoid changing the argument. A - * shallow clone is sufficient as we aren't going to change any of the - * values, just add the final array end marker. + * The final function can be called more than once, so we must not change + * the stored JsonbValue data structure. Fortunately, the WJB_END_ARRAY + * action will only change fields in the JsonbInState struct itself, so we + * can simply invoke pushJsonbValue on a local copy of that. */ - memset(&result, 0, sizeof(JsonbInState)); - - result.parseState = clone_parse_state(arg->res->parseState); + result = arg->pstate; pushJsonbValue(&result, WJB_END_ARRAY, NULL); + /* We expect result.parseState == NULL after closing the array */ + Assert(result.parseState == NULL); + out = JsonbValueToJsonb(result.result); PG_RETURN_POINTER(out); @@ -1657,18 +1566,10 @@ static Datum jsonb_object_agg_transfn_worker(FunctionCallInfo fcinfo, bool absent_on_null, bool unique_keys) { - MemoryContext oldcontext, - aggcontext; - JsonbInState elem; + MemoryContext aggcontext; JsonbAggState *state; Datum val; JsonbInState *result; - bool single_scalar; - JsonbIterator *it; - Jsonb *jbkey, - *jbval; - JsonbValue v; - JsonbIteratorToken type; bool skip; if (!AggCheckCallContext(fcinfo, &aggcontext)) @@ -1683,16 +1584,13 @@ jsonb_object_agg_transfn_worker(FunctionCallInfo fcinfo, { Oid arg_type; - oldcontext = MemoryContextSwitchTo(aggcontext); - state = palloc(sizeof(JsonbAggState)); - result = palloc0(sizeof(JsonbInState)); - state->res = result; + state = MemoryContextAllocZero(aggcontext, sizeof(JsonbAggState)); + result = &state->pstate; + result->outcontext = aggcontext; pushJsonbValue(result, WJB_BEGIN_OBJECT, NULL); result->parseState->unique_keys = unique_keys; result->parseState->skip_nulls = absent_on_null; - MemoryContextSwitchTo(oldcontext); - arg_type = get_fn_expr_argtype(fcinfo->flinfo, 1); if (arg_type == InvalidOid) @@ -1716,11 +1614,9 @@ jsonb_object_agg_transfn_worker(FunctionCallInfo fcinfo, else { state = (JsonbAggState *) PG_GETARG_POINTER(0); - result = state->res; + result = &state->pstate; } - /* turn the argument into jsonb in the normal function context */ - if (PG_ARGISNULL(1)) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), @@ -1735,133 +1631,22 @@ jsonb_object_agg_transfn_worker(FunctionCallInfo fcinfo, if (skip && !unique_keys) PG_RETURN_POINTER(state); + /* + * We run this code in the normal function context, so that we don't leak + * any cruft from datatype output functions and such into the aggcontext. + * But the "result" JsonbValue will be constructed in aggcontext, so that + * it remains available across calls. + */ val = PG_GETARG_DATUM(1); - memset(&elem, 0, sizeof(JsonbInState)); - - datum_to_jsonb_internal(val, false, &elem, state->key_category, + datum_to_jsonb_internal(val, false, result, state->key_category, state->key_output_func, true); - jbkey = JsonbValueToJsonb(elem.result); - val = PG_ARGISNULL(2) ? (Datum) 0 : PG_GETARG_DATUM(2); - memset(&elem, 0, sizeof(JsonbInState)); - - datum_to_jsonb_internal(val, PG_ARGISNULL(2), &elem, state->val_category, + datum_to_jsonb_internal(val, PG_ARGISNULL(2), result, state->val_category, state->val_output_func, false); - jbval = JsonbValueToJsonb(elem.result); - - it = JsonbIteratorInit(&jbkey->root); - - /* switch to the aggregate context for accumulation operations */ - - oldcontext = MemoryContextSwitchTo(aggcontext); - - /* - * keys should be scalar, and we should have already checked for that - * above when calling datum_to_jsonb, so we only need to look for these - * things. - */ - - while ((type = JsonbIteratorNext(&it, &v, false)) != WJB_DONE) - { - switch (type) - { - case WJB_BEGIN_ARRAY: - if (!v.val.array.rawScalar) - elog(ERROR, "unexpected structure for key"); - break; - case WJB_ELEM: - if (v.type == jbvString) - { - /* copy string values in the aggregate context */ - char *buf = palloc(v.val.string.len + 1); - - snprintf(buf, v.val.string.len + 1, "%s", v.val.string.val); - v.val.string.val = buf; - } - else - { - ereport(ERROR, - (errcode(ERRCODE_INVALID_PARAMETER_VALUE), - errmsg("object keys must be strings"))); - } - pushJsonbValue(result, WJB_KEY, &v); - - if (skip) - { - v.type = jbvNull; - pushJsonbValue(result, WJB_VALUE, &v); - MemoryContextSwitchTo(oldcontext); - PG_RETURN_POINTER(state); - } - - break; - case WJB_END_ARRAY: - break; - default: - elog(ERROR, "unexpected structure for key"); - break; - } - } - - it = JsonbIteratorInit(&jbval->root); - - single_scalar = false; - - /* - * values can be anything, including structured and null, so we treat them - * as in json_agg_transfn, except that single scalars are always pushed as - * WJB_VALUE items. - */ - - while ((type = JsonbIteratorNext(&it, &v, false)) != WJB_DONE) - { - switch (type) - { - case WJB_BEGIN_ARRAY: - if (v.val.array.rawScalar) - single_scalar = true; - else - pushJsonbValue(result, type, NULL); - break; - case WJB_END_ARRAY: - if (!single_scalar) - pushJsonbValue(result, type, NULL); - break; - case WJB_BEGIN_OBJECT: - case WJB_END_OBJECT: - pushJsonbValue(result, type, NULL); - break; - case WJB_ELEM: - case WJB_KEY: - case WJB_VALUE: - if (v.type == jbvString) - { - /* copy string values in the aggregate context */ - char *buf = palloc(v.val.string.len + 1); - - snprintf(buf, v.val.string.len + 1, "%s", v.val.string.val); - v.val.string.val = buf; - } - else if (v.type == jbvNumeric) - { - /* same for numeric */ - v.val.numeric = - DatumGetNumeric(DirectFunctionCall1(numeric_uplus, - NumericGetDatum(v.val.numeric))); - } - pushJsonbValue(result, single_scalar ? WJB_VALUE : type, &v); - break; - default: - elog(ERROR, "unknown jsonb iterator token type"); - } - } - - MemoryContextSwitchTo(oldcontext); - PG_RETURN_POINTER(state); } @@ -1918,18 +1703,20 @@ jsonb_object_agg_finalfn(PG_FUNCTION_ARGS) arg = (JsonbAggState *) PG_GETARG_POINTER(0); /* - * We need to do a shallow clone of the argument's res field in case the - * final function is called more than once, so we avoid changing the - * aggregate state value. A shallow clone is sufficient as we aren't - * going to change any of the values, just add the final object end - * marker. + * The final function can be called more than once, so we must not change + * the stored JsonbValue data structure. Fortunately, the WJB_END_OBJECT + * action will only destructively change fields in the JsonbInState struct + * itself, so we can simply invoke pushJsonbValue on a local copy of that. + * (This technique results in running uniqueifyJsonbObject each time, but + * for now we won't bother trying to avoid that.) */ - memset(&result, 0, sizeof(JsonbInState)); - - result.parseState = clone_parse_state(arg->res->parseState); + result = arg->pstate; pushJsonbValue(&result, WJB_END_OBJECT, NULL); + /* We expect result.parseState == NULL after closing the object */ + Assert(result.parseState == NULL); + out = JsonbValueToJsonb(result.result); PG_RETURN_POINTER(out); -- 2.43.5 From 5c5948f61a156f3024bd76ee0ec6c2300b64e3f7 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Tue, 22 Jul 2025 11:11:07 -0400 Subject: [PATCH v1 3/3] Micro-optimize datatype conversions in datum_to_jsonb_internal. The general case for converting to a JSONB numeric value is to run the source datatype's output function and then numeric_in, but we can do substantially better than that for integer and numeric source values. This patch improves the speed of jsonb_agg by 30% for integer input, and nearly 2X for numeric input. Sadly, the obvious idea of using float4_numeric and float8_numeric to speed up those cases doesn't work: they are actually slower than the generic coerce-via-I/O method, and not by a small amount. They might round off differently than this code has historically done, too. Leave that alone pending possible changes in those functions. We can also do better than the existing code for text/varchar/bpchar source data; this optimization is similar to one that already exists in the json_agg() code. That saves 20% or so for such inputs. Also make a couple of other minor improvements, such as not giving JSONTYPE_CAST its own special case outside the switch when it could perfectly well be handled inside, and not using dubious string hacking to detect infinity and NaN results. --- src/backend/utils/adt/jsonb.c | 115 ++++++++++++++++++++++++---------- 1 file changed, 81 insertions(+), 34 deletions(-) diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c index abe0d620566..0c964052958 100644 --- a/src/backend/utils/adt/jsonb.c +++ b/src/backend/utils/adt/jsonb.c @@ -19,6 +19,7 @@ #include "libpq/pqformat.h" #include "miscadmin.h" #include "utils/builtins.h" +#include "utils/fmgroids.h" #include "utils/json.h" #include "utils/jsonb.h" #include "utils/jsonfuncs.h" @@ -631,7 +632,8 @@ datum_to_jsonb_internal(Datum val, bool is_null, JsonbInState *result, bool key_scalar) { char *outputstr; - bool numeric_error; + Numeric numeric_val; + bool numeric_to_string; JsonbValue jb; bool scalar_jsonb = false; @@ -656,9 +658,6 @@ datum_to_jsonb_internal(Datum val, bool is_null, JsonbInState *result, } else { - if (tcategory == JSONTYPE_CAST) - val = OidFunctionCall1(outfuncoid, val); - switch (tcategory) { case JSONTYPE_ARRAY: @@ -682,41 +681,73 @@ datum_to_jsonb_internal(Datum val, bool is_null, JsonbInState *result, } break; case JSONTYPE_NUMERIC: - outputstr = OidOutputFunctionCall(outfuncoid, val); if (key_scalar) { - /* always quote keys */ + /* always stringify keys */ + numeric_to_string = true; + numeric_val = NULL; /* pacify stupider compilers */ + } + else + { + Datum numd; + + switch (outfuncoid) + { + case F_NUMERIC_OUT: + numeric_val = DatumGetNumeric(val); + break; + case F_INT2OUT: + numeric_val = int64_to_numeric(DatumGetInt16(val)); + break; + case F_INT4OUT: + numeric_val = int64_to_numeric(DatumGetInt32(val)); + break; + case F_INT8OUT: + numeric_val = int64_to_numeric(DatumGetInt64(val)); + break; +#ifdef NOT_USED + + /* + * Ideally we'd short-circuit these two cases + * using float[48]_numeric. However, those + * functions are currently slower than the generic + * coerce-via-I/O approach. And they may round + * off differently. Until/unless that gets fixed, + * continue to use coerce-via-I/O for floats. + */ + case F_FLOAT4OUT: + numd = DirectFunctionCall1(float4_numeric, val); + numeric_val = DatumGetNumeric(numd); + break; + case F_FLOAT8OUT: + numd = DirectFunctionCall1(float8_numeric, val); + numeric_val = DatumGetNumeric(numd); + break; +#endif + default: + outputstr = OidOutputFunctionCall(outfuncoid, val); + numd = DirectFunctionCall3(numeric_in, + CStringGetDatum(outputstr), + ObjectIdGetDatum(InvalidOid), + Int32GetDatum(-1)); + numeric_val = DatumGetNumeric(numd); + break; + } + /* Must convert to string if it's Inf or NaN */ + numeric_to_string = (numeric_is_inf(numeric_val) || + numeric_is_nan(numeric_val)); + } + if (numeric_to_string) + { + outputstr = OidOutputFunctionCall(outfuncoid, val); jb.type = jbvString; jb.val.string.len = strlen(outputstr); jb.val.string.val = outputstr; } else { - /* - * Make it numeric if it's a valid JSON number, otherwise - * a string. Invalid numeric output will always have an - * 'N' or 'n' in it (I think). - */ - numeric_error = (strchr(outputstr, 'N') != NULL || - strchr(outputstr, 'n') != NULL); - if (!numeric_error) - { - Datum numd; - - jb.type = jbvNumeric; - numd = DirectFunctionCall3(numeric_in, - CStringGetDatum(outputstr), - ObjectIdGetDatum(InvalidOid), - Int32GetDatum(-1)); - jb.val.numeric = DatumGetNumeric(numd); - pfree(outputstr); - } - else - { - jb.type = jbvString; - jb.val.string.len = strlen(outputstr); - jb.val.string.val = outputstr; - } + jb.type = jbvNumeric; + jb.val.numeric = numeric_val; } break; case JSONTYPE_DATE: @@ -738,6 +769,9 @@ datum_to_jsonb_internal(Datum val, bool is_null, JsonbInState *result, jb.val.string.len = strlen(jb.val.string.val); break; case JSONTYPE_CAST: + /* cast to JSON, and then process as JSON */ + val = OidFunctionCall1(outfuncoid, val); + /* FALL THROUGH */ case JSONTYPE_JSON: { /* parse the json right into the existing result object */ @@ -793,11 +827,24 @@ datum_to_jsonb_internal(Datum val, bool is_null, JsonbInState *result, } break; default: - outputstr = OidOutputFunctionCall(outfuncoid, val); + /* special-case text types to save useless palloc/memcpy ops */ + if (outfuncoid == F_TEXTOUT || + outfuncoid == F_VARCHAROUT || + outfuncoid == F_BPCHAROUT) + { + text *txt = DatumGetTextPP(val); + + jb.val.string.len = VARSIZE_ANY_EXHDR(txt); + jb.val.string.val = VARDATA_ANY(txt); + } + else + { + outputstr = OidOutputFunctionCall(outfuncoid, val); + jb.val.string.len = strlen(outputstr); + jb.val.string.val = outputstr; + } jb.type = jbvString; - jb.val.string.len = strlen(outputstr); (void) checkStringLen(jb.val.string.len, NULL); - jb.val.string.val = outputstr; break; } } -- 2.43.5 drop table if exists data_table; create table data_table (f1 text); insert into data_table select x::text from generate_series(1,10000) x; vacuum analyze data_table; \timing on do $$ begin for i in 1..10000 loop perform json_agg(f1) from data_table; end loop; end$$; do $$ begin for i in 1..10000 loop perform jsonb_agg(f1) from data_table; end loop; end$$;
pgsql-hackers by date: