Optimization of some jsonb functions - Mailing list pgsql-hackers

From Nikita Glukhov
Subject Optimization of some jsonb functions
Date
Msg-id 7c417f90-f95f-247e-ba63-d95e39c0ad14@postgrespro.ru
Whole thread Raw
Responses Re: Optimization of some jsonb functions  (David Steele <david@pgmasters.net>)
List pgsql-hackers
Attached set of patches with some jsonb optimizations that were made during
comparison of performance of ordinal jsonb operators and jsonpath operators.

1. Optimize JsonbExtractScalar():  It is better to use getIthJsonbValueFromContainer(cont, 0) instead of  JsonIterator to get 0th element of raw-scalar pseudoarray.  JsonbExtractScalar() is used in jsonb casts, so they speed up a bit.

2. Optimize operator #>>, jsonb_each_text(), jsonb_array_elements_text():  These functions have direct conversion (JsonbValue => text) only for  jbvString scalars, but indirect conversion of other types of scalars   (JsonbValue => jsonb => text) is obviously too slow.  Extracted common  subroutine JsonbValueAsText() and used in all suitable places.

3. Optimize JsonbContainer type recognition in get_jsonb_path_all():  Fetching of the first token from JsonbIterator is replaced with lightweight  JsonbContainerIsXxx() macros.

4. Extract findJsonbKeyInObject():  Extracted findJsonbKeyInObject() from findJsonbValueFromContainer(),  which is slightly easier to use (key string and its length is passed instead  of filled string JsonbValue).

5. Optimize resulting value allocation in findJsonbValueFromContainer() and  getIthJsonbValueFromContainer():  Added ability to pass stack-allocated JsonbValue that will be filled with  the result of operation instead of returning unconditionally palloc()ated  JsonbValue.

Patches #4 and #5 are mostly refactorings, but they can give small speedup
(up to 5% for upcoming jsonpath operators) due to elimination of unnecessary 
palloc()s.  The whole interface of findJsonbValueFromContainer() with JB_OBJECT
and JB_ARRAY flags always seemed a bit strange to me, so I think it is worth to
have separate functions for searching keys in objects and elements in arrays.


Performance tests:- Test data for {"x": {"y": {"z": i}}}:  CREATE TABLE t AS  SELECT jsonb_build_object('x',           jsonb_build_object('y',             jsonb_build_object('z', i))) js  FROM generate_series(1, 3000000) i;
- Sample query:  EXPLAIN (ANALYZE) SELECT js -> 'x' -> 'y' -> 'z' FROM t;
- Results:                                                   |   execution time, ms                        query                      |  master  |   optimized   
-------------------------------------------------------------------------------             {"x": {"y": {"z": i}}}js #> '{x,y,z}'                                    | 1148.632 | 1005.578 -10%js #>> '{x,y,z}'                                   | 1520.160 |  849.991 -40%(js #> '{x,y,z}')::numeric                         | 1310.881 | 1067.752 -20%(js #>> '{x,y,z}')::numeric                        | 1757.179 | 1109.495 -30%
js -> 'x' -> 'y' -> 'z'                            | 1030.211 |  977.267js -> 'x' -> 'y' ->> 'z'                           |  887.101 |  838.745(js -> 'x' -> 'y' -> 'z')::numeric                 | 1184.086 | 1050.462(js -> 'x' -> 'y' -> 'z')::int4                    | 1279.315 | 1133.032(js -> 'x' -> 'y' ->> 'z')::numeric                | 1134.003 | 1100.047(js -> 'x' -> 'y' ->> 'z')::int4                   | 1077.216 |  991.995
js ? 'x'                                           |  523.111 |  495.387js ?| '{x,y,z}'                                    |  612.880 |  607.455js ?& '{x,y,z}'                                    |  674.786 |  643.987js -> 'x' -> 'y' ? 'z'                             |  712.623 |  698.588js @> '{"x": {"y": {"z": 1}}}'                     | 1154.926 | 1149.069

jsonpath:js @@ '$.x.y.z == 123'                             |  973,444 |   912,08  -5%
             {"x": i, "y": i, "z": i}jsonb_each(js)                                     | 2281.577 | 2262.660jsonb_each_text(js)                                | 2603.539 | 2112.200 -20%                  [i, i, i]jsonb_array_elements(js)                           | 1255.210 | 1205.939jsonb_array_elements(js)::numeric                  | 1662.550 | 1576.227  -5%jsonb_array_elements_text(js)                      | 1555.021 | 1067.031 -30%
js @> '1'                                          |  798.858 |  768.664  -4%js <@ '[1,2,3]'                                    |  820.795 |  785.086  -5%js <@ '[0,1,2,3,4,5,6,7,8,9]'                      | 1214.170 | 1165.289  -5%


As it can be seen, #> operators are always slower than equivalent series of ->.
I think it is caused by array deconstruction in "jsonb #> text[]".

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: [PATCH v20] GSSAPI encryption support
Next
From: Bruce Momjian
Date:
Subject: Re: boolean and bool in documentation