Re: [HACKERS] [PATCH] Generic type subscripting - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: [HACKERS] [PATCH] Generic type subscripting
Date
Msg-id 20191110123208.r4kn5k34wsey5zik@localhost
Whole thread Raw
In response to Re: [HACKERS] [PATCH] Generic type subscripting  (Nikita Glukhov <n.gluhov@postgrespro.ru>)
Responses Re: [HACKERS] [PATCH] Generic type subscripting
List pgsql-hackers
> On Thu, Oct 31, 2019 at 05:35:28AM +0300, Nikita Glukhov wrote:
> Hi. I added new 5th patch to this patch set.

Thank you!

> Performance was compared on 4 tables with 10M rows:
>
> -- [ i ]
> CREATE TABLE arr_1 AS
> SELECT jsonb_build_array(i)::jsonb js
> FROM generate_series(1, 10000000) i;
>
> -- { "a": i }
> CREATE TABLE obj_1 AS
> SELECT jsonb_build_object('a', i)  js
> FROM generate_series(1, 10000000) i;
>
> -- [[[[[[[[[[ i ]]]]]]]]]]
> CREATE TABLE arr_10 AS
> SELECT (repeat('[', 10) || i || repeat(']', 10))::jsonb js
> FROM generate_series(1, 10000000) i;
>
> -- {"a": {"a": ... {"a": {"a": i } } ... } }
> CREATE TABLE obj_10 AS
> SELECT (repeat('{"a":', 10) || i || repeat('}', 10))::jsonb js
> FROM generate_series(1, 10000000) i;
>
> Queries were like "SELECT FROM table WHERE expression IS [NOT] NULL".
>
> Compared previous v27 version (4 patches) with v28 version (5 patches).
> New patch #5 contains one small but important optimization -- elimination of
> unnecessary palloc() in getIthJsonbValueFromContainer() and jsonb_get_element().
> It should be posted separately, but for simplicity I included it the patch now.
> For the correctness of comparison, it was evaluated separately on top of v27
> (v27opt).
>
>
>  Table  |               Expression             |      Query time, ms
>         |                                      |  v27  | v27opt|  v28
> --------+--------------------------------------+-------+-------+-------
>  arr_1  | js->0                                |  1811 |  1809 |  1813
>  arr_1  | js[0]                                |  2273 |  2294 |  2028
>  arr_1  | js['0']                              |  2276 |  2286 |  2339
>  arr_1  | js->1                                |   808 |   844 |   809
>  arr_1  | js[1]                                |  1180 |  1187 |  1008
>  obj_1  | js->'a'                              |  1941 |  1935 |  1939
>  obj_1  | js['a']                              |  2079 |  2083 |  2102
>  obj_1  | js->'b'                              |   917 |   915 |   902
>  obj_1  | js['b']                              |   960 |   961 |  1059
>         |
>  arr_10 | js->0->0 ... ->0->0                  |  4530 |  4068 |  4052
>  arr_10 | js[0][0] ... [0][0]                  |  6197 |  5513 |  3766
>  arr_10 | js['0']['0'] ... ['0']['0']          |  6202 |  5519 |  5983
>  arr_10 | js #>  '{0,0,0,0,0,0,0,0,0,0}'       |  6412 |  5850 |  5835
>  arr_10 | js #>> '{0,0,0,0,0,0,0,0,0,0}'       |  5904 |  5181 |  5192
>
>  obj_10 | js->'a'->'a' ... ->'a'->'a'          |  4970 |  4717 |  4704
>  obj_10 | js['a']['a'] ... ['a']['a']          |  4331 |  3698 |  4032
>  obj_10 | js #>  '{a,a,a,a,a,a,a,a,a,a}'       |  4570 |  3941 |  3949
>  obj_10 | js #>> '{a,a,a,a,a,a,a,a,a,a}'       |  4055 |  3395 |  3392
>
> As it can be seen, array access time reduced from 10% in single subscripts
> to 40% in 10-subscript chains, and subscripting event started to overtake
> chained "->" operators.  But there is 10% slowdown of object key access that
> needs further investigation.  The elimination of unnecessary palloc()s also
> gives good results.

I've tested 5th patch a bit and can confirm numbers in the last column
for v28 (I've got similar proportions). Let's see what is the reason for 10%
of slowdown for object key access.

> I had to write new assignment logic reusing only some parts of setPath(),
> because the loop in setPath() should be broken on every level.  During this
> process, I decided to implement assignment behavior similar to PostgreSQL's
> array behavior and added two new features:
>  - creation of jsonb arrays/objects container from NULL values
>  - appending/prepending array elements on the specified position, gaps filled
>    with nulls (JavaScript has similar behavior)

What is the reason for the last one?



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Ordering of header file inclusion
Next
From: Tom Lane
Date:
Subject: Re: [bug fix] Produce a crash dump before main() on Windows