Re: [HACKERS] [PATCH] Generic type subscripting - Mailing list pgsql-hackers
From | Dmitry Dolgov |
---|---|
Subject | Re: [HACKERS] [PATCH] Generic type subscripting |
Date | |
Msg-id | 20191110123208.r4kn5k34wsey5zik@localhost Whole thread Raw |
In response to | Re: [HACKERS] [PATCH] Generic type subscripting (Nikita Glukhov <n.gluhov@postgrespro.ru>) |
Responses |
Re: [HACKERS] [PATCH] Generic type subscripting
|
List | pgsql-hackers |
> On Thu, Oct 31, 2019 at 05:35:28AM +0300, Nikita Glukhov wrote: > Hi. I added new 5th patch to this patch set. Thank you! > Performance was compared on 4 tables with 10M rows: > > -- [ i ] > CREATE TABLE arr_1 AS > SELECT jsonb_build_array(i)::jsonb js > FROM generate_series(1, 10000000) i; > > -- { "a": i } > CREATE TABLE obj_1 AS > SELECT jsonb_build_object('a', i) js > FROM generate_series(1, 10000000) i; > > -- [[[[[[[[[[ i ]]]]]]]]]] > CREATE TABLE arr_10 AS > SELECT (repeat('[', 10) || i || repeat(']', 10))::jsonb js > FROM generate_series(1, 10000000) i; > > -- {"a": {"a": ... {"a": {"a": i } } ... } } > CREATE TABLE obj_10 AS > SELECT (repeat('{"a":', 10) || i || repeat('}', 10))::jsonb js > FROM generate_series(1, 10000000) i; > > Queries were like "SELECT FROM table WHERE expression IS [NOT] NULL". > > Compared previous v27 version (4 patches) with v28 version (5 patches). > New patch #5 contains one small but important optimization -- elimination of > unnecessary palloc() in getIthJsonbValueFromContainer() and jsonb_get_element(). > It should be posted separately, but for simplicity I included it the patch now. > For the correctness of comparison, it was evaluated separately on top of v27 > (v27opt). > > > Table | Expression | Query time, ms > | | v27 | v27opt| v28 > --------+--------------------------------------+-------+-------+------- > arr_1 | js->0 | 1811 | 1809 | 1813 > arr_1 | js[0] | 2273 | 2294 | 2028 > arr_1 | js['0'] | 2276 | 2286 | 2339 > arr_1 | js->1 | 808 | 844 | 809 > arr_1 | js[1] | 1180 | 1187 | 1008 > obj_1 | js->'a' | 1941 | 1935 | 1939 > obj_1 | js['a'] | 2079 | 2083 | 2102 > obj_1 | js->'b' | 917 | 915 | 902 > obj_1 | js['b'] | 960 | 961 | 1059 > | > arr_10 | js->0->0 ... ->0->0 | 4530 | 4068 | 4052 > arr_10 | js[0][0] ... [0][0] | 6197 | 5513 | 3766 > arr_10 | js['0']['0'] ... ['0']['0'] | 6202 | 5519 | 5983 > arr_10 | js #> '{0,0,0,0,0,0,0,0,0,0}' | 6412 | 5850 | 5835 > arr_10 | js #>> '{0,0,0,0,0,0,0,0,0,0}' | 5904 | 5181 | 5192 > > obj_10 | js->'a'->'a' ... ->'a'->'a' | 4970 | 4717 | 4704 > obj_10 | js['a']['a'] ... ['a']['a'] | 4331 | 3698 | 4032 > obj_10 | js #> '{a,a,a,a,a,a,a,a,a,a}' | 4570 | 3941 | 3949 > obj_10 | js #>> '{a,a,a,a,a,a,a,a,a,a}' | 4055 | 3395 | 3392 > > As it can be seen, array access time reduced from 10% in single subscripts > to 40% in 10-subscript chains, and subscripting event started to overtake > chained "->" operators. But there is 10% slowdown of object key access that > needs further investigation. The elimination of unnecessary palloc()s also > gives good results. I've tested 5th patch a bit and can confirm numbers in the last column for v28 (I've got similar proportions). Let's see what is the reason for 10% of slowdown for object key access. > I had to write new assignment logic reusing only some parts of setPath(), > because the loop in setPath() should be broken on every level. During this > process, I decided to implement assignment behavior similar to PostgreSQL's > array behavior and added two new features: > - creation of jsonb arrays/objects container from NULL values > - appending/prepending array elements on the specified position, gaps filled > with nulls (JavaScript has similar behavior) What is the reason for the last one?
pgsql-hackers by date: