Re: Extract numeric filed in JSONB more effectively - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Extract numeric filed in JSONB more effectively
Date
Msg-id CAKU4AWpM=C5NxumoHQB1=id0OBc63kUgMUWyRNO2=OrHj6ROyQ@mail.gmail.com
Whole thread Raw
In response to Re: Extract numeric filed in JSONB more effectively  (Chapman Flack <chap@anastigmatix.net>)
Responses Re: Extract numeric filed in JSONB more effectively
List pgsql-hackers
Hi Chap,

The v11 attached, mainly changes are:
1.  use the jsonb_xx_start and jsonb_finish_numeric style. 
2.  improve the test case a bit.  

It doesn't include:
1.  the jsonb_finish_text function, since we have a operator ->> for text
already and the performance for it is OK and there is no cast entry for
jsonb to text. 
2.  the jsonb_finish_jsonb since I can't see a clear user case for now. 
Rewriting jsonb_object_field with 2 DirectFunctionCall looks not pretty
reasonable as we paid 2 DirectFunctionCall overhead to reduce ~10 lines
code duplication. 


An incompatible issue at error message level is found during test: 
create table jb(a jsonb);
insert into jb select '{"a": "a"}'::jsonb;
select (a->'a')::int4 from jb;

master:   ERROR:  cannot cast jsonb string to type integer
patch:  ERROR:  cannot cast jsonb string to type numeric

That's mainly because we first extract the field to numeric and
then cast it to int4 and the error raised at the first step and it
doesn't know the final type.  One way to fix it is adding a 2nd
argument for jsonb_finish_numeric for the real type, but
it looks weird and more suggestions on this would be good. 

Performance comparison between v10 and v11. 

create table tb (a jsonb);
insert into tb select '{"a": 1}'::jsonb from generate_series(1, 100000)i;
select 1 from tb where (a->'a')::int2 = 2;   (pgbench 5 times)

v11:  16.273 ms 
v10:  15.986 ms
master: 32.530ms

So I think the performance would not be an issue.  


I noticed there is another patch registered in this CF: [1]
It adds new operations within jsonpath like .bigint .time
and so on.

I was wondering whether that work would be conflicting or
complementary with this. It looks to be complementary. The
operations being added there are within jsonpath evaluation.
Here we are working on faster ways to get those results out.

It does not seem that [1] will add any new choices in
JsonbValue. All of its (.bigint .integer .number) seem to
verify the requested form and then put the result as a
numeric in ->val.numeric. So that doesn't add any new
cases for this patch to handle. (Too bad, in a way: if that
other patch added ->val.bigint, this patch could add a case
to retrieve that value without going through the work of
making a numeric. But that would complicate other things
touching JsonbValue, and be a matter for that other patch.)

It may be expanding the choices for what we might one day
find in ->val.datetime though.

Thanks for this information. I tried the  jsonb_xx_start and
jsonb_finish_numeric style, and it looks like a good experience 
and it may not make things too complicated even if the above 
things happen IMO. 

Any feedback is welcome. 

--
Best Regards
Andy Fan
Attachment

pgsql-hackers by date:

Previous
From: Jelte Fennema
Date:
Subject: Re: Allow specifying a dbname in pg_basebackup connection string
Next
From: Jelte Fennema
Date:
Subject: pg_basebackup: Always return valid temporary slot names