Re: Is it possible to index "deep" into a JSONB column? - Mailing list pgsql-general
From | Shaheed Haque |
---|---|
Subject | Re: Is it possible to index "deep" into a JSONB column? |
Date | |
Msg-id | CAHAc2jdnrMe0N6hTjM2tzWR9C2m4MiqpdqJ0P36Bjm-6WCAxtA@mail.gmail.com Whole thread Raw |
In response to | Re: Is it possible to index "deep" into a JSONB column? (Imre Samu <pella.samu@gmail.com>) |
Responses |
Re: Is it possible to index "deep" into a JSONB column?
|
List | pgsql-general |
Imre, I'm gradually working my way into the combination of SQL, JSONB and jsonpath that this involves even without the indexing, but this looks very helpful/promising, especially on the indexing. Thanks a lot for the input, Shaheed On Mon, 30 May 2022 at 11:50, Imre Samu <pella.samu@gmail.com> wrote: > > > In principle, I believe this allows index-assisted access to keys and > > values nested in arrays and inner objects but in practice, it seems > > the planner "often" decides to ignore the index in favour of a table scan. > > part II. index usage ; see the "Bitmap Index Scan on jpqarr_idx" > > SET enable_seqscan = OFF; > select * from jsonb_table > where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @> to_jsonb('1971-02-02'::TEXT); > ; > +----+---------------------------------------------------------------------------------------------------------------+ > | id | jsonb_col | > +----+---------------------------------------------------------------------------------------------------------------+ > | 2 | {"stuff": {}, "employee": {"7011": {"date_of_birth": "1970-01-01"}, "7122": {"date_of_birth": "1971-02-02"}}} | > +----+---------------------------------------------------------------------------------------------------------------+ > (1 row) > > > EXPLAIN ANALYZE select * from jsonb_table > where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @> to_jsonb('1971-02-02'::TEXT); > ; > +---------------------------------------------------------------------------------------------------------------------------------------------------------+ > | QUERY PLAN | > +---------------------------------------------------------------------------------------------------------------------------------------------------------+ > | Bitmap Heap Scan on jsonb_table (cost=3.00..4.52 rows=1 width=36) (actual time=0.056..0.059 rows=1 loops=1) | > | Recheck Cond: (jsonb_path_query_array(jsonb_col, '$."employee".*."date_of_birth"'::jsonpath, '{}'::jsonb, false) @>to_jsonb('1971-02-02'::text)) | > | Heap Blocks: exact=1 | > | -> Bitmap Index Scan on jpqarr_idx (cost=0.00..3.00 rows=1 width=0) (actual time=0.026..0.027 rows=1 loops=1) | > | Index Cond: (jsonb_path_query_array(jsonb_col, '$."employee".*."date_of_birth"'::jsonpath, '{}'::jsonb, false)@> to_jsonb('1971-02-02'::text)) | > | Planning Time: 0.255 ms | > | Execution Time: 0.122 ms | > +---------------------------------------------------------------------------------------------------------------------------------------------------------+ > (7 rows) > > regards, > Imre > > > Imre Samu <pella.samu@gmail.com> ezt írta (időpont: 2022. máj. 30., H, 12:30): >> >> Hi Shaheed, >> >> > WHAT GOES HERE >> >> imho check the: jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' ) >> >> may example: >> >> CREATE TABLE jsonb_table ( >> id serial primary key, >> jsonb_col JSONB >> ); >> >> INSERT INTO jsonb_table(jsonb_col) >> VALUES >> ('{"stuff": {},"employee": {"8011": {"date_of_birth": "1980-01-01"},"8222": {"date_of_birth": "1982-02-02"}}}'), >> ('{"stuff": {},"employee": {"7011": {"date_of_birth": "1970-01-01"},"7122": {"date_of_birth": "1971-02-02"}}}'), >> ('{"stuff": {},"employee": {"a12": {"date_of_birth": "2000-01-01"},"b56": {"date_of_birth": "2000-02-02"}}}') >> ; >> >> select jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' ) from jsonb_table; >> >> -- create index >> create index jpqarr_idx >> on jsonb_table >> using gin ( jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' ) ); >> -- tests: >> select id from jsonb_table >> where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @> to_jsonb('2000-01-01'::TEXT); >> ; >> DROP TABLE >> CREATE TABLE >> INSERT 0 3 >> +------------------------------+ >> | jsonb_path_query_array | >> +------------------------------+ >> | ["1980-01-01", "1982-02-02"] | >> | ["1970-01-01", "1971-02-02"] | >> | ["2000-01-01", "2000-02-02"] | >> +------------------------------+ >> (3 rows) >> >> CREATE INDEX >> +----+ >> | id | >> +----+ >> | 3 | >> +----+ >> (1 row) >> >> >> Regards, >> Imre >> >> Shaheed Haque <shaheedhaque@gmail.com> ezt írta (időpont: 2022. máj. 29., V, 22:53): >>> >>> Suppose I have a JSONB field called "snapshot". I can create a GIN >>> index on it like this: >>> >>> create index idx1 on mytable using gin (snapshot); >>> >>> In principle, I believe this allows index-assisted access to keys and >>> values nested in arrays and inner objects but in practice, it seems >>> the planner "often" decides to ignore the index in favour of a table >>> scan. (As discussed elsewhere, this is influenced by the number of >>> rows, and possibly other criteria too). >>> >>> Now, I know it is possible to index inner objects, so that is snapshot >>> looks like this: >>> >>> { >>> "stuff": {}, >>> "more other stuff": {}, >>> "employee": { >>> "1234": {"date_of_birth": "1970-01-01"}, >>> "56B789": {"date_of_birth": "1971-02-02"}, >>> } >>> } >>> >>> I can say: >>> >>> create index idx2 on mytable using gin ((snapshot -> 'employee')); >>> >>> But what is the syntax to index only on date_of_birth? I assume a >>> btree would work since it is a primitive value, but WHAT GOES HERE in >>> this: >>> >>> create index idx3 on mytable using btree ((snapshot ->'employee' -> >>> WHAT GOES HERE -> 'date_of_birth')); >>> >>> I believe an asterisk "*" would work if 'employee' was an array, but >>> here it is nested object with keys. If it helps, the keys are >>> invariably numbers (in quoted string form, as per JSON). >>> >>> Thanks, Shaheed >>> >>>
pgsql-general by date: