Re: JSON path - Mailing list pgsql-performance

From Tom Lane
Subject Re: JSON path
Date
Msg-id 4339.1573754676@sss.pgh.pa.us
Whole thread Raw
In response to JSON path  (Jesper Pedersen <jesper.pedersen@redhat.com>)
Responses Re: JSON path  (Jesper Pedersen <jesper.pedersen@redhat.com>)
List pgsql-performance
Jesper Pedersen <jesper.pedersen@redhat.com> writes:
> We have a table which has a jsonb column in it. Each row contains a lot
> of data in that column, so TOASTed.
> We have to extract data from that column at different levels, so an
> example query could look like

> select
>    col1,
>    col2,
> jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5.val1."text()"')
> as val1,
> jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5.val2."text()"')
> as val2,
> jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5.val3."text()"')
> as val3
> from tbl
> where
>    id = 1;

Right ...

> I tried to rewrite it to

> WITH foo AS (select
>    id,
>    col1,
>    col2,
> jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5') as jsondata,
>    from tbl )
> select
>    col1,
>    col2,
>    jsondata->val1->'text()' as val1,
>    jsondata->val2->'text()' as val2,
>    jsondata->val3->'text()' as val3
> from foo
> where
>    id = 1;

This has got syntax errors, but I get the point.

> However, WITH has the same run-time profile - most of the time is spent
> in pglz_decompress. Using the -> notation has the same profile.

As of v12, that WITH will get flattened, so that you still end up
with three invocations of jsonb_path_query_first, as EXPLAIN VERBOSE
will show you.  You could write "WITH foo AS MATERIALIZED ..." to
prevent that, but then you'll need to stick the WHERE clause inside
the WITH or you'll end up running jsonb_path_query_first for every
row of tbl.

With

explain verbose WITH foo AS materialized (select
   id,
   col1,
   col2,
jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5') as jsondata
   from tbl where id = 1 )
select
   col1,
   col2,
   jsondata->'val1'->'text()' as val1,
   jsondata->'val2'->'text()' as val2,
   jsondata->'val3'->'text()' as val3
from foo;

I get a plan that does what you're looking for:

        QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on foo  (cost=24.14..24.35 rows=6 width=104)
   Output: foo.col1, foo.col2, ((foo.jsondata -> 'val1'::text) -> 'text()'::text), ((foo.jsondata -> 'val2'::text) ->
'text()'::text),((foo.jsondata -> 'val3'::text) -> 'text()'::text) 
   CTE foo
     ->  Seq Scan on public.tbl  (cost=0.00..24.14 rows=6 width=44)
           Output: tbl.id, tbl.col1, tbl.col2, jsonb_path_query_first(tbl.data,
'$."lvl1"."lvl2"."lvl3"."lvl4"."lvl5"'::jsonpath,'{}'::jsonb, false) 
           Filter: (tbl.id = 1)
(6 rows)

            regards, tom lane



pgsql-performance by date:

Previous
From: Jesper Pedersen
Date:
Subject: JSON path
Next
From: Jesper Pedersen
Date:
Subject: Re: JSON path