JSON path - Mailing list pgsql-performance

From Jesper Pedersen
Subject JSON path
Date
Msg-id 71a1d71e-5263-2980-117c-066a37e481bc@redhat.com
Whole thread Raw
Responses Re: JSON path  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Hi,

We have a table which has a jsonb column in it. Each row contains a lot
of data in that column, so TOASTed.

We have to extract data from that column at different levels, so an
example query could look like

select
   col1,
   col2,
jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5.val1."text()"')
as val1,
jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5.val2."text()"')
as val2,
jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5.val3."text()"')
as val3
from tbl
where
   id = 1;

I tried to rewrite it to

WITH foo AS (select
   id,
   col1,
   col2,
jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5') as jsondata,
   from tbl )
select
   col1,
   col2,
   jsondata->val1->'text()' as val1,
   jsondata->val2->'text()' as val2,
   jsondata->val3->'text()' as val3
from foo
where
   id = 1;

However, WITH has the same run-time profile - most of the time is spent
in pglz_decompress. Using the -> notation has the same profile.

The more data I extract from the JSON object the slower the query gets.

Of course, if I change the column to EXTERNAL we see a ~3.5 x speedup in
the queries but disk space requirements goes up by too much.

(We need to use a jsonb column as the data is unstructured, and may
differ in structure between rows. Yes, yes, I know...)

PostgreSQL 12.x on RHEL.

If anybody has some good ideas it would be appreciated.

Thanks in advance !

Best regards,
  Jesper




pgsql-performance by date:

Previous
From: Luís Roberto Weck
Date:
Subject: Re: Parallel Query
Next
From: Tom Lane
Date:
Subject: Re: JSON path