Re: Performance of JSON type in postgres - Mailing list pgsql-general

From: veem v
Subject: Re: Performance of JSON type in postgres
Msg-id: CAB+=1TWwerQMVQvOP0oAVunTUz8tnX425Jn-84DBYbOD+us9Eg@mail.gmail.com
In response to: Re: Performance of JSON type in postgres (Merlin Moncure <mmoncure@gmail.com>)
Responses: Re: Performance of JSON type in postgres
List: pgsql-general

On Tue, 15 Jul 2025 at 23:02, Merlin Moncure <mmoncure@gmail.com> wrote:
On Mon, Jul 14, 2025 at 2:01 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Jul 14, 2025 at 12:54 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 7/14/25 12:51, veem v wrote:
> So I want to understand the experts' opinion on this, which I believe will be
> crucial during design itself.

It is spelled out here:

https://www.postgresql.org/docs/current/datatype-json.html


I've taken to heart the main takeaway from that page:

"In general, most applications should prefer to store JSON data as jsonb, unless there are quite specialized needs, such as legacy assumptions about ordering of object keys."

I don't think the documentation is accurate at all, unless one of those specialized needs is to 'be faster'. json serialization is more than 2x faster based on simple testing (see below). This is absolutely not a trivial difference.

I would say, use json for serialization, use jsonb for data storage, unless the precise structure of the input document is important.

merlin

leaselock_iam@leaselock_prod=> explain analyze select json_agg(l) from ( select l from llcore.lease l limit 10000) q;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=405.52..405.53 rows=1 width=32) (actual time=69.043..69.048 rows=1 loops=1)
   ->  Limit  (cost=0.00..380.52 rows=10000 width=247) (actual time=0.017..9.764 rows=10000 loops=1)
         ->  Seq Scan on lease l  (cost=0.00..100383.89 rows=2638089 width=247) (actual time=0.016..8.831 rows=10000 loops=1)
 Planning Time: 0.109 ms
 Execution Time: 69.088 ms
(5 rows)

Time: 160.560 ms
leaselock_iam@leaselock_prod=> explain analyze select jsonb_agg(l) from ( select l from llcore.lease l limit 10000) q;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=405.52..405.53 rows=1 width=32) (actual time=146.139..146.141 rows=1 loops=1)
   ->  Limit  (cost=0.00..380.52 rows=10000 width=247) (actual time=0.017..20.837 rows=10000 loops=1)
         ->  Seq Scan on lease l  (cost=0.00..100383.89 rows=2638089 width=247) (actual time=0.016..19.975 rows=10000 loops=1)
 Planning Time: 0.108 ms
 Execution Time: 152.277 ms
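
For the read side of the same trade-off, a minimal sketch of the inverse comparison (the doc_test table, its columns, and the sample documents are illustrative, not from this thread): json must be reparsed from its stored text on every field access, while jsonb is stored pre-parsed.

-- Hypothetical setup: the same document stored once as json and once as jsonb.
CREATE TABLE doc_test (id bigint, j json, jb jsonb);

INSERT INTO doc_test
SELECT g,
       format('{"lease": {"id": %s, "status": "active"}}', g)::json,
       format('{"lease": {"id": %s, "status": "active"}}', g)::jsonb
FROM generate_series(1, 100000) g;

-- json reparses the stored text on every access; jsonb reads its binary form.
EXPLAIN ANALYZE SELECT count(*) FROM doc_test WHERE j  -> 'lease' ->> 'status' = 'active';
EXPLAIN ANALYZE SELECT count(*) FROM doc_test WHERE jb -> 'lease' ->> 'status' = 'active';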
 


Thank you.

I tested with some sample data. Loading (serialization) seems about twice as slow with jsonb as with json, whereas storage looks more efficient with jsonb, and reading nested fields is 7-8 times slower with json than with jsonb (and of course the index support makes jsonb the better choice there). I hope I am testing this correctly.
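
The storage and index parts of such a test might look like the following sketch, continuing the hypothetical doc_test table from above (jsonb has GIN operator-class support; json does not):

-- On-disk size of the same document in both representations.
SELECT pg_column_size(j) AS json_bytes, pg_column_size(jb) AS jsonb_bytes
FROM doc_test
LIMIT 1;

-- A GIN index on the jsonb column supports containment (@>) and
-- key-existence (?) queries; there is no comparable index support for json.
CREATE INDEX doc_test_jb_gin ON doc_test USING gin (jb);

EXPLAIN ANALYZE
SELECT count(*) FROM doc_test WHERE jb @> '{"lease": {"id": 42}}';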
So I am a bit confused here. Also, one of our use cases is that, along with persisting this data and querying it in the Postgres database, we will also be moving it from Postgres (the upstream OLTP system) to a downstream OLAP system on Snowflake, which has data types like VARIANT and VARCHAR. So, will it make a significant difference whether we store it as json or jsonb in Postgres, i.e. the source/upstream database?
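
One concrete difference that does matter for a downstream export, and is easy to verify in psql: json stores the input text verbatim, while jsonb normalizes it on input, so the text shipped to Snowflake may not match the original byte-for-byte:

-- json preserves the input exactly: whitespace, key order, duplicate keys.
SELECT '{"b": 1, "a": 2, "a": 3}'::json;
--> {"b": 1, "a": 2, "a": 3}

-- jsonb normalizes: the duplicate key collapses to its last value, keys are reordered.
SELECT '{"b": 1, "a": 2, "a": 3}'::jsonb;
--> {"a": 3, "b": 1}

Semantically the two documents are equivalent, so a VARIANT column should accept either; the difference only matters if a consumer depends on the exact original text.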
