Re: to_jsonb performance on array aggregated correlated subqueries - Mailing list pgsql-performance

From Rick Otten
Subject Re: to_jsonb performance on array aggregated correlated subqueries
Date
Msg-id CAMAYy4KZEVsxCdz+iQEw4vOEABOD2psjnPAGQEYHg_6EMkQ8VA@mail.gmail.com
Whole thread Raw
In response to to_jsonb performance on array aggregated correlated subqueries  (Nico Heller <nico.heller@posteo.de>)
List pgsql-performance


On Fri, Aug 12, 2022 at 3:02 PM Rick Otten <rottenwindfish@gmail.com> wrote:


On Fri, Aug 12, 2022 at 2:50 PM Nico Heller <nico.heller@posteo.de> wrote:
Good day,

consider the following query:

WITH aggregation(
     SELECT
            a.*,
           (SELECT array_agg(b.*) FROM b WHERE b.a_id = a.id) as "bs",
           (SELECT array_agg(c.*) FROM c WHERE c.a_id = a.id) as "cs",
           (SELECT array_agg(d.*) FROM d WHERE d.a_id = a.id) as "ds",
           (SELECT array_agg(e.*) FROM d WHERE e.a_id = a.id) as "es"
     FROM a WHERE a.id IN (<some big list, ranging from 20-180 entries)
)
SELECT to_jsonb(aggregation.*) as "value" FROM aggregation;


- You do have an index on `b.a_id` and `c.a_id`, etc... ?  You didn't say...

- Are you sure it is the `to_jsonb` that is making this query slow?

- Since you are serializing this for easy machine readable consumption outside of the database, does it make a difference if you use `to_json` instead?


To follow up here a little.  I ran some quick tests on my database and found that `to_json` is consistently, slightly, faster than `to_jsonb` when you are just serializing the result set for consumption.   I feed in some arrays of 1,000,000 elements for testing.  While both json serializers are slower than just sending back the result set, it wasn't significant on my machine with simple object types.  (3% slower).

Are any of your objects in "b.*", etc, complex data structures or deeper arrays, or gis shapes, or strange data types that might be hard to serialize?  I'm wondering if there is something hidden in those ".*" row sets that are particularly problematic and compute intensive to process.

pgsql-performance by date:

Previous
From: Nico Heller
Date:
Subject: Re: to_jsonb performance on array aggregated correlated subqueries
Next
From: Kevin McKibbin
Date:
Subject: pgbench: could not connect to server: Resource temporarily unavailable