Re: json_to_recordset() and CTE performance - Mailing list pgsql-general

From Matt DeLuco
Subject Re: json_to_recordset() and CTE performance
Date
Msg-id 06F31D4C-094D-4572-8850-AF1DA7519304@deluco.net
In response to Re: json_to_recordset() and CTE performance  (Michael Lewis <mlewis@entrata.com>)
List pgsql-general
PostgreSQL 13.0.

You’d have to be specific about which configs you’re looking for. I’m using Postgres.app (postgresapp.com) and am not sure whether it’s distributed with non-default configs.

That said, a quick grep shows that these settings are configured:
max_wal_size = 1GB
min_wal_size = 80MB
shared_buffers = 128MB

work_mem is not configured, so presumably it’s at the default of 4MB.
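
Rather than grepping the config file, the running server can be asked directly. This is a minimal sketch using the standard pg_settings view; which sources count as "non-default" is an assumption, and sourcefile may show as NULL for a non-superuser.

```sql
-- Current value of a single setting
SHOW work_mem;

-- Settings whose value does not come from the built-in default.
-- Excluding 'override' is an assumption: those values are set internally
-- by the server, not by the user.
SELECT name, setting, unit, source, sourcefile
FROM pg_settings
WHERE source NOT IN ('default', 'override')
ORDER BY name;
```

Anything set by Postgres.app itself should show up here with its source, which would settle whether it is distributed with non-default configs.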

I’ll try the temp tables. That seems similar to what I found searching online. Are you suggesting that as a permanent solution, or just as a means to better analyze performance?

Thanks,

Matt

On Oct 21, 2020, at 1:25 PM, Michael Lewis <mlewis@entrata.com> wrote:

Version? What is the value of work_mem, and which other configs are non-default? I see some estimates that are rather off, like:

            ->  Nested Loop  (cost=0.26..4.76 rows=100 width=148) (actual time=183.906..388716.550 rows=8935 loops=1)
                  Buffers: shared hit=53877 dirtied=2
                  ->  Function Scan on json_to_recordset x  (cost=0.01..1.00 rows=100 width=128) (actual time=130.645..142.316 rows=8935 loops=1)
                  ->  Function Scan on get_transaction_type_by_id bank_transaction_type  (cost=0.25..0.26 rows=1 width=4) (actual time=0.154..0.156 rows=1 loops=8935)
                        Buffers: shared hit=18054

Sometimes putting data into a temp table and analyzing it can be rather helpful to ensure the planner has statistics on the number of records, ndistinct, most common values, etc. I would try doing that with the result of json_to_recordset and skipping the function call to get_transaction_type_by_id until later, just to see how it performs.
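
A minimal sketch of that approach follows. The table name tmp_transactions, the column list, and the JSON literal are all hypothetical stand-ins, since the real payload and query are not shown in this thread.

```sql
-- Materialize the parsed JSON into a temp table.
-- (Hypothetical columns; substitute the real column definition list.)
CREATE TEMP TABLE tmp_transactions AS
SELECT x.*
FROM json_to_recordset(
       '[{"id": 1, "type_id": 10, "amount": 12.50},
         {"id": 2, "type_id": 20, "amount": 7.25}]'::json
     ) AS x(id int, type_id int, amount numeric);

-- Collect statistics so the planner sees the real row count,
-- ndistinct, most common values, etc.
ANALYZE tmp_transactions;

-- Check the plan against the temp table first, leaving the call to
-- get_transaction_type_by_id out until later.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM tmp_transactions;
```

With statistics in place, the estimated row count for the scan should be close to the actual count instead of the fixed guess seen on the Function Scan.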

That said, it looks like json_to_recordset may have a hardcoded estimate of 100 rows. I haven't checked the source code, but I know that when defining a set-returning function there is a ROWS option that gives the planner a static row count to assume for that function, so that would make sense.
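
To illustrate the ROWS option, here is a rough sketch of a wrapper function with a declared row estimate. The function name parse_transactions, its columns, and the figure of 10000 are hypothetical. PL/pgSQL is used on the assumption that a plain SQL function could be inlined by the planner, in which case its ROWS value would not be consulted.

```sql
-- Hypothetical wrapper: same parsing, but with a planner row estimate
-- closer to the ~9000 rows actually returned in the plan above.
CREATE FUNCTION parse_transactions(payload json)
RETURNS TABLE (id int, type_id int, amount numeric)
LANGUAGE plpgsql
STABLE
ROWS 10000
AS $$
BEGIN
  RETURN QUERY
  SELECT x.id, x.type_id, x.amount
  FROM json_to_recordset(payload) AS x(id int, type_id int, amount numeric);
END;
$$;

-- The Function Scan estimate should now reflect the declared ROWS value.
EXPLAIN SELECT * FROM parse_transactions('[]'::json);
```

This is still a static guess, so it only helps when the payload size is roughly predictable. The temp table with ANALYZE gives the planner real numbers.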
