Re: json_to_recordset() and CTE performance - Mailing list pgsql-general

From: Michael Lewis
Subject: Re: json_to_recordset() and CTE performance
Msg-id: CAHOFxGp4j0-ELELNpwS3ONTWprqwn4qjz9QotsLiw0ekCnnwOQ@mail.gmail.com
In response to: json_to_recordset() and CTE performance (Matt DeLuco <matt@deluco.net>)
Responses: Re: json_to_recordset() and CTE performance (Matt DeLuco <matt@deluco.net>)
List: pgsql-general
Which Postgres version? What is the value of work_mem, and which other configs are non-default? I see some estimates that are rather far off, like -

            ->  Nested Loop  (cost=0.26..4.76 rows=100 width=148) (actual time=183.906..388716.550 rows=8935 loops=1)
                  Buffers: shared hit=53877 dirtied=2
                  ->  Function Scan on json_to_recordset x  (cost=0.01..1.00 rows=100 width=128) (actual time=130.645..142.316 rows=8935 loops=1)
                  ->  Function Scan on get_transaction_type_by_id bank_transaction_type  (cost=0.25..0.26 rows=1 width=4) (actual time=0.154..0.156 rows=1 loops=8935)
                        Buffers: shared hit=18054

Sometimes putting the data into a temp table and analyzing it can be rather helpful, so that the planner has statistics on the number of records, ndistinct, most common values, etc. I would try doing that with the result of json_to_recordset, and skip the call to get_transaction_type_by_id until later, just to see how it performs. A rough sketch of that approach is below.
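
Something like this (a minimal sketch; the column names and JSON literal are made up for illustration, not taken from your query):

    -- Materialize the json_to_recordset() output so the planner has real statistics.
    CREATE TEMP TABLE tmp_tx AS
    SELECT *
    FROM json_to_recordset('[{"account_id": 1, "amount": 12.34}]'::json)
         AS x(account_id int, amount numeric);

    ANALYZE tmp_tx;  -- collects row count, ndistinct, most common values

    -- Then run the rest of the query against tmp_tx, calling
    -- get_transaction_type_by_id() only in this later step.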

That said, it seems like the planner's estimate for json_to_recordset is perhaps hardcoded at 100 rows. I haven't checked the source code, but I know that when defining a set-returning function there is a ROWS option, which gives the planner a static row count to assume for that function's output, so a fixed default like that would make sense. A small example is below.
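
For illustration, here is how the ROWS option looks on a user-defined set-returning function (the function name and numbers are invented for this example):

    CREATE FUNCTION sample_rows()
    RETURNS SETOF int
    LANGUAGE sql
    ROWS 500  -- the planner will assume roughly 500 rows per call
    AS $$ SELECT generate_series(1, 1000) $$;

ALTER FUNCTION also accepts a ROWS clause, so the estimate can be adjusted on an existing function without recreating it.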
