Bad planner performance for tables with empty tuples when using JIT - Mailing list pgsql-general

From Jurrie Overgoor
Subject Bad planner performance for tables with empty tuples when using JIT
Date
Msg-id 70ec49f7-e9c4-bea7-c689-a4dfbdd66f78@jurr.org
Whole thread Raw
Responses Re: Bad planner performance for tables with empty tuples when using JIT  (Michael Lewis <mlewis@entrata.com>)
List pgsql-general
Hello everyone,

I'm currently in the process of upgrading our PostgreSQL installations 
from 9.6 to 13. I am experiencing very slow query performance for empty 
tables.

Our test environments get build from scratch every run, and thus contain 
a lot of empty tables at first. We hit the issue discussed and resolved 
in [1]. Problem is, this fix is not included in PG13... but JIT is :)

Some of our queries join a lot of tables, so the planner thinks this 
will result in very high costs, and the JIT feature kicks in. This gives 
bad performance. Executing the query is very fast of course; it's just 
the planning stage that is time consuming. We do not see this behavior 
in PG9.6, as there is no JIT. EXPLAIN gives the same high costs on PG9.6 
and PG13, but on PG9.6 the planning time is low enough.

Now, how can I circumvent this?

- I could wait for PG14. I verified that PG14 solves my case and queries 
are fast. But this would take at least until the third quarter of 2021, 
so the website tells me.

- I could ask for a back-port of [2]. The commit is API breaking; is 
this even a feasible option?

- I could turn off JIT in the server config, but I'd like to use the JIT 
feature where it's appropriate!

- I could turn off JIT using a GUC prior to query execution. But then I 
would need to detect the cases where I need to do this. (I.e. cases 
where I query a table that has no tuples.) It would be very cumbersome 
for me to write this in my application, and I feel this is more the 
responsibility of the database than of my program.

- Another way is to fake the number of pages; set it to 1 where it is 0 
everywhere. I verified that this produces fast query performance. But 
fiddling with pg_class does not "feel" right... Is this okay to use in a 
test setup (and maybe even in a production scenario)? Could I do the 
following query just after our database is initialized:

update pg_catalog.pg_class c set relpages = 1 where c.relpages = 0 and 
c.relnamespace = (select ns.oid from pg_catalog.pg_namespace ns where 
ns.nspname = current_schema);

Could someone give me advice on what would be the best strategy?

With kind regads,

Jurrie

[1] https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
[2] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3d351d916b20534f973eda760cde17d96545d4c4





pgsql-general by date:

Previous
From: "xiebin (F)"
Date:
Subject: 答复: Security issues concerning pgsql replication
Next
From: Magnus Hagander
Date:
Subject: Re: Security issues concerning pgsql replication