BUG #17330: EXPLAIN hangs and very long query plans - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #17330: EXPLAIN hangs and very long query plans
Msg-id 17330-0d6d4cf6ff440a65@postgresql.org
Responses Re: BUG #17330: EXPLAIN hangs and very long query plans
Re: BUG #17330: EXPLAIN hangs and very long query plans
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17330
Logged by:          Strahinja Kustudic
Email address:      strahinjak@nordeus.com
PostgreSQL version: 10.19
Operating system:   CentOS 7
Description:

We had an issue with one of our production databases running Postgres 10.19
on CentOS 7. One of our most frequently executed queries started taking
3000 ms or more to plan, while its execution time stayed at 1-3 ms. When
everything is working normally, planning takes around 1 ms or less. On the
replica, EXPLAIN (without ANALYZE, just EXPLAIN!) would not finish at all;
it just hung forever. To be precise, we were running 10.10 at that time, but
upgrading to 10.19 didn't help. We tried running ANALYZE on the whole
database, but that didn't help either. In the end, what helped was running
pg_repack on the whole DB. This was strange, because I thought that the
query planner uses only table statistics and the index schema to choose a
plan and shouldn't need the table/index data itself, but I don't know PG
internals, so I might be wrong.
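
For anyone comparing planning time against execution time as described
above, here is a minimal Python sketch that pulls both figures out of the
text output of EXPLAIN ANALYZE. It assumes the summary lines have the form
"Planning time: N ms" and "Execution time: N ms", matched case-insensitively.
The function name split_timings and the SAMPLE text (plan line, table and
index names, and numbers) are hypothetical, made up for illustration, and
not taken from the affected system.

```python
import re

# Matches the timing summary lines of EXPLAIN ANALYZE text output.
# Case-insensitive, since the capitalization may differ between versions.
_TIMING_RE = re.compile(
    r"^\s*(Planning|Execution) time:\s*([0-9.]+)\s*ms\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def split_timings(explain_output: str) -> dict:
    """Return {'planning': ms, 'execution': ms} for the lines that are present."""
    timings = {}
    for label, value in _TIMING_RE.findall(explain_output):
        timings[label.lower()] = float(value)
    return timings


# Hypothetical sample output; names and numbers are illustrative only.
SAMPLE = """\
Index Scan using example_idx on example_table  (cost=0.43..8.45 rows=1 width=8) (actual time=0.020..0.021 rows=1 loops=1)
 Planning time: 3012.500 ms
 Execution time: 1.200 ms
"""

if __name__ == "__main__":
    timings = split_timings(SAMPLE)
    print(timings)
    # A planning time far above the execution time is the symptom reported here.
    if timings.get("planning", 0.0) > timings.get("execution", 0.0):
        print("planning dominates execution")
```

This only helps when EXPLAIN ANALYZE actually returns; it does not cover the
case where plain EXPLAIN never finishes.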

Because we have backups created by WAL-G, I restored the base backup with
the WALs from that time and managed to reproduce the issue in Docker on my
own laptop. In Docker, EXPLAIN runs forever and never finishes, just like
on the replica. This means I have a 100% repro of the issue and I can do
whatever you tell me to do to find out what caused it. I already tried
enabling `log_error_verbosity = verbose`, but the server doesn't print
anything while EXPLAIN is running. I haven't tried leaving it running for
hours (but I might).

Is this a known bug in PG 10.X? Or would you like me to share more details,
such as the table definitions, the exact query, and anything else you need?

