Thread: BUG #15722: 9000x performance regression on query starting PostgreSQL 10 due to bad query plan

The following bug has been logged on the website:

Bug reference:      15722
Logged by:          Jeremy Evans
Email address:      jeremyevans0@gmail.com
PostgreSQL version: 11.2
Operating system:   Linux
Description:

For one complex query, we are seeing a huge performance regression starting
with PostgreSQL 10.  We noticed this regression starting with an upgrade
from 9.3 to 11.  We then tested on PostgreSQL 9.3, 9.4, 9.5, 9.6, 10, and 11
with this query by restoring a database dump and running the query.  On
PostgreSQL 9.3-9.6, this query runs in a few seconds.  In PostgreSQL 10 and
11 it takes multiple hours.

We tried manually running VACUUM and ANALYZE commands before running the
query, with no significant effect.  As the restoration of the database dump
was done just before running this query, there should be no need to REINDEX.
 This analysis was repeated on a separate physical server with a different
hardware configuration but similar PostgreSQL configuration, also on 9.3,
9.4, 9.5, 9.6, 10, and 11, with the same results.

The cause of this problem seems to be PostgreSQL underestimating the number
of rows needed by a hash join by almost 9 million times.  On PostgreSQL
10-11, the expensive part of the query is a Nested Loop Inner Join of a Hash
Inner Join and a Index Scan.  On PostgreSQL 9.3-9.6, this part of the query
uses a Nested Loop Semi Join of the Hash Inner Join and Index Scan.

The following files are available at
https://gist.github.com/jeremyevans/8a9693b1d3f14c94f635b1bf4275b2be

* Raw SQL Query
* Formatted SQL Query
* Database Schema Dump
* EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON) output for: 
  * 9.3
  * 9.4
  * 9.5
  * 9.6
  * 10
  * 11

Due to the consistently good behavior in 9.3-9.6, and the severity of the
performance decrease in 10-11, I think this is most likely a regression in
PostgreSQL's query planner or optimizer, which is why I'm posting this as a
bug report.  If this is not considered a bug, please let me know and I can
post to pgsql-performance instead.

Any and all help greatly appreciated.  If more information would be helpful,
please let me know.

Thanks,
Jeremy


On Fri, Mar 29, 2019 at 6:53 AM PG Bug reporting form <noreply@postgresql.org> wrote:

The following files are available at
https://gist.github.com/jeremyevans/8a9693b1d3f14c94f635b1bf4275b2be


Could you also include the human readable explain plans (even just for 9.6. and 10)? JSON objects with 1900 lines might be good for machines, but not for people.  (Of course the human readable format might not be all that readable either, but it is worth a shot)
 
Cheers,

Jeff
On Tue, Apr 2, 2019 at 10:11 AM Jeff Janes <jeff.janes@gmail.com> wrote:
On Fri, Mar 29, 2019 at 6:53 AM PG Bug reporting form <noreply@postgresql.org> wrote:

The following files are available at
https://gist.github.com/jeremyevans/8a9693b1d3f14c94f635b1bf4275b2be


Could you also include the human readable explain plans (even just for 9.6. and 10)? JSON objects with 1900 lines might be good for machines, but not for people.  (Of course the human readable format might not be all that readable either, but it is worth a shot)
 
Cheers,

Jeff

Jeff,

Sorry for the delay, it took a while to rerun the query on 10.  I've updated the gist to add plain text query plans for 9.6 and 10:


If you would like the query plans for 9.3-9.5 or 11, please let me know. Also, here are links to explain.depesz.com for both query plans:


I originally thought the JSON plans would be more useful, using a viewer such as http://tatiyants.com/pev, that is why I didn't include plain text plans when submitting the bug report.

Thanks,
Jeremy