Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLEAS with same SELECT takes 7 minutes - Mailing list pgsql-performance

From Justin Pryzby
Subject Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLEAS with same SELECT takes 7 minutes
Date
Msg-id 20180927185224.GF776@telsasoft.com
Whole thread Raw
In response to SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes  (Arjun Ranade <ranade@nodalexchange.com>)
Responses Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes  (Arjun Ranade <ranade@nodalexchange.com>)
List pgsql-performance
On Thu, Sep 27, 2018 at 01:08:05PM -0400, Arjun Ranade wrote:
> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> drastically different query plan for the INSERT+SELECT than SELECT by
> itself.

The fast, SELECT plan is using parallel query, which isn't available for INSERT+SELECT:

https://www.postgresql.org/docs/current/static/when-can-parallel-query-be-used.html
|Even when it is in general possible for parallel query plans to be generated, the planner will not generate them for a
givenquery if any of the following are true:
 
|The query writes any data or locks any database rows.

Using parallel query in this case happens to mitigate the effects of the bad
plan.

I see Tom responded, and you got an improvement by changing join threshold.

But I think you could perhaps get an better plan if the rowcount estimates were
fixed.  That's more important than probably anything else - changing settings
is only a workaround for bad estimates.

In the slow/INSERT plan, this join is returning 55000x more rows than expected
(not 55k more: 55k TIMES more).

7.     26,937.132     401,503.136     ↓ 55,483.7     332,902     1     
Nested Loop (cost=1,516.620..42,244.240 rows=6 width=84) (actual time=311.021..401,503.136 rows=332,902 loops=1)
    Join Filter: (((papa_echo.oscar_bravo)::text = (five_hotel.tango_november)::text) AND ((papa_echo.lima_tango)::text
=(five_hotel.lima_mike)::text) AND ((xray_juliet1.juliet)::text = (five_hotel.papa_victor)::text))
 
    Rows Removed by Join Filter: 351664882
    Buffers: shared hit=8570619 read=6

First question is if all those conditions are independent?  Or if one of those
conditions also implies another, which is confusing the planner.

Justin


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
Next
From: Arjun Ranade
Date:
Subject: Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes