Home > mailing lists

Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes - Mailing list pgsql-performance

From	Arjun Ranade
Subject	Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes
Date	September 28, 2018 01:37:57
Msg-id	CANrrCRy_HSsYCTDVH+EtDoBJzFiEob3Nd5bztmno5V13WFWOjQ@mail.gmail.com Whole thread Raw
In response to	Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLEAS with same SELECT takes 7 minutes (Justin Pryzby <pryzby@telsasoft.com>)
Responses	Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLEAS with same SELECT takes 7 minutes
List	pgsql-performance

Tree view

Yes, that join is concerning (red text below). The conditions all need to be checked so they are independent.

The query (with consistent obfuscation) is below :

select distinct
   a.sale_id
   , a.test_date
   , a.product_id as original_product_id
   ,vw2.product_id
   , a.volume     as volume
   ,b.pair_rank
from not_sold_locations a
   inner join vw_product vw2 using (product_group_name,product_class_code,product_type_code,sale_end_date)
   inner join product_mapping b on a.product_group_name = b.left_product_group_name and
                           a.product_node_name = b.left_product_node and
                           a.product_type_code = b.left_product and
                           vw2.product_node_name = b.right_product_node and
                           vw2.product_group_name = b.right_product_group_name and
                           vw2.product_type_code = b.right_product
   inner join mapping_ref i on vw2.product_group_name || '.' || vw2.product_node_name = i.product_node_name and
                        vw2.product_class_code = i.product_class_code and
                        vw2.product_type_code = i.product_type_code and
                        vw2.sale_end_date between i.first_product_date and i.last_product_date;

not_sold_locations(a) has 836 rows
vw_product (vw2) has 785k rows and is a view that joins 11 tables together to have a consolidated view of all products, sales locations, etc

product_mapping (b) has 2520 rows

mapping_ref (i) has 178 rows

On Thu, Sep 27, 2018 at 2:52 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Thu, Sep 27, 2018 at 01:08:05PM -0400, Arjun Ranade wrote:
> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> drastically different query plan for the INSERT+SELECT than SELECT by
> itself.

The fast, SELECT plan is using parallel query, which isn't available for INSERT+SELECT:

https://www.postgresql.org/docs/current/static/when-can-parallel-query-be-used.html
|Even when it is in general possible for parallel query plans to be generated, the planner will not generate them for a given query if any of the following are true:
|The query writes any data or locks any database rows.

Using parallel query in this case happens to mitigate the effects of the bad
plan.

I see Tom responded, and you got an improvement by changing join threshold.

But I think you could perhaps get an better plan if the rowcount estimates were
fixed. That's more important than probably anything else - changing settings
is only a workaround for bad estimates.

In the slow/INSERT plan, this join is returning 55000x more rows than expected
(not 55k more: 55k TIMES more).

7. 26,937.132 401,503.136 ↓ 55,483.7 332,902 1
Nested Loop (cost=1,516.620..42,244.240 rows=6 width=84) (actual time=311.021..401,503.136 rows=332,902 loops=1)
Join Filter: (((papa_echo.oscar_bravo)::text = (five_hotel.tango_november)::text) AND ((papa_echo.lima_tango)::text = (five_hotel.lima_mike)::text) AND ((xray_juliet1.juliet)::text = (five_hotel.papa_victor)::text))
Rows Removed by Join Filter: 351664882
Buffers: shared hit=8570619 read=6

First question is if all those conditions are independent? Or if one of those
conditions also implies another, which is confusing the planner.

Justin

pgsql-performance by date:

From: Justin Pryzby
Date: 28 September 2018, 01:33:14
Subject: Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLEAS with same SELECT takes 7 minutes

From: Arjun Ranade
Date: 28 September 2018, 01:51:32
Subject: Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes

Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes - Mailing list pgsql-performance

Previous

Next