Home > mailing lists

Re: NL Join vs Merge Join - 5 hours vs 2 seconds - Mailing list pgsql-general

From	Tom Lane
Subject	Re: NL Join vs Merge Join - 5 hours vs 2 seconds
Date	December 17, 2018 17:02:12
Msg-id	32743.1545066132@sss.pgh.pa.us Whole thread Raw
In response to	NL Join vs Merge Join - 5 hours vs 2 seconds ("Kumar, Virendra" <Virendra.Kumar@guycarp.com>)
Responses	RE: NL Join vs Merge Join - 5 hours vs 2 seconds
List	pgsql-general

Tree view

"Kumar, Virendra" <Virendra.Kumar@guycarp.com> writes:
> We are using Postgers-XL based on PostgreSQL 10.5 on RHEL7. We have an extreme case of one query which running in 4
hour45 mins and 33 seconds with Nested Loops Join vs 2.5 seconds when we disable enable_nestloop. The query is given
below.I would be happy to share more information if required. 

It looks like what's burning you is bad estimates at these joins:

>                                        ->  Nested Loop  (cost=33681.00..71431.33 rows=1 width=348) (actual
time=359.006..926.303rows=37207 loops=3) 
>                                              Join Filter: (se.account_id = a.account_id)
>                                              ->  Hash Join  (cost=33680.71..71298.94 rows=395 width=300) (actual
time=358.953..656.923rows=36828 loops=3) 
>                                                    Hash Cond: ((se.account_id = sh.account_id) AND (se.site_id =
sh.site_id))
...
>                                              ->  Index Scan using account_p1000013_account_id_idx on account_p1000013
a (cost=0.29..0.32 rows=1 width=52) (actual time=0.003..0.005 rows=1 loops=110485) 
>                                                    Index Cond: (account_id = sh.account_id)
>                                                    Filter: (portfolio_id = 1000013)

I'm guessing that account_id, site_id, and portfolio_id are all highly
correlated, but the planner doesn't know that and thinks the additional
conditions will remove way more rows than they actually do.

In PG10 and later, you can probably fix that by creating
functional-dependency statistics on those pairs of columns.

            regards, tom lane

pgsql-general by date:

From: Ron
Date: 17 December 2018, 16:16:06
Subject: Re: conditionally terminate psql script

From:
Date: 17 December 2018, 17:27:08
Subject: Re: Creating 2D arrays for pg_copy_from, reading tab-delimted textfile that contains comma and double quotes

Re: NL Join vs Merge Join - 5 hours vs 2 seconds - Mailing list pgsql-general

Previous

Next