Re: Incorrect choice of Nested Loop for a skewed distribution - Mailing list pgsql-performance

From Oleg Kharin
Subject Re: Incorrect choice of Nested Loop for a skewed distribution
Date
Msg-id a5fb2f9d-a8ac-2c8a-22fc-6d5f62f6d6b3@uvadrev.ru
Whole thread Raw
In response to Re: Incorrect choice of Nested Loop for a skewed distribution  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-performance
Thank you Justin!

On Wed, 04 Sep 2019 17:18:36 -0700 (PDT), Justin Pryzby wrote:

> When you dropped the index, you necessarily refused the plan involving 
> index scan.
> So it did merge join instead (which it thinks of as expensive because it has to
> sort both sides).
>
> As you saw, the rowcount estimate of the join result is badly off.  But that's
> true of both plans.
>
> Choice of join type is affected by the size of its *inputs*, but doesn't and
> shouldn't have any effect on the result rowcount (or its) estimate.  The
> rowcount *output* of the join would only affect any *following* plan nodes (of
> which there are none in this case).
>
> You suggested using the MCV list, but I don't think that's possible, since the
> nested loop is evaluating its "inner" table multiple times, in a "loop":
>
>>     ->  Index Scan using t_f on t  (cost=0.29..0.36 rows=1 width=8) (actual time=0.565..1.125 rows=1 loops=20020)
> Hypothetically, one could plan the query 20k times, for each value of u.f and
> u.g, but that's tantamount to actually executing the query, which is why it
> uses (I gather) var_eq_non_const.

In fact the planner has information about the outer loop relation when 
it is estimating the number of inner loop rows for a nested-loop join. 
It ignores this information and considers the outer/inner loop relations 
as independent. So it uses for the rowcount estimate the number of 
distinct values only, not MCV lists of both tables.

For the test case above, the planner estimates the selectivity of the 
clause "t.f=u.f" as 1/10011. It hopes to scan 1/10011*20020=2 rows in a 
inner loop for each row of the outer loop. In fact it scans 10010 rows 
10010 times and 1 row 10010 times. The average number of rows scanned in 
a inner loop is (10010*10010+10010)/20020=5005. That is badly off from 2 
rows expected. The planner should use 5005/20020 as a more accurate 
value for the effective selectivity of "t.f=u.f".

I tried to make improvements to the function var_eq_non_const() so that 
it calculates the effective selectivity using MCV lists if possible. The 
patch for PostgreSQL 11.5 is attached to the letter.

The patched PostgreSQL chooses an optimal plan for the test case. As a 
result the query execution time  is reduced by more than 600 times.

Now if we force the planner to choose the nested-loop join

   set enable_mergejoin=off;

   set enable_hashjoin=off;

   explain analyze select * from t,u where t.f=u.f and t.g=u.g;

  Nested Loop  (cost=0.29..2261681.75 rows=25055030 width=16) (actual 
time=0.048..22519.232 rows=20020 loops=1)
    ->  Seq Scan on u  (cost=0.00..289.20 rows=20020 width=8) (actual 
time=0.012..2.970 rows=20020 loops=1)
    ->  Index Scan using t_f on t  (cost=0.29..100.44 rows=1252 width=8) 
(actual time=0.565..1.124 rows=1 loops=20020)
          Index Cond: (f = u.f)
          Filter: (u.g = g)
          Rows Removed by Filter: 5004
  Planning Time: 0.188 ms
  Execution Time: 22521.248 ms

we will see that the cost of the index scan is increased from 0.36 to 
100.44, which is much more realistic and reflects 2 to 5005 rows ratio.

Regards,

Oleg


Attachment

pgsql-performance by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Incorrect choice of Nested Loop for a skewed distribution
Next
From: yash mehta
Date:
Subject: select distinct runs slow on pg 10.6