Re: optimizer picks smaller table to drive nested loops? - Mailing list pgsql-performance

From Tom Lane
Subject Re: optimizer picks smaller table to drive nested loops?
Date
Msg-id 10251.1058205875@sss.pgh.pa.us
Whole thread Raw
In response to optimizer picks smaller table to drive nested loops?  (Greg Stark <gsstark@mit.edu>)
Responses Re: optimizer picks smaller table to drive nested loops?
List pgsql-performance
Greg Stark <gsstark@mit.edu> writes:
> slo=> explain analyze select * from region, (select 1 union all select 2) as x;
>                                                        QUERY PLAN
  
>
------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..11162.00 rows=5534 width=108) (actual time=0.13..541.19 rows=5534 loops=1)
>    ->  Subquery Scan x  (cost=0.00..2.00 rows=2 width=0) (actual time=0.03..0.08 rows=2 loops=1)
>          ->  Append  (cost=0.00..2.00 rows=2 width=0) (actual time=0.02..0.05 rows=2 loops=1)
>                ->  Subquery Scan "*SELECT* 1"  (cost=0.00..1.00 rows=1 width=0) (actual time=0.01..0.02 rows=1
loops=1)
>                      ->  Result  (cost=0.00..1.00 rows=1 width=0) (actual time=0.01..0.01 rows=1 loops=1)
>                ->  Subquery Scan "*SELECT* 2"  (cost=0.00..1.00 rows=1 width=0) (actual time=0.01..0.02 rows=1
loops=1)
>                      ->  Result  (cost=0.00..1.00 rows=1 width=0) (actual time=0.01..0.01 rows=1 loops=1)
>    ->  Seq Scan on region  (cost=0.00..2813.00 rows=2767 width=104) (actual time=0.03..123.44 rows=2767 loops=2)
>  Total runtime: 566.24 msec
> (9 rows)

> Wouldn't it be faster to drive the nested loop the other way around?

You seem to be using a rather wacko value of cpu_tuple_cost; those
Result nodes ought to be costed at 0.01 not 1.00.  With the default
cost settings I get an other-way-around plan for a similar test.
(I used tenk1 from the regression database as the outer table.)

However, it looks to me like the subquery-scan-outside plan probably
is the faster one, on both my machine and yours.  I get

regression=# explain analyze select * from tenk1, (select 1 union all select 2) as x;
                                                         QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..858.00 rows=20000 width=248) (actual time=0.42..3648.61 rows=20000 loops=1)
   ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=244) (actual time=0.23..199.97 rows=10000 loops=1)
   ->  Subquery Scan x  (cost=0.00..0.02 rows=2 width=0) (actual time=0.07..0.24 rows=2 loops=10000)
         ->  Append  (cost=0.00..0.02 rows=2 width=0) (actual time=0.05..0.17 rows=2 loops=10000)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..0.01 rows=1 width=0) (actual time=0.03..0.06 rows=1
loops=10000)
                     ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.01..0.02 rows=1 loops=10000)
               ->  Subquery Scan "*SELECT* 2"  (cost=0.00..0.01 rows=1 width=0) (actual time=0.03..0.06 rows=1
loops=10000)
                     ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.01..0.02 rows=1 loops=10000)
 Total runtime: 3807.39 msec
(9 rows)

regression=# set cpu_tuple_cost = 1;
SET
regression=# explain analyze select * from tenk1, (select 1 union all select 2) as x;
                                                       QUERY PLAN

------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..40718.00 rows=20000 width=248) (actual time=0.39..1214.42 rows=20000 loops=1)
   ->  Subquery Scan x  (cost=0.00..2.00 rows=2 width=0) (actual time=0.10..0.31 rows=2 loops=1)
         ->  Append  (cost=0.00..2.00 rows=2 width=0) (actual time=0.06..0.22 rows=2 loops=1)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..1.00 rows=1 width=0) (actual time=0.05..0.08 rows=1 loops=1)
                     ->  Result  (cost=0.00..1.00 rows=1 width=0) (actual time=0.03..0.04 rows=1 loops=1)
               ->  Subquery Scan "*SELECT* 2"  (cost=0.00..1.00 rows=1 width=0) (actual time=0.05..0.08 rows=1 loops=1)
                     ->  Result  (cost=0.00..1.00 rows=1 width=0) (actual time=0.02..0.03 rows=1 loops=1)
   ->  Seq Scan on tenk1  (cost=0.00..10358.00 rows=10000 width=244) (actual time=0.17..188.37 rows=10000 loops=2)
 Total runtime: 1371.17 msec
(9 rows)

The flipover point between the two plans is cpu_tuple_cost = 0.04 in
my tests.

It looks to me like we've neglected to charge any cost associated with
Subquery Scan or Append nodes.  Certainly Subquery Scan ought to charge
at least a cpu_tuple_cost per row.  Perhaps Append ought to as well ---
although since it doesn't do selection or projection, I'm not quite sure
where the time is going in that case.  (Hmmm... time to get out the
profiler...)

            regards, tom lane

pgsql-performance by date:

Previous
From: "Nick Fankhauser"
Date:
Subject: Re: Tunning FreeeBSD and PostgreSQL
Next
From: Greg Stark
Date:
Subject: Re: optimizer picks smaller table to drive nested loops?