Thread: Random plan selection in DBT-3

Random plan selection in DBT-3

From
ITAGAKI Takahiro
Date:
I tested 8.3beta4 with DBT-3 (TPC-H) and found unstable selection of plans.
Planner randomly selected two types of plans (A, B) when I repeated EXPLAIN.

One of the conditions is used in Seq Scan Filter in Plan A. In contrast,
the same condition is used in Hash Join filter in Plan B. Plan A is
faster than B because the condition is used early. Cost of the plan A is
also cheap then B (A:307307.47 vs. B:351706.51).

Where did the randomness come from? Are there large randomness in planner?
I wonder why the worse plan is picked in spite of significantly different costs.

---- Outline of Plan A ----Hash Join  -> Seq Scan       Filter: (*condition*)  -> Hash       -> Seq Scan

---- Outline of Plan B  ----Hash Join   Join Filter: (*condition*)   -> Seq Scan   -> Hash        -> Seq Scan

---- Query ----
dbt3=# EXPLAIN SELECT sum(l_extendedprice* (1 - l_discount)) AS revenueFROM lineitem, partWHERE ( p_partkey = l_partkey
  AND p_brand = 'Brand#53'    AND p_container IN ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')    AND l_quantity >= 10
ANDl_quantity <= 10+10    AND p_size BETWEEN 1 AND 5    AND l_shipmode IN ('AIR', 'AIR REG')    AND l_shipinstruct =
'DELIVERIN PERSON' )    OR ( p_partkey = l_partkey    AND p_brand = 'Brand#45'    AND p_container IN ('MED BAG', 'MED
BOX','MED PKG', 'MED PACK')    AND l_quantity >= 20    AND l_quantity <= 20+10    AND p_size BETWEEN 1 AND 10    AND
l_shipmodeIN ('AIR', 'AIR REG')    AND l_shipinstruct = 'DELIVER IN PERSON' )    OR ( p_partkey = l_partkey    AND
p_brand= 'Brand#31'    AND p_container IN ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')    AND l_quantity >= 30    AND
l_quantity<= 30+10    AND p_size BETWEEN 1 AND 15    AND l_shipmode IN ('AIR', 'AIR REG')    AND l_shipinstruct =
'DELIVERIN PERSON' );
 

---- Details of Plan A ----Aggregate  (cost=307307.46..307307.47 rows=1 width=8)  ->  Hash Join
(cost=10432.00..307307.17rows=112 width=8)        Hash Cond: (lineitem.l_partkey = part.p_partkey)        Join Filter:
(((part.p_brand= 'Brand#53'::bpchar)                 AND (part.p_container = ANY ('{"SM CASE","SM BOX","SM PACK","SM
PKG"}'::bpchar[]))                AND (lineitem.l_quantity >= 10::double precision)                 AND
(lineitem.l_quantity<= 20::double precision)                 AND (part.p_size <= 5))                  OR ((part.p_brand
='Brand#45'::bpchar)                 AND (part.p_container = ANY ('{"MED BAG","MED BOX","MED PKG","MED
PACK"}'::bpchar[]))                AND (lineitem.l_quantity >= 20::double precision)                 AND
(lineitem.l_quantity<= 30::double precision)                 AND (part.p_size <= 10)) OR ((part.p_brand =
'Brand#31'::bpchar)                AND (part.p_container = ANY ('{"LG CASE","LG BOX","LG PACK","LG PKG"}'::bpchar[]))
             AND (lineitem.l_quantity >= 30::double precision)                 AND (lineitem.l_quantity <= 40::double
precision)                AND (part.p_size <= 15)))        ->  Seq Scan on lineitem  (cost=0.00..287087.85 rows=112605
width=16)
(*)            Filter: ((l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[]))                  AND (l_shipinstruct =
'DELIVERIN PERSON'::bpchar)                  AND (((l_quantity >= 10::double precision)                  AND
(l_quantity<= 20::double precision))                   OR ((l_quantity >= 20::double precision)                  AND
(l_quantity<= 30::double precision))                   OR ((l_quantity >= 30::double precision)                  AND
(l_quantity<= 40::double precision))))        ->  Hash  (cost=6564.00..6564.00 rows=200000 width=30)              ->
SeqScan on part  (cost=0.00..6564.00 rows=200000 width=30)                    Filter: (p_size >= 1)
 

---- Details of Plan B ----Aggregate  (cost=351706.50..351706.51 rows=1 width=8)  ->  Hash Join
(cost=10432.00..351706.22rows=112 width=8)        Hash Cond: (lineitem.l_partkey = part.p_partkey)        Join Filter:
(((part.p_brand= 'Brand#53'::bpchar)                 AND (part.p_container = ANY ('{"SM CASE","SM BOX","SM PACK","SM
PKG"}'::bpchar[]))                AND (lineitem.l_quantity >= 10::double precision)                 AND
(lineitem.l_quantity<= 20::double precision)                 AND (part.p_size <= 5)
 
(*)               AND (lineitem.l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[])))                  OR ((part.p_brand =
'Brand#45'::bpchar)                AND (part.p_container = ANY ('{"MED BAG","MED BOX","MED PKG","MED
PACK"}'::bpchar[]))                AND (lineitem.l_quantity >= 20::double precision)                 AND
(lineitem.l_quantity<= 30::double precision)                 AND (part.p_size <= 10)
 
(*)               AND (lineitem.l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[])))                  OR ((part.p_brand =
'Brand#31'::bpchar)                AND (part.p_container = ANY ('{"LG CASE","LG BOX","LG PACK","LG PKG"}'::bpchar[]))
             AND (lineitem.l_quantity >= 30::double precision)                 AND (lineitem.l_quantity <= 40::double
precision)                AND (part.p_size <= 15)
 
(*)               AND (lineitem.l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[]))))        ->  Seq Scan on lineitem
(cost=0.00..272084.61rows=787029 width=27)              Filter: ((l_shipinstruct = 'DELIVER IN PERSON'::bpchar)
        AND (((l_quantity >= 10::double precision)                  AND (l_quantity <= 20::double precision))
       OR ((l_quantity >= 20::double precision)                  AND (l_quantity <= 30::double precision))
    OR ((l_quantity >= 30::double precision)                  AND (l_quantity <= 40::double precision))))        ->
Hash (cost=6564.00..6564.00 rows=200000 width=30)              ->  Seq Scan on part  (cost=0.00..6564.00 rows=200000
width=30)                   Filter: (p_size >= 1)
 



Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



Re: Random plan selection in DBT-3

From
Tom Lane
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> I tested 8.3beta4 with DBT-3 (TPC-H) and found unstable selection of plans.
> Planner randomly selected two types of plans (A, B) when I repeated EXPLAIN.

Are you sure the statistics weren't changing underneath you (due to
auto-vacuum or auto-analyze)?
        regards, tom lane


Re: Random plan selection in DBT-3

From
ITAGAKI Takahiro
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> > I tested 8.3beta4 with DBT-3 (TPC-H) and found unstable selection of plans.
> > Planner randomly selected two types of plans (A, B) when I repeated EXPLAIN.
> 
> Are you sure the statistics weren't changing underneath you (due to
> auto-vacuum or auto-analyze)?

Sure. There were no statistic changes.

There might be regularity in the plan selection. When I connect to postgres
newly, I see plans { A, B, A, A ... }. Then disconnect, re-connect and send
the same EXPLAINs, the same pattern was reproduced.


Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




Re: Random plan selection in DBT-3

From
Tom Lane
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> There might be regularity in the plan selection. When I connect to postgres
> newly, I see plans { A, B, A, A ... }. Then disconnect, re-connect and send
> the same EXPLAINs, the same pattern was reproduced.

Hmph ... maybe some effect associated with loading up the backend's
internal caches?  I can't imagine what though.

Please see if you can extract a more self-contained test case.
        regards, tom lane