Planner really hates nested loops - Mailing list pgsql-performance

From David Brown
Subject Planner really hates nested loops
Date
Msg-id 420206AD.8040206@bigpond.net.au
Whole thread Raw
Responses Re: Planner really hates nested loops
List pgsql-performance
I'm hoping someone can shed some light on these results. The 'factor'
compares the ratios of cost to actual for different plans. Perhaps
nested loops should be given a discount in the planner? The estimates
seem to be out by one and a half orders of magnitude. :(

========== QUERY ==========

SELECT sum(L.Extended)
FROM sord H
JOIN sordln L USING (OrderNo)
[ WHERE H.OrderDate between '2003-01-01' and '2003-03-16' ]
[ WHERE H.OrderDate between '2003-01-01' and '2003-09-02' ]

========== SUMMARY ==========

Join      Cost    Cache    Factor    Disk    Factor
------------------------------------------------------------
  10% ROWS
Hash      40085    4.9s      1.0    12.8s      1.0
Merge     63338    4.1s      1.9    23.1s      0.9
Hash Idx  65386    5.5s      1.5    30.7s      0.7
Nest     257108    0.8s     39.3     2.7s     30.4
  33% ROWS
Hash      43646    5.8s      1.0    13.6s      1.0
Merge     67153    6.0s      1.5    30.7s      0.7
Hash Idx  68946    6.5s      1.4
Nest     868642    2.8s     41.2    10.2s     26.5
  ALL ROWS
Hash      53458    8.9s      1.0    14.3s      1.0
Merge     76156    9.4s      1.3    35.2s      0.6
Nest    2594934    9.2s     47.0    33.8s     20.5

========== 10% CACHE ROWS ==========

QUERY PLAN (Hash Join on <10% cache rows, indexed + sequential)
Aggregate  (cost=40085.14..40085.14 rows=1 width=8) (actual
time=4907.000..4907.000 rows=1 loops=1)
  ->  Hash Join  (cost=145.11..39814.32 rows=108324 width=8) (actual
time=3844.000..4735.000 rows=96183 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=0.000..2313.000 rows=1093398 loops=1)
        ->  Hash  (cost=138.48..138.48 rows=2655 width=4) (actual
time=16.000..16.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=0.000..0.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 4907.000 ms

QUERY PLAN (Merge Join on <10% cache rows, indexed only)
Aggregate  (cost=63338.43..63338.43 rows=1 width=8) (actual
time=4141.000..4141.000 rows=1 loops=1)
  ->  Merge Join  (cost=289.47..63067.62 rows=108324 width=8) (actual
time=3000.000..3896.000 rows=96183 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual time=0.000..2058.000
rows=737827 loops=1)
        ->  Sort  (cost=289.47..296.11 rows=2655 width=4) (actual
time=16.000..127.000 rows=96174 loops=1)
              Sort Key: h.orderno
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=0.000..0.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 4141.000 ms

QUERY PLAN (Hash Join on <10% cache rows, indexed only)
Aggregate  (cost=65385.95..65385.95 rows=1 width=8) (actual
time=5516.000..5516.000 rows=1 loops=1)
  ->  Hash Join  (cost=145.11..65115.13 rows=108324 width=8) (actual
time=3031.000..5376.000 rows=96183 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual time=0.000..3091.000
rows=1093398 loops=1)
        ->  Hash  (cost=138.48..138.48 rows=2655 width=4) (actual
time=0.000..0.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=0.000..0.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 5516.000 ms

QUERY PLAN (Nested Loop on <10% cache rows, indexed only)
Aggregate  (cost=257108.11..257108.11 rows=1 width=8) (actual
time=781.000..781.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..256837.30 rows=108324 width=8) (actual
time=0.000..610.000 rows=96183 loops=1)
        ->  Index Scan using sord_date on sord h  (cost=0.00..138.48
rows=2655 width=4) (actual time=0.000..0.000 rows=2646 loops=1)
              Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.000..0.118 rows=36 loops=2646)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 781.000 ms

========== 33% CACHE ROWS ==========

QUERY PLAN (Hash Join on >33% cache rows, indexed + sequential)
Aggregate  (cost=43645.62..43645.62 rows=1 width=8) (actual
time=5828.000..5828.000 rows=1 loops=1)
  ->  Hash Join  (cost=484.94..42730.67 rows=365976 width=8) (actual
time=2391.000..5078.000 rows=352856 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=0.000..2234.000 rows=1093398 loops=1)
        ->  Hash  (cost=462.52..462.52 rows=8970 width=4) (actual
time=47.000..47.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..462.52 rows=8970 width=4) (actual time=0.000..0.000
rows=8934 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-09-02'::date))
Total runtime: 5828.000 ms

QUERY PLAN (Merge Join on >33% cache rows, indexed only)
Aggregate  (cost=67153.04..67153.04 rows=1 width=8) (actual
time=5985.000..5985.000 rows=1 loops=1)
  ->  Merge Join  (cost=0.00..66238.09 rows=365976 width=8) (actual
time=2953.000..5281.000 rows=352856 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sord_pkey on sord h  (cost=0.00..1402.78
rows=8970 width=4) (actual time=31.000..46.000 rows=8934 loops=1)
              Filter: ((orderdate >= '2003-01-01'::date) AND (orderdate
<= '2003-09-02'::date))
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual time=0.000..2485.000
rows=994500 loops=1)
Total runtime: 5985.000 ms

QUERY PLAN (Hash Join on >33% cache rows, indexed only)
Aggregate  (cost=68946.43..68946.43 rows=1 width=8) (actual
time=6531.000..6531.000 rows=1 loops=1)
  ->  Hash Join  (cost=484.94..68031.48 rows=365976 width=8) (actual
time=3031.000..5765.000 rows=352856 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual time=0.000..3075.000
rows=1093398 loops=1)
        ->  Hash  (cost=462.52..462.52 rows=8970 width=4) (actual
time=46.000..46.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..462.52 rows=8970 width=4) (actual time=0.000..16.000
rows=8934 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-09-02'::date))
Total runtime: 6531.000 ms

QUERY PLAN (Nested Loop on >33% cache rows, indexed only)
Aggregate  (cost=868642.40..868642.40 rows=1 width=8) (actual
time=2828.000..2828.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..867727.46 rows=365976 width=8) (actual
time=0.000..2171.000 rows=352856 loops=1)
        ->  Index Scan using sord_date on sord h  (cost=0.00..462.52
rows=8970 width=4) (actual time=0.000..0.000 rows=8934 loops=1)
              Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-09-02'::date))
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.012..0.125 rows=39 loops=8934)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 2828.000 ms

========== ALL CACHE ROWS ==========

QUERY PLAN (Hash Join on all cache rows, sequential only)
Aggregate  (cost=53458.44..53458.44 rows=1 width=8) (actual
time=8906.000..8906.000 rows=1 loops=1)
  ->  Hash Join  (cost=1204.99..50724.94 rows=1093398 width=8) (actual
time=141.000..7089.000 rows=1093397 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=0.000..2629.000 rows=1093398 loops=1)
        ->  Hash  (cost=1137.99..1137.99 rows=26799 width=4) (actual
time=141.000..141.000 rows=0 loops=1)
              ->  Seq Scan on sord h  (cost=0.00..1137.99 rows=26799
width=4) (actual time=0.000..79.000 rows=26799 loops=1)
Total runtime: 8906.000 ms

QUERY PLAN (Merge Join on all cache rows, indexed only)
Aggregate  (cost=76156.45..76156.45 rows=1 width=8) (actual
time=9422.000..9422.000 rows=1 loops=1)
  ->  Merge Join  (cost=0.00..73422.95 rows=1093398 width=8) (actual
time=0.000..6835.000 rows=1093397 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sord_pkey on sord h  (cost=0.00..1268.79
rows=26799 width=4) (actual time=0.000..94.000 rows=26799 loops=1)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual time=0.000..2773.000
rows=1093398 loops=1)
Total runtime: 9422.000 ms

QUERY PLAN (Nested Loop on all cache rows, Sequential + indexed)
Aggregate  (cost=2594934.26..2594934.26 rows=1 width=8) (actual
time=9234.000..9234.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..2592200.76 rows=1093398 width=8) (actual
time=0.000..6966.000 rows=1093397 loops=1)
        ->  Seq Scan on sord h  (cost=0.00..1137.99 rows=26799 width=4)
(actual time=0.000..110.000 rows=26799 loops=1)
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.011..0.104 rows=41 loops=26799)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 9234.000 ms

========== 10% DISK ROWS ==========

QUERY PLAN (Hash Join on <10% disk rows, indexed + sequential)
Aggregate  (cost=40085.14..40085.14 rows=1 width=8) (actual
time=12813.000..12813.000 rows=1 loops=1)
  ->  Hash Join  (cost=145.11..39814.32 rows=108324 width=8) (actual
time=11188.000..12592.000 rows=96183 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=31.000..9985.000 rows=1093398 loops=1)
        ->  Hash  (cost=138.48..138.48 rows=2655 width=4) (actual
time=172.000..172.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=47.000..156.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 12813.000 ms

QUERY PLAN (Merge Join on <10% disk rows, indexed only)
Aggregate  (cost=63338.43..63338.43 rows=1 width=8) (actual
time=23078.000..23078.000 rows=1 loops=1)
  ->  Merge Join  (cost=289.47..63067.62 rows=108324 width=8) (actual
time=20375.000..22874.000 rows=96183 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual
time=63.000..20657.000 rows=737827 loops=1)
        ->  Sort  (cost=289.47..296.11 rows=2655 width=4) (actual
time=171.000..297.000 rows=96174 loops=1)
              Sort Key: h.orderno
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=31.000..171.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 23078.000 ms

QUERY PLAN (Hash Join on <10% disk rows, indexed only)
Aggregate  (cost=65385.95..65385.95 rows=1 width=8) (actual
time=30734.000..30734.000 rows=1 loops=1)
  ->  Hash Join  (cost=145.11..65115.13 rows=108324 width=8) (actual
time=19546.000..30593.000 rows=96183 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual
time=47.000..27711.000 rows=1093398 loops=1)
        ->  Hash  (cost=138.48..138.48 rows=2655 width=4) (actual
time=187.000..187.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=46.000..171.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 30734.000 ms

QUERY PLAN (Nested Loop on <10% disk rows, indexed only)
Aggregate  (cost=257108.11..257108.11 rows=1 width=8) (actual
time=2704.000..2704.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..256837.30 rows=108324 width=8) (actual
time=94.000..2529.000 rows=96183 loops=1)
        ->  Index Scan using sord_date on sord h  (cost=0.00..138.48
rows=2655 width=4) (actual time=32.000..93.000 rows=2646 loops=1)
              Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.041..0.814 rows=36 loops=2646)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 2704.000 ms

========== 33% DISK ROWS ==========

QUERY PLAN (Hash Join on >33% disk rows, indexed + sequential)
Aggregate  (cost=43645.62..43645.62 rows=1 width=8) (actual
time=13562.000..13562.000 rows=1 loops=1)
  ->  Hash Join  (cost=484.94..42730.67 rows=365976 width=8) (actual
time=8687.000..12985.000 rows=352856 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=31.000..10106.000 rows=1093398 loops=1)
        ->  Hash  (cost=462.52..462.52 rows=8970 width=4) (actual
time=375.000..375.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..462.52 rows=8970 width=4) (actual time=47.000..375.000
rows=8934 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-09-02'::date))
Total runtime: 13562.000 ms

QUERY PLAN (Merge Join on >33% disk rows, indexed only)
Aggregate  (cost=67153.04..67153.04 rows=1 width=8) (actual
time=30672.000..30672.000 rows=1 loops=1)
  ->  Merge Join  (cost=0.00..66238.09 rows=365976 width=8) (actual
time=20297.000..29823.000 rows=352856 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sord_pkey on sord h  (cost=0.00..1402.78
rows=8970 width=4) (actual time=578.000..670.000 rows=8934 loops=1)
              Filter: ((orderdate >= '2003-01-01'::date) AND (orderdate
<= '2003-09-02'::date))
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual
time=47.000..26509.000 rows=994500 loops=1)
Total runtime: 30672.000 ms

QUERY PLAN (Nested Loop on >33% disk rows, indexed only)
Aggregate  (cost=868642.40..868642.40 rows=1 width=8) (actual
time=10235.000..10235.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..867727.46 rows=365976 width=8) (actual
time=78.000..9496.000 rows=352856 loops=1)
        ->  Index Scan using sord_date on sord h  (cost=0.00..462.52
rows=8970 width=4) (actual time=32.000..126.000 rows=8934 loops=1)
              Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-09-02'::date))
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.035..0.912 rows=39 loops=8934)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 10235.000 ms

========== ALL DISK ROWS ==========

QUERY PLAN (Hash Join on all disk rows, sequential only)
Aggregate  (cost=53458.44..53458.44 rows=1 width=8) (actual
time=14281.000..14281.000 rows=1 loops=1)
  ->  Hash Join  (cost=1204.99..50724.94 rows=1093398 width=8) (actual
time=719.000..12096.000 rows=1093397 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=16.000..7389.000 rows=1093398 loops=1)
        ->  Hash  (cost=1137.99..1137.99 rows=26799 width=4) (actual
time=703.000..703.000 rows=0 loops=1)
              ->  Seq Scan on sord h  (cost=0.00..1137.99 rows=26799
width=4) (actual time=0.000..657.000 rows=26799 loops=1)
Total runtime: 14281.000 ms

QUERY PLAN (Merge Join on all disk rows, indexed only)
Aggregate  (cost=76156.45..76156.45 rows=1 width=8) (actual
time=35235.000..35235.000 rows=1 loops=1)
  ->  Merge Join  (cost=0.00..73422.95 rows=1093398 width=8) (actual
time=94.000..33050.000 rows=1093397 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sord_pkey on sord h  (cost=0.00..1268.79
rows=26799 width=4) (actual time=47.000..141.000 rows=26799 loops=1)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual
time=47.000..28250.000 rows=1093398 loops=1)
Total runtime: 35235.000 ms

QUERY PLAN (Nested Loop on all disk rows, indexed + sequential)
Aggregate  (cost=2594934.26..2594934.26 rows=1 width=8) (actual
time=33797.000..33797.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..2592200.76 rows=1093398 width=8) (actual
time=63.000..31744.000 rows=1093397 loops=1)
        ->  Seq Scan on sord h  (cost=0.00..1137.99 rows=26799 width=4)
(actual time=16.000..79.000 rows=26799 loops=1)
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.039..1.041 rows=41 loops=26799)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 33797.000 ms

========== ENVIRONMENT ==========

Athlon XP2500, 768MB, 80GB ATA HDD
PostgreSQL 8.0rc2 on Win2k
shared_buffers = 1000
work_mem = 32768
random_page_cost = 2
sord = 27000 rows, 7MB, pkey = int4, Stats = 100 on OrderDate
sordln = 1 million rows, 173MB, pkey = int4 + int2


pgsql-performance by date:

Previous
From: "Marinos J. Yannikos"
Date:
Subject: Re: GiST indexes and concurrency (tsearch2)
Next
From: PFC
Date:
Subject: Re: GiST indexes and concurrency (tsearch2)