Thread: Planner really hates nested loops

Planner really hates nested loops

From
David Brown
Date:
I'm hoping someone can shed some light on these results. The 'factor'
compares the ratios of cost to actual for different plans. Perhaps
nested loops should be given a discount in the planner? The estimates
seem to be out by one and a half orders of magnitude. :(

========== QUERY ==========

SELECT sum(L.Extended)
FROM sord H
JOIN sordln L USING (OrderNo)
[ WHERE H.OrderDate between '2003-01-01' and '2003-03-16' ]
[ WHERE H.OrderDate between '2003-01-01' and '2003-09-02' ]

========== SUMMARY ==========

Join      Cost    Cache    Factor    Disk    Factor
------------------------------------------------------------
  10% ROWS
Hash      40085    4.9s      1.0    12.8s      1.0
Merge     63338    4.1s      1.9    23.1s      0.9
Hash Idx  65386    5.5s      1.5    30.7s      0.7
Nest     257108    0.8s     39.3     2.7s     30.4
  33% ROWS
Hash      43646    5.8s      1.0    13.6s      1.0
Merge     67153    6.0s      1.5    30.7s      0.7
Hash Idx  68946    6.5s      1.4
Nest     868642    2.8s     41.2    10.2s     26.5
  ALL ROWS
Hash      53458    8.9s      1.0    14.3s      1.0
Merge     76156    9.4s      1.3    35.2s      0.6
Nest    2594934    9.2s     47.0    33.8s     20.5

========== 10% CACHE ROWS ==========

QUERY PLAN (Hash Join on <10% cache rows, indexed + sequential)
Aggregate  (cost=40085.14..40085.14 rows=1 width=8) (actual
time=4907.000..4907.000 rows=1 loops=1)
  ->  Hash Join  (cost=145.11..39814.32 rows=108324 width=8) (actual
time=3844.000..4735.000 rows=96183 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=0.000..2313.000 rows=1093398 loops=1)
        ->  Hash  (cost=138.48..138.48 rows=2655 width=4) (actual
time=16.000..16.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=0.000..0.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 4907.000 ms

QUERY PLAN (Merge Join on <10% cache rows, indexed only)
Aggregate  (cost=63338.43..63338.43 rows=1 width=8) (actual
time=4141.000..4141.000 rows=1 loops=1)
  ->  Merge Join  (cost=289.47..63067.62 rows=108324 width=8) (actual
time=3000.000..3896.000 rows=96183 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual time=0.000..2058.000
rows=737827 loops=1)
        ->  Sort  (cost=289.47..296.11 rows=2655 width=4) (actual
time=16.000..127.000 rows=96174 loops=1)
              Sort Key: h.orderno
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=0.000..0.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 4141.000 ms

QUERY PLAN (Hash Join on <10% cache rows, indexed only)
Aggregate  (cost=65385.95..65385.95 rows=1 width=8) (actual
time=5516.000..5516.000 rows=1 loops=1)
  ->  Hash Join  (cost=145.11..65115.13 rows=108324 width=8) (actual
time=3031.000..5376.000 rows=96183 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual time=0.000..3091.000
rows=1093398 loops=1)
        ->  Hash  (cost=138.48..138.48 rows=2655 width=4) (actual
time=0.000..0.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=0.000..0.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 5516.000 ms

QUERY PLAN (Nested Loop on <10% cache rows, indexed only)
Aggregate  (cost=257108.11..257108.11 rows=1 width=8) (actual
time=781.000..781.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..256837.30 rows=108324 width=8) (actual
time=0.000..610.000 rows=96183 loops=1)
        ->  Index Scan using sord_date on sord h  (cost=0.00..138.48
rows=2655 width=4) (actual time=0.000..0.000 rows=2646 loops=1)
              Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.000..0.118 rows=36 loops=2646)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 781.000 ms

========== 33% CACHE ROWS ==========

QUERY PLAN (Hash Join on >33% cache rows, indexed + sequential)
Aggregate  (cost=43645.62..43645.62 rows=1 width=8) (actual
time=5828.000..5828.000 rows=1 loops=1)
  ->  Hash Join  (cost=484.94..42730.67 rows=365976 width=8) (actual
time=2391.000..5078.000 rows=352856 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=0.000..2234.000 rows=1093398 loops=1)
        ->  Hash  (cost=462.52..462.52 rows=8970 width=4) (actual
time=47.000..47.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..462.52 rows=8970 width=4) (actual time=0.000..0.000
rows=8934 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-09-02'::date))
Total runtime: 5828.000 ms

QUERY PLAN (Merge Join on >33% cache rows, indexed only)
Aggregate  (cost=67153.04..67153.04 rows=1 width=8) (actual
time=5985.000..5985.000 rows=1 loops=1)
  ->  Merge Join  (cost=0.00..66238.09 rows=365976 width=8) (actual
time=2953.000..5281.000 rows=352856 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sord_pkey on sord h  (cost=0.00..1402.78
rows=8970 width=4) (actual time=31.000..46.000 rows=8934 loops=1)
              Filter: ((orderdate >= '2003-01-01'::date) AND (orderdate
<= '2003-09-02'::date))
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual time=0.000..2485.000
rows=994500 loops=1)
Total runtime: 5985.000 ms

QUERY PLAN (Hash Join on >33% cache rows, indexed only)
Aggregate  (cost=68946.43..68946.43 rows=1 width=8) (actual
time=6531.000..6531.000 rows=1 loops=1)
  ->  Hash Join  (cost=484.94..68031.48 rows=365976 width=8) (actual
time=3031.000..5765.000 rows=352856 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual time=0.000..3075.000
rows=1093398 loops=1)
        ->  Hash  (cost=462.52..462.52 rows=8970 width=4) (actual
time=46.000..46.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..462.52 rows=8970 width=4) (actual time=0.000..16.000
rows=8934 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-09-02'::date))
Total runtime: 6531.000 ms

QUERY PLAN (Nested Loop on >33% cache rows, indexed only)
Aggregate  (cost=868642.40..868642.40 rows=1 width=8) (actual
time=2828.000..2828.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..867727.46 rows=365976 width=8) (actual
time=0.000..2171.000 rows=352856 loops=1)
        ->  Index Scan using sord_date on sord h  (cost=0.00..462.52
rows=8970 width=4) (actual time=0.000..0.000 rows=8934 loops=1)
              Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-09-02'::date))
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.012..0.125 rows=39 loops=8934)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 2828.000 ms

========== ALL CACHE ROWS ==========

QUERY PLAN (Hash Join on all cache rows, sequential only)
Aggregate  (cost=53458.44..53458.44 rows=1 width=8) (actual
time=8906.000..8906.000 rows=1 loops=1)
  ->  Hash Join  (cost=1204.99..50724.94 rows=1093398 width=8) (actual
time=141.000..7089.000 rows=1093397 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=0.000..2629.000 rows=1093398 loops=1)
        ->  Hash  (cost=1137.99..1137.99 rows=26799 width=4) (actual
time=141.000..141.000 rows=0 loops=1)
              ->  Seq Scan on sord h  (cost=0.00..1137.99 rows=26799
width=4) (actual time=0.000..79.000 rows=26799 loops=1)
Total runtime: 8906.000 ms

QUERY PLAN (Merge Join on all cache rows, indexed only)
Aggregate  (cost=76156.45..76156.45 rows=1 width=8) (actual
time=9422.000..9422.000 rows=1 loops=1)
  ->  Merge Join  (cost=0.00..73422.95 rows=1093398 width=8) (actual
time=0.000..6835.000 rows=1093397 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sord_pkey on sord h  (cost=0.00..1268.79
rows=26799 width=4) (actual time=0.000..94.000 rows=26799 loops=1)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual time=0.000..2773.000
rows=1093398 loops=1)
Total runtime: 9422.000 ms

QUERY PLAN (Nested Loop on all cache rows, Sequential + indexed)
Aggregate  (cost=2594934.26..2594934.26 rows=1 width=8) (actual
time=9234.000..9234.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..2592200.76 rows=1093398 width=8) (actual
time=0.000..6966.000 rows=1093397 loops=1)
        ->  Seq Scan on sord h  (cost=0.00..1137.99 rows=26799 width=4)
(actual time=0.000..110.000 rows=26799 loops=1)
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.011..0.104 rows=41 loops=26799)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 9234.000 ms

========== 10% DISK ROWS ==========

QUERY PLAN (Hash Join on <10% disk rows, indexed + sequential)
Aggregate  (cost=40085.14..40085.14 rows=1 width=8) (actual
time=12813.000..12813.000 rows=1 loops=1)
  ->  Hash Join  (cost=145.11..39814.32 rows=108324 width=8) (actual
time=11188.000..12592.000 rows=96183 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=31.000..9985.000 rows=1093398 loops=1)
        ->  Hash  (cost=138.48..138.48 rows=2655 width=4) (actual
time=172.000..172.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=47.000..156.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 12813.000 ms

QUERY PLAN (Merge Join on <10% disk rows, indexed only)
Aggregate  (cost=63338.43..63338.43 rows=1 width=8) (actual
time=23078.000..23078.000 rows=1 loops=1)
  ->  Merge Join  (cost=289.47..63067.62 rows=108324 width=8) (actual
time=20375.000..22874.000 rows=96183 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual
time=63.000..20657.000 rows=737827 loops=1)
        ->  Sort  (cost=289.47..296.11 rows=2655 width=4) (actual
time=171.000..297.000 rows=96174 loops=1)
              Sort Key: h.orderno
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=31.000..171.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 23078.000 ms

QUERY PLAN (Hash Join on <10% disk rows, indexed only)
Aggregate  (cost=65385.95..65385.95 rows=1 width=8) (actual
time=30734.000..30734.000 rows=1 loops=1)
  ->  Hash Join  (cost=145.11..65115.13 rows=108324 width=8) (actual
time=19546.000..30593.000 rows=96183 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual
time=47.000..27711.000 rows=1093398 loops=1)
        ->  Hash  (cost=138.48..138.48 rows=2655 width=4) (actual
time=187.000..187.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..138.48 rows=2655 width=4) (actual time=46.000..171.000
rows=2646 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
Total runtime: 30734.000 ms

QUERY PLAN (Nested Loop on <10% disk rows, indexed only)
Aggregate  (cost=257108.11..257108.11 rows=1 width=8) (actual
time=2704.000..2704.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..256837.30 rows=108324 width=8) (actual
time=94.000..2529.000 rows=96183 loops=1)
        ->  Index Scan using sord_date on sord h  (cost=0.00..138.48
rows=2655 width=4) (actual time=32.000..93.000 rows=2646 loops=1)
              Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-03-16'::date))
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.041..0.814 rows=36 loops=2646)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 2704.000 ms

========== 33% DISK ROWS ==========

QUERY PLAN (Hash Join on >33% disk rows, indexed + sequential)
Aggregate  (cost=43645.62..43645.62 rows=1 width=8) (actual
time=13562.000..13562.000 rows=1 loops=1)
  ->  Hash Join  (cost=484.94..42730.67 rows=365976 width=8) (actual
time=8687.000..12985.000 rows=352856 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=31.000..10106.000 rows=1093398 loops=1)
        ->  Hash  (cost=462.52..462.52 rows=8970 width=4) (actual
time=375.000..375.000 rows=0 loops=1)
              ->  Index Scan using sord_date on sord h
(cost=0.00..462.52 rows=8970 width=4) (actual time=47.000..375.000
rows=8934 loops=1)
                    Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-09-02'::date))
Total runtime: 13562.000 ms

QUERY PLAN (Merge Join on >33% disk rows, indexed only)
Aggregate  (cost=67153.04..67153.04 rows=1 width=8) (actual
time=30672.000..30672.000 rows=1 loops=1)
  ->  Merge Join  (cost=0.00..66238.09 rows=365976 width=8) (actual
time=20297.000..29823.000 rows=352856 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sord_pkey on sord h  (cost=0.00..1402.78
rows=8970 width=4) (actual time=578.000..670.000 rows=8934 loops=1)
              Filter: ((orderdate >= '2003-01-01'::date) AND (orderdate
<= '2003-09-02'::date))
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual
time=47.000..26509.000 rows=994500 loops=1)
Total runtime: 30672.000 ms

QUERY PLAN (Nested Loop on >33% disk rows, indexed only)
Aggregate  (cost=868642.40..868642.40 rows=1 width=8) (actual
time=10235.000..10235.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..867727.46 rows=365976 width=8) (actual
time=78.000..9496.000 rows=352856 loops=1)
        ->  Index Scan using sord_date on sord h  (cost=0.00..462.52
rows=8970 width=4) (actual time=32.000..126.000 rows=8934 loops=1)
              Index Cond: ((orderdate >= '2003-01-01'::date) AND
(orderdate <= '2003-09-02'::date))
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.035..0.912 rows=39 loops=8934)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 10235.000 ms

========== ALL DISK ROWS ==========

QUERY PLAN (Hash Join on all disk rows, sequential only)
Aggregate  (cost=53458.44..53458.44 rows=1 width=8) (actual
time=14281.000..14281.000 rows=1 loops=1)
  ->  Hash Join  (cost=1204.99..50724.94 rows=1093398 width=8) (actual
time=719.000..12096.000 rows=1093397 loops=1)
        Hash Cond: ("outer".orderno = "inner".orderno)
        ->  Seq Scan on sordln l  (cost=0.00..33118.98 rows=1093398
width=12) (actual time=16.000..7389.000 rows=1093398 loops=1)
        ->  Hash  (cost=1137.99..1137.99 rows=26799 width=4) (actual
time=703.000..703.000 rows=0 loops=1)
              ->  Seq Scan on sord h  (cost=0.00..1137.99 rows=26799
width=4) (actual time=0.000..657.000 rows=26799 loops=1)
Total runtime: 14281.000 ms

QUERY PLAN (Merge Join on all disk rows, indexed only)
Aggregate  (cost=76156.45..76156.45 rows=1 width=8) (actual
time=35235.000..35235.000 rows=1 loops=1)
  ->  Merge Join  (cost=0.00..73422.95 rows=1093398 width=8) (actual
time=94.000..33050.000 rows=1093397 loops=1)
        Merge Cond: ("outer".orderno = "inner".orderno)
        ->  Index Scan using sord_pkey on sord h  (cost=0.00..1268.79
rows=26799 width=4) (actual time=47.000..141.000 rows=26799 loops=1)
        ->  Index Scan using sordln_pkey on sordln l
(cost=0.00..58419.79 rows=1093398 width=12) (actual
time=47.000..28250.000 rows=1093398 loops=1)
Total runtime: 35235.000 ms

QUERY PLAN (Nested Loop on all disk rows, indexed + sequential)
Aggregate  (cost=2594934.26..2594934.26 rows=1 width=8) (actual
time=33797.000..33797.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..2592200.76 rows=1093398 width=8) (actual
time=63.000..31744.000 rows=1093397 loops=1)
        ->  Seq Scan on sord h  (cost=0.00..1137.99 rows=26799 width=4)
(actual time=16.000..79.000 rows=26799 loops=1)
        ->  Index Scan using sordln_pkey on sordln l  (cost=0.00..96.01
rows=54 width=12) (actual time=0.039..1.041 rows=41 loops=26799)
              Index Cond: ("outer".orderno = l.orderno)
Total runtime: 33797.000 ms

========== ENVIRONMENT ==========

Athlon XP2500, 768MB, 80GB ATA HDD
PostgreSQL 8.0rc2 on Win2k
shared_buffers = 1000
work_mem = 32768
random_page_cost = 2
sord = 27000 rows, 7MB, pkey = int4, Stats = 100 on OrderDate
sordln = 1 million rows, 173MB, pkey = int4 + int2


Re: Planner really hates nested loops

From
Tom Lane
Date:
David Brown <time@bigpond.net.au> writes:
> I'm hoping someone can shed some light on these results.

Not without a lot more detail on how you *got* the results.  What
exactly did you do to force the various plan choices?  (I see some
ridiculous choices of indexscans, for instance, suggesting improper use
of enable_seqscan in some cases.)  And what's the "cache rows" and "disk
rows" stuff, and how do you know that what you were measuring is
actually what you think it is?  I have zero confidence in
Windows-atop-ATA as a platform for measuring disk-related behaviors,
because I don't think you can control or even know what caching is
going on.

            regards, tom lane

Re: Planner really hates nested loops

From
"Magnus Hagander"
Date:
> > I'm hoping someone can shed some light on these results.
>
> Not without a lot more detail on how you *got* the results.
> What exactly did you do to force the various plan choices?
> (I see some ridiculous choices of indexscans, for instance,
> suggesting improper use of enable_seqscan in some cases.)
> And what's the "cache rows" and "disk rows" stuff, and how do
> you know that what you were measuring is actually what you
> think it is?  I have zero confidence in Windows-atop-ATA as a
> platform for measuring disk-related behaviors, because I
> don't think you can control or even know what caching is going on.

You can control the writeback-cache from Device Manager->(the
disk)->Policies. And if that is turned off, fsync definitly should write
through, just as on *nix. (write-cache is on by default, no surprise)

AFAIK, you can't control what is cached for reading.

//Magnus

Re: Planner really hates nested loops

From
"Merlin Moncure"
Date:
Magnus wrote:
> > > I'm hoping someone can shed some light on these results.
> >
> > Not without a lot more detail on how you *got* the results.
> > What exactly did you do to force the various plan choices?
> > (I see some ridiculous choices of indexscans, for instance,
> > suggesting improper use of enable_seqscan in some cases.)
> > And what's the "cache rows" and "disk rows" stuff, and how do
> > you know that what you were measuring is actually what you
> > think it is?  I have zero confidence in Windows-atop-ATA as a
> > platform for measuring disk-related behaviors, because I
> > don't think you can control or even know what caching is going on.
>
> You can control the writeback-cache from Device Manager->(the
> disk)->Policies. And if that is turned off, fsync definitly should
write
> through, just as on *nix. (write-cache is on by default, no surprise)

There is some truth to what Tom is saying, we just can't seem to get our
development server to *quit* syncing with fsync=on, even though we have
the Promise raid controller (yeah, I know) configured to cache writes.

IOW, with certain configurations I just can't seem to delegate sync
responsibility to the raid controller.  It is a matter of record that
certain crappy drives lie about caching but, IMO this is more of a
driver issue than a O/S issue.  (aside: I have become quite a believer
in Western Digital parts, lately!)

Merlin

Re: Planner really hates nested loops

From
David Brown
Date:
Tom Lane wrote:

>What exactly did you do to force the various plan choices?  (I see some
>ridiculous choices of indexscans, for instance, suggesting improper use
>of enable_seqscan in some cases.)
>
Except for forcing a hash with indexes (to show that increased use of
indexes is not necessarily good), the "ridiculous choices of indexscans"
are straight from the planner, i.e. I did not use enable_seqscan.
Obviously, the alternative join methods were obtained by disabling hash
joins and merge joins.

>  And what's the "cache rows" and "disk
>rows" stuff, and how do you know that what you were measuring is
>actually what you think it is?  I have zero confidence in
>Windows-atop-ATA as a platform for measuring disk-related behaviors,
>because I don't think you can control or even know what caching is
>going on.
>
>
The terms are just abbreviated headings to make it easier to check what
you're looking at. "Cache" refers to repeated runs without disk I/O.
"Disk" refers to a completely initialized system with no PostgreSQL data
in the OS cache (i.e. after a reboot - this is Benchmarking 101). All
results were verified with *at least* two runs at different times. This
is not to say the "disk" results are an accurate or absolute benchmark,
but they're useful as a reference when looking at the cached results.

In any case, I can get the same "cached" results by increasing buffers
to take up most of the memory and thereby make them the defacto cache.

With respect, could we please focus on the point of this thread? I've
spent a great deal of time experimenting with PostgreSQL over the last
couple of months, including reading every known web page regarding
tuning and following every post in this list in that period. I'm
confident that my results here are what most people will experience when
trying PostgreSQL, and I'd like to help in a constructive way.

Re: Planner really hates nested loops

From
Bruce Momjian
Date:
Magnus Hagander wrote:
> > > I'm hoping someone can shed some light on these results.
> >
> > Not without a lot more detail on how you *got* the results.
> > What exactly did you do to force the various plan choices?
> > (I see some ridiculous choices of indexscans, for instance,
> > suggesting improper use of enable_seqscan in some cases.)
> > And what's the "cache rows" and "disk rows" stuff, and how do
> > you know that what you were measuring is actually what you
> > think it is?  I have zero confidence in Windows-atop-ATA as a
> > platform for measuring disk-related behaviors, because I
> > don't think you can control or even know what caching is going on.
>
> You can control the writeback-cache from Device Manager->(the
> disk)->Policies. And if that is turned off, fsync definitly should write
> through, just as on *nix. (write-cache is on by default, no surprise)
>
> AFAIK, you can't control what is cached for reading.

Are you saying that fsync() doesn't write to the platters by default on
Win32?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073