Thread: Bad plan choice nestloop vs. hashjoin

From
Kenneth Marshall
Date:
Dear performance group:

We have just upgraded our monitoring server software and
now the following query for graphing the data performs
abysmally with the default settings. Here are the results
of the EXPLAIN ANALYZE run with nestloops enabled:

SET enable_nestloop = 'on';
EXPLAIN ANALYZE SELECT g.graphid FROM graphs g,graphs_items gi,items i,hosts_groups hg,rights r,users_groups ug WHERE
(g.graphid/100000000000000)in (0)  AND gi.graphid=g.graphid AND i.itemid=gi.itemid AND hg.hostid=i.hostid AND
r.id=hg.groupid AND r.groupid=ug.usrgrpid AND ug.userid=20 AND r.permission>=2 AND NOT EXISTS(  SELECT gii.graphid
FROM graphs_items gii, items ii  WHERE gii.graphid=g.graphid  AND gii.itemid=ii.itemid  AND EXISTS(  SELECT hgg.groupid
FROM hosts_groups hgg, rights rr, users_groups ugg  WHERE ii.hostid=hgg.hostid  AND rr.id=hgg.groupid  AND
rr.groupid=ugg.usrgrpid AND ugg.userid=20 AND rr.permission<2)) AND  (g.graphid IN
(2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010));
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=94.07..10304.00 rows=1 width=8) (actual time=194.557..27975.338 rows=607 loops=1)
   Join Filter: (r.groupid = ug.usrgrpid)
   ->  Seq Scan on users_groups ug  (cost=0.00..1.15 rows=1 width=8) (actual time=0.020..0.026 rows=1 loops=1)
         Filter: (userid = 20)
   ->  Nested Loop  (cost=94.07..10302.65 rows=16 width=16) (actual time=98.126..27965.748 rows=5728 loops=1)
         ->  Nested Loop  (cost=94.07..10301.76 rows=2 width=16) (actual time=98.085..27933.529 rows=928 loops=1)
               ->  Nested Loop  (cost=94.07..10301.20 rows=2 width=16) (actual time=98.074..27924.076 rows=837 loops=1)
                      ->  Nested Loop  (cost=94.07..10299.07 rows=2 width=16) (actual time=98.063..27914.106 rows=837 loops=1)
                            ->  Nested Loop Anti Join  (cost=94.07..10294.64 rows=1 width=8) (actual time=98.049..27907.702 rows=281 loops=1)
                                 Join Filter: (gii.graphid = g.graphid)
                                  ->  Bitmap Heap Scan on graphs g  (cost=94.07..233.17 rows=1 width=8) (actual time=0.529..1.772 rows=281 loops=1)
                                       Recheck Cond: (graphid = ANY
('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[]))
                                       Filter: ((graphid / 100000000000000::bigint) = 0)
                                        ->  Bitmap Index Scan on graphs_pkey  (cost=0.00..94.07 rows=246 width=0) (actual time=0.507..0.507 rows=294 loops=1)
                                             Index Cond: (graphid = ANY
('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[]))
                                  ->  Nested Loop  (cost=0.00..17449.43 rows=1954 width=8) (actual time=99.304..99.304 rows=0 loops=281)
                                        ->  Index Scan using graphs_items_2 on graphs_items gii  (cost=0.00..69.83 rows=1954 width=16) (actual time=0.013..3.399 rows=1954 loops=281)
                                        ->  Index Scan using items_pkey on items ii  (cost=0.00..8.88 rows=1 width=8) (actual time=0.046..0.046 rows=0 loops=549074)
                                             Index Cond: (ii.itemid = gii.itemid)
                                             Filter: (alternatives: SubPlan 1 or hashed SubPlan 2)
                                             SubPlan 1
                                                ->  Nested Loop  (cost=0.00..7.83 rows=1 width=0) (actual time=0.040..0.040 rows=0 loops=549074)
                                                      Join Filter: (rr.groupid = ugg.usrgrpid)
                                                      ->  Nested Loop  (cost=0.00..6.67 rows=1 width=8) (actual time=0.037..0.037 rows=0 loops=549074)
                                                            Join Filter: (hgg.groupid = rr.id)
                                                            ->  Index Scan using hosts_groups_1 on hosts_groups hgg  (cost=0.00..4.27 rows=1 width=8) (actual time=0.003..0.005 rows=1 loops=549074)
                                                                  Index Cond: ($0 = hostid)
                                                            ->  Seq Scan on rights rr  (cost=0.00..2.39 rows=1 width=16) (actual time=0.027..0.027 rows=0 loops=532214)
                                                                  Filter: (rr.permission < 2)
                                                      ->  Seq Scan on users_groups ugg  (cost=0.00..1.15 rows=1 width=8) (never executed)
                                                           Filter: (ugg.userid = 20)
                                             SubPlan 2
                                               ->  Nested Loop  (cost=2.34..8.13 rows=1 width=8) (never executed)
                                                     ->  Nested Loop  (cost=0.00..3.55 rows=1 width=8) (never executed)
                                                           Join Filter: (rr.groupid = ugg.usrgrpid)
                                                            ->  Seq Scan on rights rr  (cost=0.00..2.39 rows=1 width=16) (never executed)
                                                                  Filter: (permission < 2)
                                                            ->  Seq Scan on users_groups ugg  (cost=0.00..1.15 rows=1 width=8) (never executed)
                                                                  Filter: (ugg.userid = 20)
                                                      ->  Bitmap Heap Scan on hosts_groups hgg  (cost=2.34..4.45 rows=11 width=16) (never executed)
                                                            Recheck Cond: (hgg.groupid = rr.id)
                                                            ->  Bitmap Index Scan on hosts_groups_2  (cost=0.00..2.33 rows=11 width=0) (never executed)
                                                                 Index Cond: (hgg.groupid = rr.id)
                            ->  Index Scan using graphs_items_2 on graphs_items gi  (cost=0.00..4.41 rows=2 width=16) (actual time=0.005..0.010 rows=3 loops=281)
                                  Index Cond: (gi.graphid = g.graphid)
                      ->  Index Scan using items_pkey on items i  (cost=0.00..1.05 rows=1 width=16) (actual time=0.004..0.006 rows=1 loops=837)
                            Index Cond: (i.itemid = gi.itemid)
                ->  Index Scan using hosts_groups_1 on hosts_groups hg  (cost=0.00..0.27 rows=1 width=16) (actual time=0.003..0.005 rows=1 loops=837)
                      Index Cond: (hg.hostid = i.hostid)
          ->  Index Scan using rights_2 on rights r  (cost=0.00..0.38 rows=5 width=16) (actual time=0.004..0.015 rows=6 loops=928)
               Index Cond: (r.id = hg.groupid)
               Filter: (r.permission >= 2)
 Total runtime: 27976.516 ms
(53 rows)

And here is the plan for the same query with nestloops disabled:

SET enable_nestloop = 'off';
EXPLAIN ANALYZE SELECT g.graphid FROM graphs g,graphs_items gi,items i,hosts_groups hg,rights r,users_groups ug WHERE
(g.graphid/100000000000000)in (0)  AND gi.graphid=g.graphid AND i.itemid=gi.itemid AND hg.hostid=i.hostid AND
r.id=hg.groupid AND r.groupid=ug.usrgrpid AND ug.userid=20 AND r.permission>=2 AND NOT EXISTS(  SELECT gii.graphid
FROM graphs_items gii, items ii  WHERE gii.graphid=g.graphid  AND gii.itemid=ii.itemid  AND EXISTS(  SELECT hgg.groupid
FROM hosts_groups hgg, rights rr, users_groups ugg  WHERE ii.hostid=hgg.hostid  AND rr.id=hgg.groupid  AND
rr.groupid=ugg.usrgrpid AND ugg.userid=20 AND rr.permission<2)) AND  (g.graphid IN
(2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010));
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=106463.65..106466.57 rows=1 width=8) (actual time=67.513..68.637 rows=607 loops=1)
   Hash Cond: (r.id = hg.groupid)
   ->  Hash Join  (cost=1.16..4.05 rows=8 width=8) (actual time=0.255..0.416 rows=15 loops=1)
         Hash Cond: (r.groupid = ug.usrgrpid)
         ->  Seq Scan on rights r  (cost=0.00..2.39 rows=111 width=16) (actual time=0.015..0.184 rows=111 loops=1)
               Filter: (permission >= 2)
         ->  Hash  (cost=1.15..1.15 rows=1 width=8) (actual time=0.022..0.022 rows=1 loops=1)
                ->  Seq Scan on users_groups ug  (cost=0.00..1.15 rows=1 width=8) (actual time=0.010..0.015 rows=1 loops=1)
                     Filter: (userid = 20)
   ->  Hash  (cost=106462.46..106462.46 rows=2 width=16) (actual time=67.225..67.225 rows=928 loops=1)
         ->  Hash Join  (cost=106457.61..106462.46 rows=2 width=16) (actual time=63.608..65.720 rows=928 loops=1)
               Hash Cond: (hg.hostid = i.hostid)
                ->  Seq Scan on hosts_groups hg  (cost=0.00..4.06 rows=206 width=16) (actual time=0.008..0.305 rows=206 loops=1)
               ->  Hash  (cost=106457.58..106457.58 rows=2 width=16) (actual time=63.565..63.565 rows=837 loops=1)
                      ->  Hash Anti Join  (cost=105576.85..106457.58 rows=2 width=16) (actual time=19.955..62.166 rows=837 loops=1)
                           Hash Cond: (g.graphid = gii.graphid)
                            ->  Hash Join  (cost=282.09..1162.79 rows=2 width=16) (actual time=13.021..52.860 rows=837 loops=1)
                                 Hash Cond: (i.itemid = gi.itemid)
                                  ->  Seq Scan on items i  (cost=0.00..831.13 rows=13213 width=16) (actual time=0.004..22.955 rows=13213 loops=1)
                                  ->  Hash  (cost=282.07..282.07 rows=2 width=16) (actual time=9.890..9.890 rows=837 loops=1)
                                        ->  Hash Join  (cost=233.18..282.07 rows=2 width=16) (actual time=1.514..8.514 rows=837 loops=1)
                                              Hash Cond: (gi.graphid = g.graphid)
                                              ->  Seq Scan on graphs_items gi  (cost=0.00..41.54 rows=1954 width=16) (actual time=0.005..2.713 rows=1954 loops=1)
                                              ->  Hash  (cost=233.17..233.17 rows=1 width=8) (actual time=1.489..1.489 rows=281 loops=1)
                                                    ->  Bitmap Heap Scan on graphs g  (cost=94.07..233.17 rows=1 width=8) (actual time=0.526..1.056 rows=281 loops=1)
                                                         Recheck Cond: (graphid = ANY
('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[]))
                                                         Filter: ((graphid / 100000000000000::bigint) = 0)
                                                          ->  Bitmap Index Scan on graphs_pkey  (cost=0.00..94.07 rows=246 width=0) (actual time=0.504..0.504 rows=294 loops=1)
                                                               Index Cond: (graphid = ANY
('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[]))
                            ->  Hash  (cost=105270.33..105270.33 rows=1954 width=8) (actual time=6.914..6.914 rows=0 loops=1)
                                  ->  Hash Join  (cost=65.97..105270.33 rows=1954 width=8) (actual time=6.912..6.912 rows=0 loops=1)
                                        Hash Cond: (ii.itemid = gii.itemid)
                                        ->  Index Scan using items_pkey on items ii  (cost=0.00..105110.51 rows=6606 width=8) (actual time=6.907..6.907 rows=0 loops=1)
                                             Filter: (alternatives: SubPlan 1 or hashed SubPlan 2)
                                             SubPlan 1
                                               ->  Hash Join  (cost=3.56..7.86 rows=1 width=0) (never executed)
                                                     Hash Cond: (rr.groupid = ugg.usrgrpid)
                                                     ->  Hash Join  (cost=2.40..6.68 rows=1 width=8) (never executed)
                                                           Hash Cond: (hgg.groupid = rr.id)
                                                            ->  Index Scan using hosts_groups_1 on hosts_groups hgg  (cost=0.00..4.27 rows=1 width=8) (never executed)
                                                                  Index Cond: ($0 = hostid)
                                                            ->  Hash  (cost=2.39..2.39 rows=1 width=16) (never executed)
                                                                  ->  Seq Scan on rights rr  (cost=0.00..2.39 rows=1 width=16) (never executed)
                                                                        Filter: (permission < 2)
                                                      ->  Hash  (cost=1.15..1.15 rows=1 width=8) (never executed)
                                                            ->  Seq Scan on users_groups ugg  (cost=0.00..1.15 rows=1 width=8) (never executed)
                                                                  Filter: (userid = 20)
                                              SubPlan 2
                                                ->  Hash Join  (cost=3.58..8.56 rows=1 width=8) (actual time=0.062..0.062 rows=0 loops=1)
                                                      Hash Cond: (hgg.groupid = rr.id)
                                                      ->  Seq Scan on hosts_groups hgg  (cost=0.00..4.06 rows=206 width=16) (actual time=0.006..0.006 rows=1 loops=1)
                                                      ->  Hash  (cost=3.56..3.56 rows=1 width=8) (actual time=0.040..0.040 rows=0 loops=1)
                                                            ->  Hash Join  (cost=1.16..3.56 rows=1 width=8) (actual time=0.037..0.037 rows=0 loops=1)
                                                                  Hash Cond: (rr.groupid = ugg.usrgrpid)
                                                                  ->  Seq Scan on rights rr  (cost=0.00..2.39 rows=1 width=16) (actual time=0.035..0.035 rows=0 loops=1)
                                                                        Filter: (permission < 2)
                                                                  ->  Hash  (cost=1.15..1.15 rows=1 width=8) (never executed)
                                                                        ->  Seq Scan on users_groups ugg  (cost=0.00..1.15 rows=1 width=8) (never executed)
                                                                              Filter: (userid = 20)
                                        ->  Hash  (cost=41.54..41.54 rows=1954 width=16) (never executed)
                                              ->  Seq Scan on graphs_items gii  (cost=0.00..41.54 rows=1954 width=16) (never executed)
 Total runtime: 69.630 ms
(62 rows)


It looks like the planner thinks that only 1 row will reach
the nestloop, when in fact there are 607 rows. Does anyone have
any ideas for how to coerce PostgreSQL into using the hashjoin
plan instead? I am running 8.4.2 and have done a database-wide
vacuum/analyze.
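
One stopgap that seems to work is scoping the override to a single
transaction rather than flipping enable_nestloop server-wide; a
sketch, with the long query body elided:

```sql
-- Sketch: disable nestloops for just this one query.
-- SET LOCAL reverts automatically when the transaction ends.
BEGIN;
SET LOCAL enable_nestloop = off;
SELECT g.graphid FROM graphs g, graphs_items gi, items i,
       hosts_groups hg, rights r, users_groups ug
WHERE ...;  -- same predicates as the query above
COMMIT;
```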

Regards,
Ken

Re: Bad plan choice nestloop vs. hashjoin

From
"Kevin Grittner"
Date:
Kenneth Marshall <ktm@rice.edu> wrote:

> with the default settings

Do you mean you haven't changed any settings in your postgresql.conf
file from their defaults?

-Kevin

Re: Bad plan choice nestloop vs. hashjoin

From
Kenneth Marshall
Date:
On Fri, Jan 15, 2010 at 04:58:57PM -0600, Kevin Grittner wrote:
> Kenneth Marshall <ktm@rice.edu> wrote:
>
> > with the default settings
>
> Do you mean you haven't changed any settings in your postgresql.conf
> file from their defaults?
>
> -Kevin
>
Sorry, here are the differences from the default:

max_connections = 100                   # (change requires restart)
shared_buffers = 256MB                  # min 128kB or max_connections*16kB
work_mem = 16MB                         # min 64kB
maintenance_work_mem = 512MB            # min 1MB
synchronous_commit = off                # immediate fsync at commit
wal_buffers = 256kB                     # min 32kB
checkpoint_segments = 30                # in logfile segments, min 1, 16MB each
seq_page_cost = 1.0                     # measured on an arbitrary scale
random_page_cost = 2.0                  # same scale as above
effective_cache_size = 12GB
log_min_duration_statement = 5000

The machine has 16GB of RAM and the DB is currently about 8GB. It
is going to grow much larger as information is acquired.

Cheers,
Ken

Re: Bad plan choice nestloop vs. hashjoin

From
Tom Lane
Date:
Kenneth Marshall <ktm@rice.edu> writes:
> We have just upgraded our monitoring server software and
> now the following query for graphing the data performs
> abysmally with the default settings. Here are the results
> of the EXPLAIN ANALYZE run with nestloops enabled:

That plan seems a bit wacko --- I don't see a reason for it to be using
an indexscan on items ii.  Can you extract a self-contained test case?

            regards, tom lane

Re: Bad plan choice nestloop vs. hashjoin

From
Tom Lane
Date:
Kenneth Marshall <ktm@rice.edu> writes:
> We have just upgraded our monitoring server software and
> now the following query for graphing the data performs
> abysmally with the default settings. Here are the results
> of the EXPLAIN ANALYZE run with nestloops enabled:

I poked at this a bit more and now think I see where the problem is.
The thing that would be easiest for you to do something about is
the misestimation here:

>                            ->  Nested Loop Anti Join  (cost=94.07..10294.64 rows=1 width=8) (actual time=98.049..27907.702 rows=281 loops=1)
>                                  Join Filter: (gii.graphid = g.graphid)
>                                  ->  Bitmap Heap Scan on graphs g  (cost=94.07..233.17 rows=1 width=8) (actual time=0.529..1.772 rows=281 loops=1)
>                                        Recheck Cond: (graphid = ANY
('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581!

,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[]))
>                                        Filter: ((graphid / 100000000000000::bigint) = 0)
>                                        ->  Bitmap Index Scan on graphs_pkey  (cost=0.00..94.07 rows=246 width=0) (actual time=0.507..0.507 rows=294 loops=1)
>                                              Index Cond: (graphid = ANY
('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835!

,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[]))

The estimate of the ANY condition is not too bad (246 vs 294 actual).  But it
hasn't got any ability to deal with the "(graphid / 100000000000000::bigint) = 0"
filter condition, and is falling back to a default selectivity estimate for
that, which IIRC is just 0.005 --- but actually, that condition doesn't
eliminate any rows at all.  Do you need that condition in the first
place?  Can you persuade your client-side software to eliminate it when
it's impossible based on the ANY list?  Or at least recast it to
something more easily estimatable, like "graphid < 100000000000000"?
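
For example, for nonnegative graphid values the division filter is
equivalent to a plain range test, which the planner can estimate from
the column histogram; a sketch of the rewrite:

```sql
-- Hard to estimate; falls back to a default selectivity:
SELECT g.graphid FROM graphs g
WHERE (g.graphid / 100000000000000) IN (0);

-- Equivalent for nonnegative graphids, and estimated from column stats:
SELECT g.graphid FROM graphs g
WHERE g.graphid < 100000000000000;
```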

If you really have to have the condition just like that, I'd advise
creating an index on "(graphid / 100000000000000::bigint)".  That would
cause ANALYZE to accumulate statistics on that expression, which'd
result in a far better estimate.
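
A sketch of that approach (the index name is illustrative):

```sql
-- Expression index; a subsequent ANALYZE gathers statistics on the
-- distribution of the expression's values
CREATE INDEX graphs_graphid_div_idx
    ON graphs ((graphid / 100000000000000::bigint));
ANALYZE graphs;
```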

The reason that this misestimate hammers it so hard is that the
inside of the nestloop looks like

>                                  ->  Nested Loop  (cost=0.00..17449.43 rows=1954 width=8) (actual time=99.304..99.304 rows=0 loops=281)
>                                        ->  Index Scan using graphs_items_2 on graphs_items gii  (cost=0.00..69.83 rows=1954 width=16) (actual time=0.013..3.399 rows=1954 loops=281)
>                                        ->  Index Scan using items_pkey on items ii  (cost=0.00..8.88 rows=1 width=8) (actual time=0.046..0.046 rows=0 loops=549074)
>                                              Index Cond: (ii.itemid = gii.itemid)
>                                              Filter: (alternatives: SubPlan 1 or hashed SubPlan 2)
>                                              SubPlan 1
>                                                ->  Nested Loop  (cost=0.00..7.83 rows=1 width=0) (actual time=0.040..0.040 rows=0 loops=549074)
>                                                      Join Filter: (rr.groupid = ugg.usrgrpid)
>                                                      ->  Nested Loop  (cost=0.00..6.67 rows=1 width=8) (actual time=0.037..0.037 rows=0 loops=549074)
>                                                            Join Filter: (hgg.groupid = rr.id)
>                                                            ->  Index Scan using hosts_groups_1 on hosts_groups hgg  (cost=0.00..4.27 rows=1 width=8) (actual time=0.003..0.005 rows=1 loops=549074)
>                                                                  Index Cond: ($0 = hostid)
>                                                            ->  Seq Scan on rights rr  (cost=0.00..2.39 rows=1 width=16) (actual time=0.027..0.027 rows=0 loops=532214)
>                                                                  Filter: (rr.permission < 2)
>                                                      ->  Seq Scan on users_groups ugg  (cost=0.00..1.15 rows=1 width=8) (never executed)
>                                                            Filter: (ugg.userid = 20)
>                                              SubPlan 2
>                                                ->  Nested Loop  (cost=2.34..8.13 rows=1 width=8) (never executed)
>                                                      ->  Nested Loop  (cost=0.00..3.55 rows=1 width=8) (never
executed)
>                                                            Join Filter: (rr.groupid = ugg.usrgrpid)
>                                                            ->  Seq Scan on rights rr  (cost=0.00..2.39 rows=1
width=16)(never executed) 
>                                                                  Filter: (permission < 2)
>                                                            ->  Seq Scan on users_groups ugg  (cost=0.00..1.15 rows=1
width=8)(never executed) 
>                                                                  Filter: (ugg.userid = 20)
>                                                      ->  Bitmap Heap Scan on hosts_groups hgg  (cost=2.34..4.45
rows=11width=16) (never executed) 
>                                                            Recheck Cond: (hgg.groupid = rr.id)
>                                                            ->  Bitmap Index Scan on hosts_groups_2  (cost=0.00..2.33
rows=11width=0) (never executed) 
>                                                                  Index Cond: (hgg.groupid = rr.id)

The alternative subplans are variant implementations of the inner EXISTS
test.  Unfortunately, it's choosing to go with the "retail" lookup, not
knowing that that's going to wind up being executed 549074 times.
If it had gone to the "wholesale" hashtable implementation, this would
probably have run a whole lot faster.  (We have it in mind to allow the
executor to switch to the other implementation on-the-fly when it
becomes clear that the planner misjudged the rowcount, but it's not done
yet.)

However, I'm curious about your statement that this used to perform
better.  Used to perform better on what, and what was the plan back
then?  I don't believe that pre-8.4 PG would have done better than
8.4 on either of these points.  It certainly wouldn't have won on the
EXISTS subplan, because before 8.4 that would always have been done in
the "retail" style.

            regards, tom lane

Re: Bad plan choice nestloop vs. hashjoin

From
Kenneth Marshall
Date:
On Mon, Jan 18, 2010 at 12:13:24PM -0500, Tom Lane wrote:
> Kenneth Marshall <ktm@rice.edu> writes:
> > We have just upgraded our monitoring server software and
> > now the following query for graphing the data performs
> > abysmally with the default settings. Here are the results
> > of the EXPLAIN ANALYZE run with nestloops enabled:
>
> I poked at this a bit more and now think I see where the problem is.
> The thing that would be easiest for you to do something about is
> the misestimation here:
>
> >                            ->  Nested Loop Anti Join  (cost=94.07..10294.64 rows=1 width=8) (actual time=98.049..27907.702 rows=281 loops=1)
> >                                  Join Filter: (gii.graphid = g.graphid)
> >                                  ->  Bitmap Heap Scan on graphs g  (cost=94.07..233.17 rows=1 width=8) (actual time=0.529..1.772 rows=281 loops=1)
> >                                        Recheck Cond: (graphid = ANY ('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[]))
> >                                        Filter: ((graphid / 100000000000000::bigint) = 0)
> >                                        ->  Bitmap Index Scan on graphs_pkey  (cost=0.00..94.07 rows=246 width=0) (actual time=0.507..0.507 rows=294 loops=1)
> >                                              Index Cond: (graphid = ANY ('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[]))
>
> The estimate of the ANY condition is not too bad (246 vs 294 actual).  But it
> hasn't got any ability to deal with the "(graphid / 100000000000000::bigint) = 0"
> filter condition, and is falling back to a default selectivity estimate for
> that, which IIRC is just 0.005 --- but actually, that condition doesn't
> eliminate any rows at all.  Do you need that condition in the first
> place?  Can you persuade your client-side software to eliminate it when
> it's impossible based on the ANY list?  Or at least recast it to
> something more easily estimatable, like "graphid < 100000000000000"?
>
> If you really have to have the condition just like that, I'd advise
> creating an index on "(graphid / 100000000000000::bigint)".  That would
> cause ANALYZE to accumulate statistics on that expression, which'd
> result in a far better estimate.
>
> The reason that this misestimate hammers it so hard is that the
> inside of the nestloop looks like
>
> >                                  ->  Nested Loop  (cost=0.00..17449.43 rows=1954 width=8) (actual time=99.304..99.304 rows=0 loops=281)
> >                                        ->  Index Scan using graphs_items_2 on graphs_items gii  (cost=0.00..69.83 rows=1954 width=16) (actual time=0.013..3.399 rows=1954 loops=281)
> >                                        ->  Index Scan using items_pkey on items ii  (cost=0.00..8.88 rows=1 width=8) (actual time=0.046..0.046 rows=0 loops=549074)
> >                                              Index Cond: (ii.itemid = gii.itemid)
> >                                              Filter: (alternatives: SubPlan 1 or hashed SubPlan 2)
> >                                              SubPlan 1
> >                                                ->  Nested Loop  (cost=0.00..7.83 rows=1 width=0) (actual time=0.040..0.040 rows=0 loops=549074)
> >                                                      Join Filter: (rr.groupid = ugg.usrgrpid)
> >                                                      ->  Nested Loop  (cost=0.00..6.67 rows=1 width=8) (actual time=0.037..0.037 rows=0 loops=549074)
> >                                                            Join Filter: (hgg.groupid = rr.id)
> >                                                            ->  Index Scan using hosts_groups_1 on hosts_groups hgg  (cost=0.00..4.27 rows=1 width=8) (actual time=0.003..0.005 rows=1 loops=549074)
> >                                                                  Index Cond: ($0 = hostid)
> >                                                            ->  Seq Scan on rights rr  (cost=0.00..2.39 rows=1 width=16) (actual time=0.027..0.027 rows=0 loops=532214)
> >                                                                  Filter: (rr.permission < 2)
> >                                                      ->  Seq Scan on users_groups ugg  (cost=0.00..1.15 rows=1 width=8) (never executed)
> >                                                            Filter: (ugg.userid = 20)
> >                                              SubPlan 2
> >                                                ->  Nested Loop  (cost=2.34..8.13 rows=1 width=8) (never executed)
> >                                                      ->  Nested Loop  (cost=0.00..3.55 rows=1 width=8) (never executed)
> >                                                            Join Filter: (rr.groupid = ugg.usrgrpid)
> >                                                            ->  Seq Scan on rights rr  (cost=0.00..2.39 rows=1 width=16) (never executed)
> >                                                                  Filter: (permission < 2)
> >                                                            ->  Seq Scan on users_groups ugg  (cost=0.00..1.15 rows=1 width=8) (never executed)
> >                                                                  Filter: (ugg.userid = 20)
> >                                                      ->  Bitmap Heap Scan on hosts_groups hgg  (cost=2.34..4.45 rows=11 width=16) (never executed)
> >                                                            Recheck Cond: (hgg.groupid = rr.id)
> >                                                            ->  Bitmap Index Scan on hosts_groups_2  (cost=0.00..2.33 rows=11 width=0) (never executed)
> >                                                                  Index Cond: (hgg.groupid = rr.id)
>
> The alternative subplans are variant implementations of the inner EXISTS
> test.  Unfortunately, it's choosing to go with the "retail" lookup, not
> knowing that that's going to wind up being executed 549074 times.
> If it had gone to the "wholesale" hashtable implementation, this would
> probably have run a whole lot faster.  (We have it in mind to allow the
> executor to switch to the other implementation on-the-fly when it
> becomes clear that the planner misjudged the rowcount, but it's not done
> yet.)
>
> However, I'm curious about your statement that this used to perform
> better.  Used to perform better on what, and what was the plan back
> then?  I don't believe that pre-8.4 PG would have done better than
> 8.4 on either of these points.  It certainly wouldn't have won on the
> EXISTS subplan, because before 8.4 that would always have been done in
> the "retail" style.
>
>             regards, tom lane
>

Hi Tom,

Your update beat me to it; I was just about to send you the trimmed
example. The performance problem was not present previously because
this section of code is completely new in this release, a .0 release.
I will post a bug report to the developers with your analysis, and
hopefully they can include the fix in the .1 release. In the meantime
I will create the index myself so that I do not have to track source
code changes to their application. With the index, performance is
still 2X slower than with enable_nestloop set to off, but that is
great compared to 200X slower without it:

CREATE INDEX fix_bad_stat_estimate ON graphs ((graphid / 100000000000000::bigint));

Thank you again.

Regards,
Ken