Thread: query planner uses sequencial scan instead of index scan

From:
Daniel Ferreira
Date:

Hi all, iam having trouble with a query, in this query we have parameters, to indicate the starting month and the
endingmonth (commented line in the query). 

You can see the explain using the parameters of month from 1 to 6

EXPLAIN ANALYZE
select pq.nome, mv.data, mv.valor,
          gprd.tipo,
          tprd.tipo,
          prd.nome,
          qtd.ano, qtd.mes, qtd.sum,
          rct.sum,
          rcb.ano, rcb.sigla, rcb.n_recibo, rcb.data, rcb.ncontrib, rcb.nome_cl, rcb.morada_cl, rcb.codpostal_cl
 from parques pq,
           movimentos mv left join a_mov_rcb amr on mv.idmovimento=amr.idmov left join recibos rcb on
amr.idrecibo=rcb.idrecibo,
           (select idmov,ano,mes,idasso,sum(valor) from receitas group by 1,2,3,4) rct,
           (select idmov,ano,mes,idasso,sum(quantidade) from quantidades group by 1,2,3,4) qtd,
           produtos prd,
           tipoprodutos tprd,
           grp_prod gprd,
           a_prk_prod app
  where pq.idparque=mv.idparque
      and pq.idparque=10
      and rct.ano=2009
      and rct.mes between 1 and 6  /* HERE IS THE STARTING AND ENDING MONTH */
      and mv.idtipo_mv=21
      and mv.vivo
      and mv.idmovimento=rct.idmov
      and rct.idmov=qtd.idmov
      and rct.idasso=qtd.idasso
      and rct.ano=qtd.ano
      and rct.mes=qtd.mes
      and rct.idasso=app.idasso and app.idproduto=prd.idproduto
      and prd.idtipoproduto=tprd.idtipoproduto
      and prd.idgrp_prod=gprd.idgrp_prod
  order by mv.data, prd.idproduto, gprd.idgrp_prod, rcb.sigla, rcb.n_recibo, qtd.ano, qtd.mes




"Sort  (cost=23852.81..23852.82 rows=1 width=526) (actual time=339.156..339.197 rows=146 loops=1)"
"  Sort Key: mv.data, prd.idproduto, gprd.idgrp_prod, rcb.sigla, rcb.n_recibo, qtd.ano, qtd.mes"
"  ->  Nested Loop  (cost=23682.31..23852.80 rows=1 width=526) (actual time=319.009..338.801 rows=146 loops=1)"
"        ->  Nested Loop  (cost=23682.31..23851.49 rows=1 width=462) (actual time=318.986..337.758 rows=146 loops=1)"
"              ->  Nested Loop  (cost=23682.31..23851.18 rows=1 width=334) (actual time=318.952..337.159 rows=146
loops=1)"
"                    ->  Nested Loop  (cost=23682.31..23850.87 rows=1 width=202) (actual time=318.917..336.602 rows=146
loops=1)"
"                          ->  Nested Loop  (cost=23682.31..23849.43 rows=1 width=126) (actual time=318.880..335.960
rows=146loops=1)" 
"                                ->  Hash Join  (cost=23682.31..23841.15 rows=1 width=130) (actual
time=318.809..335.161rows=146 loops=1)" 
"                                      Hash Cond: ((rct.idmov = mv.idmovimento) AND (rct.idasso = qtd.idasso) AND
(rct.mes= qtd.mes))" 
"                                      ->  HashAggregate  (cost=5143.05..5201.88 rows=4706 width=24) (actual
time=69.150..79.543rows=14972 loops=1)" 
"                                            ->  Seq Scan on receitas  (cost=0.00..5033.23 rows=8786 width=24) (actual
time=0.236..55.824rows=15668 loops=1)" 
"                                                  Filter: ((ano = 2009) AND (mes >= 1) AND (mes <= 6))"
"                                      ->  Hash  (cost=18539.12..18539.12 rows=8 width=126) (actual
time=249.418..249.418rows=146 loops=1)" 
"                                            ->  Hash Join  (cost=18332.76..18539.12 rows=8 width=126) (actual
time=232.701..249.272rows=146 loops=1)" 
"                                                  Hash Cond: (qtd.idmov = mv.idmovimento)"
"                                                  ->  HashAggregate  (cost=3716.55..3810.31 rows=7501 width=24)
(actualtime=61.735..72.593 rows=15497 loops=1)" 
"                                                        ->  Seq Scan on quantidades  (cost=0.00..3526.18 rows=15230
width=24)(actual time=0.223..48.616 rows=15750 loops=1)" 
"                                                              Filter: (ano = 2009)"
"                                                  ->  Hash  (cost=14588.99..14588.99 rows=2178 width=102) (actual
time=170.719..170.719rows=2559 loops=1)" 
"                                                        ->  Hash Left Join  (cost=7052.05..14588.99 rows=2178
width=102)(actual time=166.942..169.261 rows=2559 loops=1)" 
"                                                              Hash Cond: (amr.idrecibo = rcb.idrecibo)"
"                                                              ->  Hash Left Join  (cost=4706.50..11472.92 rows=2178
width=24)(actual time=77.667..93.502 rows=2559 loops=1)" 
"                                                                    Hash Cond: (mv.idmovimento = amr.idmov)"
"                                                                    ->  Bitmap Heap Scan on movimentos mv
(cost=3058.71..9558.85rows=2178 width=20) (actual time=28.338..35.229 rows=2559 loops=1)" 
"                                                                          Recheck Cond: ((idtipo_mv = 21) AND (10 =
idparque))"
"                                                                          Filter: vivo"
"                                                                          ->  BitmapAnd  (cost=3058.71..3058.71
rows=2205width=0) (actual time=28.196..28.196 rows=0 loops=1)" 
"                                                                                ->  Bitmap Index Scan on
idx_03_idtipo_mv (cost=0.00..583.08 rows=33416 width=0) (actual time=7.307..7.307 rows=46019 loops=1)" 
"                                                                                      Index Cond: (idtipo_mv = 21)"
"                                                                                ->  Bitmap Index Scan on
idx_02_idparque (cost=0.00..2474.29 rows=141577 width=0) (actual time=19.948..19.948 rows=136676 loops=1)" 
"                                                                                      Index Cond: (10 = idparque)"
"                                                                    ->  Hash  (cost=812.13..812.13 rows=49413 width=8)
(actualtime=49.178..49.178 rows=49385 loops=1)" 
"                                                                          ->  Seq Scan on a_mov_rcb amr
(cost=0.00..812.13rows=49413 width=8) (actual time=0.069..24.160 rows=49385 loops=1)" 
"                                                              ->  Hash  (cost=1030.13..1030.13 rows=49313 width=86)
(actualtime=69.384..69.384 rows=49348 loops=1)" 
"                                                                    ->  Seq Scan on recibos rcb  (cost=0.00..1030.13
rows=49313width=86) (actual time=0.018..31.965 rows=49348 loops=1)" 
"                                ->  Index Scan using asso_prk_prod_pkey on a_prk_prod app  (cost=0.00..8.27 rows=1
width=8)(actual time=0.003..0.004 rows=1 loops=146)" 
"                                      Index Cond: (rct.idasso = app.idasso)"
"                          ->  Index Scan using produtos_pkey on produtos prd  (cost=0.00..1.43 rows=1 width=80)
(actualtime=0.002..0.003 rows=1 loops=146)" 
"                                Index Cond: (app.idproduto = prd.idproduto)"
"                    ->  Index Scan using grp_prod_pkey on grp_prod gprd  (cost=0.00..0.30 rows=1 width=136) (actual
time=0.002..0.002rows=1 loops=146)" 
"                          Index Cond: (prd.idgrp_prod = gprd.idgrp_prod)"
"              ->  Index Scan using tipoprodutos_pkey on tipoprodutos tprd  (cost=0.00..0.30 rows=1 width=136) (actual
time=0.002..0.002rows=1 loops=146)" 
"                    Index Cond: (prd.idtipoproduto = tprd.idtipoproduto)"
"        ->  Seq Scan on parques pq  (cost=0.00..1.30 rows=1 width=72) (actual time=0.002..0.005 rows=1 loops=146)"
"              Filter: (idparque = 10)"
"Total runtime: 339.973 ms"

now here is the explain using the parameters from 1 to 4.

"Sort  (cost=23944.24..23944.24 rows=1 width=526) (actual time=1887457.197..1887457.241 rows=124 loops=1)"
"  Sort Key: mv.data, prd.idproduto, gprd.idgrp_prod, rcb.sigla, rcb.n_recibo, qtd.ano, qtd.mes"
"  ->  Nested Loop  (cost=16068.57..23944.23 rows=1 width=526) (actual time=34392.436..1887456.339 rows=124 loops=1)"
"        Join Filter: (qtd.idmov = mv.idmovimento)"
"        ->  Nested Loop  (cost=9016.52..9328.02 rows=1 width=444) (actual time=156.601..834.424 rows=12586 loops=1)"
"              ->  Nested Loop  (cost=9016.52..9327.70 rows=1 width=316) (actual time=156.572..678.675 rows=12586
loops=1)"
"                    ->  Nested Loop  (cost=9016.52..9327.39 rows=1 width=184) (actual time=156.544..579.804 rows=12586
loops=1)"
"                          ->  Hash Join  (cost=9016.52..9325.95 rows=1 width=108) (actual time=156.501..304.851
rows=12586loops=1)" 
"                                Hash Cond: ((qtd.idasso = app.idasso) AND (qtd.idmov = rct.idmov) AND (qtd.mes =
rct.mes))"
"                                ->  HashAggregate  (cost=3716.55..3810.31 rows=7501 width=24) (actual
time=59.596..106.009rows=15507 loops=1)" 
"                                      ->  Seq Scan on quantidades  (cost=0.00..3526.18 rows=15230 width=24) (actual
time=0.203..46.900rows=15760 loops=1)" 
"                                            Filter: (ano = 2009)"
"                                ->  Hash  (cost=5250.56..5250.56 rows=2824 width=104) (actual time=96.788..96.788
rows=12597loops=1)" 
"                                      ->  Hash Join  (cost=5148.19..5250.56 rows=2824 width=104) (actual
time=64.588..85.241rows=12597 loops=1)" 
"                                            Hash Cond: (rct.idasso = app.idasso)"
"                                            ->  HashAggregate  (cost=5099.11..5134.41 rows=2824 width=24) (actual
time=62.779..71.578rows=12597 loops=1)" 
"                                                  ->  Seq Scan on receitas  (cost=0.00..5033.23 rows=5271 width=24)
(actualtime=11.277..51.444 rows=13173 loops=1)" 
"                                                        Filter: ((ano = 2009) AND (mes >= 1) AND (mes <= 4))"
"                                            ->  Hash  (cost=34.16..34.16 rows=1193 width=80) (actual time=1.790..1.790
rows=1193loops=1)" 
"                                                  ->  Nested Loop  (cost=0.00..34.16 rows=1193 width=80) (actual
time=0.025..1.186rows=1193 loops=1)" 
"                                                        ->  Seq Scan on parques pq  (cost=0.00..1.30 rows=1 width=72)
(actualtime=0.012..0.016 rows=1 loops=1)" 
"                                                              Filter: (idparque = 10)"
"                                                        ->  Seq Scan on a_prk_prod app  (cost=0.00..20.93 rows=1193
width=8)(actual time=0.006..0.402 rows=1193 loops=1)" 
"                          ->  Index Scan using produtos_pkey on produtos prd  (cost=0.00..1.43 rows=1 width=80)
(actualtime=0.015..0.016 rows=1 loops=12586)" 
"                                Index Cond: (app.idproduto = prd.idproduto)"
"                    ->  Index Scan using grp_prod_pkey on grp_prod gprd  (cost=0.00..0.30 rows=1 width=136) (actual
time=0.003..0.005rows=1 loops=12586)" 
"                          Index Cond: (prd.idgrp_prod = gprd.idgrp_prod)"
"              ->  Index Scan using tipoprodutos_pkey on tipoprodutos tprd  (cost=0.00..0.30 rows=1 width=136) (actual
time=0.003..0.009rows=1 loops=12586)" 
"                    Index Cond: (prd.idtipoproduto = tprd.idtipoproduto)"
"        ->  Hash Left Join  (cost=7052.05..14588.99 rows=2178 width=102) (actual time=146.667..148.973 rows=2559
loops=12586)"
"              Hash Cond: (amr.idrecibo = rcb.idrecibo)"
"              ->  Hash Left Join  (cost=4706.50..11472.92 rows=2178 width=24) (actual time=68.974..79.270 rows=2559
loops=12586)"
"                    Hash Cond: (mv.idmovimento = amr.idmov)"
"                    ->  Bitmap Heap Scan on movimentos mv  (cost=3058.71..9558.85 rows=2178 width=20) (actual
time=23.592..25.603rows=2559 loops=12586)" 
"                          Recheck Cond: ((idtipo_mv = 21) AND (10 = idparque))"
"                          Filter: vivo"
"                          ->  BitmapAnd  (cost=3058.71..3058.71 rows=2205 width=0) (actual time=23.474..23.474 rows=0
loops=12586)"
"                                ->  Bitmap Index Scan on idx_03_idtipo_mv  (cost=0.00..583.08 rows=33416 width=0)
(actualtime=6.003..6.003 rows=46024 loops=12586)" 
"                                      Index Cond: (idtipo_mv = 21)"
"                                ->  Bitmap Index Scan on idx_02_idparque  (cost=0.00..2474.29 rows=141577 width=0)
(actualtime=16.570..16.570 rows=136676 loops=12586)" 
"                                      Index Cond: (10 = idparque)"
"                    ->  Hash  (cost=812.13..812.13 rows=49413 width=8) (actual time=45.280..45.280 rows=49387
loops=12586)"
"                          ->  Seq Scan on a_mov_rcb amr  (cost=0.00..812.13 rows=49413 width=8) (actual
time=0.010..21.042rows=49387 loops=12586)" 
"              ->  Hash  (cost=1030.13..1030.13 rows=49313 width=86) (actual time=63.760..63.760 rows=49350
loops=12586)"
"                    ->  Seq Scan on recibos rcb  (cost=0.00..1030.13 rows=49313 width=86) (actual time=0.006..26.959
rows=49350loops=12586)" 
"Total runtime: 1887457.849 ms"

has we can see the query planner, decided to do sequencial scan in "a_mov_rcb" table and "recibos", when i set the flag
"enable_seqscan"to false all goes well. 

Anyways i could probably set the "enable_seqscan" always of but i dont know if thats a good idea, because if it was
thatwould be set as off by default. 

Is there anything i could do to go around this?
Or can anyone give me a hint why query planner goes sequencial scan when i change the parameters.


Thanks in advanced



From:
Tom Lane
Date:

Daniel Ferreira <> writes:
> has we can see the query planner, decided to do sequencial scan in "a_mov_rcb" table and "recibos", when i set the
flag"enable_seqscan" to false all goes well. 

It's not really the seqscan that's the problem.  The problem is this
rowcount misestimate:

> "                          ->  Hash Join  (cost=9016.52..9325.95 rows=1 width=108) (actual time=156.501..304.851
rows=12586loops=1)" 
> "                                Hash Cond: ((qtd.idasso = app.idasso) AND (qtd.idmov = rct.idmov) AND (qtd.mes =
rct.mes))"

which is causing the planner to suppose that the remaining joins should
be done as nestloops.  That would be the right thing if there really
were only one row... with twelve thousand of them, it's taking
twelve thousand times longer than the planner expected.

The right fix would be to get the estimate to be better. (Even if it
were 5 or 10 rows the planner would probably avoid the nestloops.)
But I'm not sure how much you can improve it by raising the stats
targets.  This is trying to estimate the size of the join between
two GROUP BY subselects, and the planner is not tremendously good
at that.

A brute force solution might go like this:

1. Select the two GROUP BY sub-results into temp tables.

2. ANALYZE the temp tables.

3. Do the original query using the temp tables.

But it's a pain ...

            regards, tom lane