Thread: Same conditions, different planning?

Same conditions, different planning?

From
Pedro Alves
Date:

    Hi


   I'm having serious problems with a mission critical app that runs on
postgres (and has been running for the past 3 years). It's rather large,
and lately things are not going well. The planner is getting 'nuts',
choosing ways that never end and completly stops the database.


    The strange thing is that running the same query with the same database
with the same conditions in other machine the planner works as expecter and
all goes well.


    All is vaccumed analyzed.


    What can cause this?


    Tkx




--
Pedro Miguel G. Alves            pmalves@think.pt
THINK - Tecnologias de Informação    www.think.pt
Tel:   +351 21 413 46 00  Av. José Gomes Ferreira
Fax:   +351 21 413 46 09     nº 13 1495-139 ALGÉS

Re: Same conditions, different planning?

From
Shridhar Daithankar
Date:
Pedro Alves wrote:
>    I'm having serious problems with a mission critical app that runs on
> postgres (and has been running for the past 3 years). It's rather large,
> and lately things are not going well. The planner is getting 'nuts',
> choosing ways that never end and completly stops the database.
>     The strange thing is that running the same query with the same database
> with the same conditions in other machine the planner works as expecter and
> all goes well.

Can we see explain analyze for the queries, relevant schema and other details?

And what version of postgresql is this? On what platform?

  Shridhar



Re: Same conditions, different planning?

From
Pedro Alves
Date:
On Wed, Oct 22, 2003 at 04:13:45PM +0530, Shridhar Daithankar wrote:
> Pedro Alves wrote:
> >   I'm having serious problems with a mission critical app that runs on
> >postgres (and has been running for the past 3 years). It's rather large,
> >and lately things are not going well. The planner is getting 'nuts',
> >choosing ways that never end and completly stops the database.
> >    The strange thing is that running the same query with the same database
> >with the same conditions in other machine the planner works as expecter and
> >all goes well.
>
> Can we see explain analyze for the queries, relevant schema and other
> details?
>
> And what version of postgresql is this? On what platform?

    Both machines Linux, 7.3.4 (different dists, but it doesn't matter, I
think)

Query:
  SELECT ra.ra_id, ra.ra_reqnum, ra.ra_datacolh,to_char(timestamp_num,'YYYY-MM-DD'),to_char(timestamp_num,'HH24:MI'),
ra.ra_servico,ra.ra_urgente, ra.ra_produto, ra.ra_cama, ra.ra_parcial, ra.ra_vglobal, servico.s_id, servico.s_desc
,ut.ut_id,ut.ut_nome, ut.ut_sexo,ut.ut_data_nasc,ra.ra_modulo, ra.ra_relcolh, ra.ra_notamedica,ra.ra_utilizador FROM
Servicoservico,RequisicaoAnalise ra, Utente ut WHERE ut.ut_id=ra.ra_utente AND ra.ra_servreq=servico.s_id and
ra.ra_servico= 1  AND ra_datacolh <='2003-10-22'  AND  ra.isactive=1 order by ra_reqnum desc LIMIT 80 OFFSET 0;)) 


    Machine 1 (production):

 Limit  (cost=2193.79..2193.99 rows=80 width=156)
   ->  Sort  (cost=2193.79..2194.32 rows=212 width=156)
         Sort Key: ra.ra_reqnum
         ->  Hash Join  (cost=970.41..2185.62 rows=212 width=156)
               Hash Cond: ("outer".ra_servreq = "inner".s_id)
               ->  Merge Join  (cost=968.75..2180.25 rows=212 width=134)
                     Merge Cond: ("outer".ut_id = "inner".ra_utente)
                     ->  Index Scan using utente_pkey on utente ut  (cost=0.00..1109.18 rows=38937 width=43)
        
                     ->  Sort  (cost=968.75..969.97 rows=486 width=91)
                           Sort Key: ra.ra_utente
                           ->  Index Scan using ra_isactive on
requisicaoanalise ra  (cost=0.00..947.07 rows=486 width=91)
                                 Index Cond: (isactive = 1)
                                 Filter: ((ra_servico = 1) AND (ra_datacolh <= '2003-10-22'::date))
               ->  Hash  (cost=1.53..1.53 rows=53 width=22)
                     ->  Seq Scan on servico  (cost=0.00..1.53 rows=53
width=22)
(15 rows)


    Machine 2 (my develop machine):

 Limit  (cost=74.47..74.51 rows=19 width=156)
   ->  Sort  (cost=74.47..74.51 rows=19 width=156)
         Sort Key: ra.ra_reqnum
         ->  Hash Join  (cost=1.66..74.05 rows=19 width=156)
               Hash Cond: ("outer".ra_servreq = "inner".s_id)
               ->  Nested Loop  (cost=0.00..72.05 rows=19 width=134)
                     ->  Index Scan using ra_isactive on requisicaoanalise
ra  (cost=0.00..5.09 rows=19 width=91)
                           Index Cond: (isactive = 1)
                           Filter: ((ra_servico = 1) AND (ra_datacolh <=
'2003-10-22'::date))
                     ->  Index Scan using utente_pkey on utente ut  (cost=0.00..3.43 rows=1 width=43)
                           Index Cond: (ut.ut_id = "outer".ra_utente)
               ->  Hash  (cost=1.53..1.53 rows=53 width=22)
                     ->  Seq Scan on servico  (cost=0.00..1.53 rows=53
width=22)
(13 rows)



Some other relevant info:

select count(*) from requisicaoanalise;
 count
--------
 176328

select count(*) from utente;
 count
-------
 38868


Digging a bit more I find this:

Machine 1:

    explain select count(*) from requisicaoanalise where isactive = 1 AND
(ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date);
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Aggregate  (cost=948.45..948.45 rows=1 width=0)
   ->  Index Scan using ra_isactive on requisicaoanalise
(cost=0.00..947.07 rows=550 width=0)
         Index Cond: (isactive = 1)
         Filter: ((ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date))
(4 rows)


Machine 2:

explain select count(*) from requisicaoanalise where isactive = 1 AND
(ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date);
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Aggregate  (cost=5.12..5.12 rows=1 width=0)
   ->  Index Scan using ra_isactive on requisicaoanalise  (cost=0.00..5.09
rows=12 width=0)
         Index Cond: (isactive = 1)
         Filter: ((ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date))


How can I have such different costs?!?!?!




--
Pedro Miguel G. Alves            pmalves@think.pt
THINK - Tecnologias de Informação    www.think.pt
Tel:   +351 21 413 46 00  Av. José Gomes Ferreira
Fax:   +351 21 413 46 09     nº 13 1495-139 ALGÉS

Re: Same conditions, different planning?

From
Shridhar Daithankar
Date:
On Wednesday 22 October 2003 16:25, Pedro Alves wrote:

> Machine 1:
>
>     explain select count(*) from requisicaoanalise where isactive = 1 AND
> (ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date);
>                                           QUERY PLAN
> ---------------------------------------------------------------------------
>-------------------- Aggregate  (cost=948.45..948.45 rows=1 width=0)
>    ->  Index Scan using ra_isactive on requisicaoanalise
> (cost=0.00..947.07 rows=550 width=0)
>          Index Cond: (isactive = 1)
>          Filter: ((ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date))
> (4 rows)

Could you please run explain analyze rather than just explain for all the
things you posted earlier?

And are there any tuning parameters different on these two machines? RAM size
of HDD setup etc?

Are these two machine absolutely same from hardware and postgresql tuning
point of view?

 Shridhar


Re: Same conditions, different planning?

From
Pedro Alves
Date:

    As a final attempt I tryed a pg_dumpall; initdb; psql -f <file> . It
seems to work as expected.... (!)




On Wed, Oct 22, 2003 at 05:03:32PM +0530, Shridhar Daithankar wrote:
> On Wednesday 22 October 2003 16:25, Pedro Alves wrote:
>
> > Machine 1:
> >
> >     explain select count(*) from requisicaoanalise where isactive = 1 AND
> > (ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date);
> >                                           QUERY PLAN
> > ---------------------------------------------------------------------------
> >-------------------- Aggregate  (cost=948.45..948.45 rows=1 width=0)
> >    ->  Index Scan using ra_isactive on requisicaoanalise
> > (cost=0.00..947.07 rows=550 width=0)
> >          Index Cond: (isactive = 1)
> >          Filter: ((ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date))
> > (4 rows)
>
> Could you please run explain analyze rather than just explain for all the
> things you posted earlier?
>
> And are there any tuning parameters different on these two machines? RAM size
> of HDD setup etc?
>
> Are these two machine absolutely same from hardware and postgresql tuning
> point of view?
>
>  Shridhar
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html

--
Pedro Miguel G. Alves            pmalves@think.pt
THINK - Tecnologias de Informação    www.think.pt
Tel:   +351 21 413 46 00  Av. José Gomes Ferreira
Fax:   +351 21 413 46 09     nº 13 1495-139 ALGÉS

Re: Same conditions, different planning?

From
Doug McNaught
Date:
Pedro Alves <pmalves@think.pt> writes:

>     As a final attempt I tryed a pg_dumpall; initdb; psql -f <file> . It
> seems to work as expected.... (!)

It may have been a case of index bloat--did you ever REINDEX?  That
might have fixed it without a dump/reload.

7.4 has fixes for the index bloat problem.

-Doug

Re: Same conditions, different planning?

From
Pedro Alves
Date:
    Hum... no, I didn't :/ I thought vacuum analyze would be enough. Next
time (neve, I hope) I will have that in mind



On Wed, Oct 22, 2003 at 08:36:10AM -0400, Doug McNaught wrote:
> Pedro Alves <pmalves@think.pt> writes:
>
> >     As a final attempt I tryed a pg_dumpall; initdb; psql -f <file> . It
> > seems to work as expected.... (!)
>
> It may have been a case of index bloat--did you ever REINDEX?  That
> might have fixed it without a dump/reload.
>
> 7.4 has fixes for the index bloat problem.
>
> -Doug
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend

--
Pedro Miguel G. Alves            pmalves@think.pt
THINK - Tecnologias de Informação    www.think.pt
Tel:   +351 21 413 46 00  Av. José Gomes Ferreira
Fax:   +351 21 413 46 09     nº 13 1495-139 ALGÉS