Thread: Same conditions, different planning?
Hi I'm having serious problems with a mission critical app that runs on postgres (and has been running for the past 3 years). It's rather large, and lately things are not going well. The planner is getting 'nuts', choosing ways that never end and completly stops the database. The strange thing is that running the same query with the same database with the same conditions in other machine the planner works as expecter and all goes well. All is vaccumed analyzed. What can cause this? Tkx -- Pedro Miguel G. Alves pmalves@think.pt THINK - Tecnologias de Informação www.think.pt Tel: +351 21 413 46 00 Av. José Gomes Ferreira Fax: +351 21 413 46 09 nº 13 1495-139 ALGÉS
Pedro Alves wrote: > I'm having serious problems with a mission critical app that runs on > postgres (and has been running for the past 3 years). It's rather large, > and lately things are not going well. The planner is getting 'nuts', > choosing ways that never end and completly stops the database. > The strange thing is that running the same query with the same database > with the same conditions in other machine the planner works as expecter and > all goes well. Can we see explain analyze for the queries, relevant schema and other details? And what version of postgresql is this? On what platform? Shridhar
On Wed, Oct 22, 2003 at 04:13:45PM +0530, Shridhar Daithankar wrote: > Pedro Alves wrote: > > I'm having serious problems with a mission critical app that runs on > >postgres (and has been running for the past 3 years). It's rather large, > >and lately things are not going well. The planner is getting 'nuts', > >choosing ways that never end and completly stops the database. > > The strange thing is that running the same query with the same database > >with the same conditions in other machine the planner works as expecter and > >all goes well. > > Can we see explain analyze for the queries, relevant schema and other > details? > > And what version of postgresql is this? On what platform? Both machines Linux, 7.3.4 (different dists, but it doesn't matter, I think) Query: SELECT ra.ra_id, ra.ra_reqnum, ra.ra_datacolh,to_char(timestamp_num,'YYYY-MM-DD'),to_char(timestamp_num,'HH24:MI'), ra.ra_servico,ra.ra_urgente, ra.ra_produto, ra.ra_cama, ra.ra_parcial, ra.ra_vglobal, servico.s_id, servico.s_desc ,ut.ut_id,ut.ut_nome, ut.ut_sexo,ut.ut_data_nasc,ra.ra_modulo, ra.ra_relcolh, ra.ra_notamedica,ra.ra_utilizador FROM Servicoservico,RequisicaoAnalise ra, Utente ut WHERE ut.ut_id=ra.ra_utente AND ra.ra_servreq=servico.s_id and ra.ra_servico= 1 AND ra_datacolh <='2003-10-22' AND ra.isactive=1 order by ra_reqnum desc LIMIT 80 OFFSET 0;)) Machine 1 (production): Limit (cost=2193.79..2193.99 rows=80 width=156) -> Sort (cost=2193.79..2194.32 rows=212 width=156) Sort Key: ra.ra_reqnum -> Hash Join (cost=970.41..2185.62 rows=212 width=156) Hash Cond: ("outer".ra_servreq = "inner".s_id) -> Merge Join (cost=968.75..2180.25 rows=212 width=134) Merge Cond: ("outer".ut_id = "inner".ra_utente) -> Index Scan using utente_pkey on utente ut (cost=0.00..1109.18 rows=38937 width=43) -> Sort (cost=968.75..969.97 rows=486 width=91) Sort Key: ra.ra_utente -> Index Scan using ra_isactive on requisicaoanalise ra (cost=0.00..947.07 rows=486 width=91) Index Cond: (isactive = 1) Filter: ((ra_servico = 1) AND (ra_datacolh <= '2003-10-22'::date)) -> Hash (cost=1.53..1.53 rows=53 width=22) -> Seq Scan on servico (cost=0.00..1.53 rows=53 width=22) (15 rows) Machine 2 (my develop machine): Limit (cost=74.47..74.51 rows=19 width=156) -> Sort (cost=74.47..74.51 rows=19 width=156) Sort Key: ra.ra_reqnum -> Hash Join (cost=1.66..74.05 rows=19 width=156) Hash Cond: ("outer".ra_servreq = "inner".s_id) -> Nested Loop (cost=0.00..72.05 rows=19 width=134) -> Index Scan using ra_isactive on requisicaoanalise ra (cost=0.00..5.09 rows=19 width=91) Index Cond: (isactive = 1) Filter: ((ra_servico = 1) AND (ra_datacolh <= '2003-10-22'::date)) -> Index Scan using utente_pkey on utente ut (cost=0.00..3.43 rows=1 width=43) Index Cond: (ut.ut_id = "outer".ra_utente) -> Hash (cost=1.53..1.53 rows=53 width=22) -> Seq Scan on servico (cost=0.00..1.53 rows=53 width=22) (13 rows) Some other relevant info: select count(*) from requisicaoanalise; count -------- 176328 select count(*) from utente; count ------- 38868 Digging a bit more I find this: Machine 1: explain select count(*) from requisicaoanalise where isactive = 1 AND (ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date); QUERY PLAN ----------------------------------------------------------------------------------------------- Aggregate (cost=948.45..948.45 rows=1 width=0) -> Index Scan using ra_isactive on requisicaoanalise (cost=0.00..947.07 rows=550 width=0) Index Cond: (isactive = 1) Filter: ((ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date)) (4 rows) Machine 2: explain select count(*) from requisicaoanalise where isactive = 1 AND (ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date); QUERY PLAN -------------------------------------------------------------------------------------------- Aggregate (cost=5.12..5.12 rows=1 width=0) -> Index Scan using ra_isactive on requisicaoanalise (cost=0.00..5.09 rows=12 width=0) Index Cond: (isactive = 1) Filter: ((ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date)) How can I have such different costs?!?!?! -- Pedro Miguel G. Alves pmalves@think.pt THINK - Tecnologias de Informação www.think.pt Tel: +351 21 413 46 00 Av. José Gomes Ferreira Fax: +351 21 413 46 09 nº 13 1495-139 ALGÉS
On Wednesday 22 October 2003 16:25, Pedro Alves wrote: > Machine 1: > > explain select count(*) from requisicaoanalise where isactive = 1 AND > (ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date); > QUERY PLAN > --------------------------------------------------------------------------- >-------------------- Aggregate (cost=948.45..948.45 rows=1 width=0) > -> Index Scan using ra_isactive on requisicaoanalise > (cost=0.00..947.07 rows=550 width=0) > Index Cond: (isactive = 1) > Filter: ((ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date)) > (4 rows) Could you please run explain analyze rather than just explain for all the things you posted earlier? And are there any tuning parameters different on these two machines? RAM size of HDD setup etc? Are these two machine absolutely same from hardware and postgresql tuning point of view? Shridhar
As a final attempt I tryed a pg_dumpall; initdb; psql -f <file> . It seems to work as expected.... (!) On Wed, Oct 22, 2003 at 05:03:32PM +0530, Shridhar Daithankar wrote: > On Wednesday 22 October 2003 16:25, Pedro Alves wrote: > > > Machine 1: > > > > explain select count(*) from requisicaoanalise where isactive = 1 AND > > (ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date); > > QUERY PLAN > > --------------------------------------------------------------------------- > >-------------------- Aggregate (cost=948.45..948.45 rows=1 width=0) > > -> Index Scan using ra_isactive on requisicaoanalise > > (cost=0.00..947.07 rows=550 width=0) > > Index Cond: (isactive = 1) > > Filter: ((ra_servico = 4) AND (ra_datacolh <= '2003-10-22'::date)) > > (4 rows) > > Could you please run explain analyze rather than just explain for all the > things you posted earlier? > > And are there any tuning parameters different on these two machines? RAM size > of HDD setup etc? > > Are these two machine absolutely same from hardware and postgresql tuning > point of view? > > Shridhar > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html -- Pedro Miguel G. Alves pmalves@think.pt THINK - Tecnologias de Informação www.think.pt Tel: +351 21 413 46 00 Av. José Gomes Ferreira Fax: +351 21 413 46 09 nº 13 1495-139 ALGÉS
Pedro Alves <pmalves@think.pt> writes: > As a final attempt I tryed a pg_dumpall; initdb; psql -f <file> . It > seems to work as expected.... (!) It may have been a case of index bloat--did you ever REINDEX? That might have fixed it without a dump/reload. 7.4 has fixes for the index bloat problem. -Doug
Hum... no, I didn't :/ I thought vacuum analyze would be enough. Next time (neve, I hope) I will have that in mind On Wed, Oct 22, 2003 at 08:36:10AM -0400, Doug McNaught wrote: > Pedro Alves <pmalves@think.pt> writes: > > > As a final attempt I tryed a pg_dumpall; initdb; psql -f <file> . It > > seems to work as expected.... (!) > > It may have been a case of index bloat--did you ever REINDEX? That > might have fixed it without a dump/reload. > > 7.4 has fixes for the index bloat problem. > > -Doug > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend -- Pedro Miguel G. Alves pmalves@think.pt THINK - Tecnologias de Informação www.think.pt Tel: +351 21 413 46 00 Av. José Gomes Ferreira Fax: +351 21 413 46 09 nº 13 1495-139 ALGÉS