Thread: Query too slow
Hi List,

I have posted this subject on the admin list, but I thought it might fit better on this list, as follows:

Hi List,

As I said before, I'm not a DBA "yet", but I'm learning... and I already have a PostgreSQL running, so I have to ask for some help... I got a SQL as follows:

SELECT /*+ */
       ftnfco00.estado_cliente ,
       ftcofi00.grupo_faturamento ,
       SUM( DECODE( ftcofi00.atual_fatura, '-', -(NVL(ftnfpr00.qtde_duzias,0)), '+', NVL(ftnfpr00.qtde_duzias,0), 0) ) ,
       SUM( DECODE( ftcofi00.atual_fatura, '-', -(NVL(ftnfpr00.vlr_liquido,0)), '+', NVL(ftnfpr00.vlr_liquido,0), 0) ) ,
       ftprod00.tipo_cadastro||ftprod00.codigo_produto||'||'||gsames00.ano_mes ,
       ftprod00.descricao_produto||'||'||gsames00.descricao ,
       DIVIDE( SUM( DECODE( ftcofi00.atual_fatura, '-', -(NVL(ftnfpr00.vlr_liquido,0)), '+', NVL(ftnfpr00.vlr_liquido,0), 0) * ftnfpr00.margem_comercial ) ,
               SUM( DECODE( ftcofi00.atual_fatura, '-', -(NVL(ftnfpr00.vlr_liquido,0)), '+', NVL(ftnfpr00.vlr_liquido,0), 0) ) ) ,
       SUM( DECODE( ftcofi00.nf_prodgratis, 'S', NVL(ftnfpr00.qtde_duzias,0), 0 ) ) ,
       SUM( DECODE( ftcofi00.nf_prodgratis, 'S', NVL(ftnfpr00.vlr_liquido,0), 0 ) )
  FROM ftprod00 , ftnfco00 , ftcgma00 , ftcgca00 , ftspro00 , ftclcr00 ,
       gsames00 , ftcofi00 , ftrepr00 , gsesta00 , ftsupv00 , ftgrep00 ,
       ftclgr00 , ftband00 , fttcli00 , ftredc00 , ftnfpr00
 WHERE ftnfco00.emp = 909
   AND ftnfpr00.fil IN ('101')
   AND ftnfco00.situacao_nf = 'N'
   AND TO_CHAR(ftnfco00.data_emissao,'YYYYMM') >= '200208'
   AND TO_CHAR(ftnfco00.data_emissao,'YYYYMM') <= '200304'
   AND ftcofi00.grupo_faturamento >= '01'
   AND (ftcofi00.atual_fatura IN ('+','-') OR ftcofi00.nf_prodgratis = 'S')
   AND ftcgma00.emp = ftprod00.emp
   AND ftcgma00.fil = ftprod00.fil
   AND ftcgma00.codigo = ftprod00.cla_marca
   AND ftcgca00.emp = ftprod00.emp
   AND ftcgca00.fil = ftprod00.fil
   AND ftcgca00.codigo = ftprod00.cla_categoria
   AND ftspro00.emp = ftprod00.emp
   AND ftspro00.fil = ftprod00.fil
   AND ftspro00.codigo = ftprod00.situacao
   AND ftclcr00.emp = ftnfco00.emp
   AND ftclcr00.fil = ftnfco00.empfil
   AND ftclcr00.tipo_cadastro = ftnfco00.tipo_cad_clicre
   AND ftclcr00.codigo = ftnfco00.cod_cliente
   AND gsames00.ano_mes = TO_CHAR(ftnfco00.data_emissao,'YYYYMM')
   AND ftcofi00.emp = ftnfco00.emp
   AND ftcofi00.fil = ftnfco00.empfil
   AND ftcofi00.codigo_fiscal = ftnfco00.cod_fiscal
   AND ftrepr00.emp = ftnfco00.emp
   AND ftrepr00.fil = ftnfco00.empfil
   AND ftrepr00.codigo_repr = ftnfco00.cod_repres
   AND gsesta00.estado_sigla = ftnfco00.estado_cliente
   AND ftsupv00.emp = ftrepr00.emp
   AND ftsupv00.fil = ftrepr00.fil
   AND ftsupv00.codigo_supervisor = ftrepr00.codigo_supervisor
   AND ftgrep00.emp = ftrepr00.emp
   AND ftgrep00.fil = ftrepr00.fil
   AND ftgrep00.codigo_grupo_rep = ftrepr00.codigo_grupo_rep
   AND ftclgr00.emp = ftclcr00.emp
   AND ftclgr00.fil = ftclcr00.fil
   AND ftclgr00.codigo = ftclcr00.codigo_grupo_cliente
   AND ftband00.emp = ftclcr00.emp
   AND ftband00.fil = ftclcr00.fil
   AND ftband00.codigo = ftclcr00.bandeira_cliente
   AND fttcli00.emp = ftclcr00.emp
   AND fttcli00.fil = ftclcr00.fil
   AND fttcli00.cod_tipocliente = ftclcr00.codigo_tipo_cliente
   AND ftredc00.emp = ftclcr00.emp
   AND ftredc00.fil = ftclcr00.fil
   AND ftredc00.tipo_contribuinte = ftclcr00.tipo_contribuinte
   AND ftredc00.codigo_rede = ftclcr00.codigo_rede
   AND gsesta00.estado_sigla = ftclcr00.emp_estado
   AND ftnfco00.emp = ftnfpr00.emp
   AND ftnfco00.fil = ftnfpr00.fil
   AND ftnfco00.nota_fiscal = ftnfpr00.nota_fiscal
   AND ftnfco00.serie = ftnfpr00.serie
   AND ftnfco00.data_emissao = ftnfpr00.data_emissao
   AND ftprod00.emp = ftnfpr00.emp
   AND ftprod00.fil = ftnfpr00.empfil
   AND ftprod00.tipo_cadastro = ftnfpr00.tipo_cad_promat
   AND ftprod00.codigo_produto = ftnfpr00.cod_produto
 GROUP BY ftnfco00.estado_cliente ,
          ftcofi00.grupo_faturamento ,
          ftprod00.tipo_cadastro||ftprod00.codigo_produto||'||'||gsames00.ano_mes ,
          ftprod00.descricao_produto||'||'||gsames00.descricao

I have created the DECODE, NVL and DIVIDE functions. The problem is that the WHERE condition makes this query too slow (about 4 min), while the same query on my Oracle database takes less than 40 seconds. I have tried to isolate the problem by taking off some fields, leaving just the first two fields in the query (ftnfco00.estado_cliente, ftcofi00.grupo_faturamento), and it still takes almost 4 min to return. Does anyone have a hint to give me to make it faster? Attached goes an EXPLAIN ANALYZE of this query.

Best regards,

Rhaoni Chiu Pereira
Sistêmica Computadores

Visit us on the Web: http://sistemica.info
Phone/Fax: +55 51 3328 1122
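For reference, a minimal sketch of what the NVL and DIVIDE compatibility helpers could look like as plain SQL wrappers; the numeric signatures and the divide-by-zero handling here are assumptions, since the actual definitions were not posted:

    CREATE OR REPLACE FUNCTION nvl(numeric, numeric) RETURNS numeric AS '
        -- Oracle-style NVL: return the second argument when the first is NULL.
        SELECT COALESCE($1, $2);
    ' LANGUAGE sql;

    CREATE OR REPLACE FUNCTION divide(numeric, numeric) RETURNS numeric AS '
        -- Return NULL instead of raising a division-by-zero error.
        SELECT CASE WHEN $2 = 0 THEN NULL ELSE $1 / $2 END;
    ' LANGUAGE sql;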
On Mon, 25 Aug 2003, Rhaoni Chiu Pereira wrote:

> Hi List,
>
> As I said before, I'm not a DBA "yet", but I'm learning... and I
> already have a PostgreSQL running, so I have to ask for some help...
> I got a SQL as follows:

...

Looking at the explain:

It's choosing lots of nested loops because it's expecting a small number of rows to be returned at each step, but in reality there are a lot of rows, so that may not really be a good choice.

For example, the scan of ftnfco00 is expected to return 295 rows but actually returns 9339, and it looks like it's not estimating the number of matches between the tables very well either, since the real count gets up to 240000 in a step where the estimated row count goes to 1.

What does EXPLAIN ANALYZE give after SET enable_nestloop = off;?
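A rough sketch of how to run that test in a psql session (the SET affects only the current session, so nothing in postgresql.conf needs to change):

    -- Discourage nested-loop joins for the current session only.
    SET enable_nestloop = off;

    -- Then prefix the posted query with EXPLAIN ANALYZE, e.g.
    --   EXPLAIN ANALYZE SELECT ftnfco00.estado_cliente, ... ;
    -- and compare the estimated row counts against the actual ones.

    -- Restore the default when done.
    SET enable_nestloop = on;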
On 25 Aug 2003 at 8:44, Stephan Szabo wrote:

> On Mon, 25 Aug 2003, Rhaoni Chiu Pereira wrote:
>
> > Hi List,
> >
> > As I said before, I'm not a DBA "yet", but I'm learning... and I
> > already have a PostgreSQL running, so I have to ask for some help...
> > I got a SQL as follows:
>
> ...
>
> Looking at the explain:
>
> It's choosing lots of nested loops because it's expecting a small number
> of rows to be returned at each step, but in reality there are a lot of
> rows, so that may not really be a good choice.
>
> For example, the scan of ftnfco00 is expected to return 295 rows but
> actually returns 9339, and it looks like it's not estimating the number
> of matches between the tables very well either, since the real count
> gets up to 240000 in a step where the estimated row count goes to 1.
>
> What does EXPLAIN ANALYZE give after SET enable_nestloop = off;?

In addition to that, if it is getting the stats wrong, does running VACUUM ANALYZE help? If the stats are updated, it should pick up proper plans, right?

Bye
 Shridhar

--
Flon's Law: There is not now, and never will be, a language in which it is the least bit difficult to write bad programs.
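A minimal sequence to try, with ftnfco00 and ftnfpr00 picked only because they are the tables discussed in the plan above, and with 100 as a purely illustrative statistics target:

    -- Refresh planner statistics for the whole database ...
    VACUUM ANALYZE;

    -- ... or only for the tables with the worst misestimates.
    VACUUM ANALYZE ftnfco00;
    VACUUM ANALYZE ftnfpr00;

    -- Optionally gather finer-grained statistics on a heavily filtered
    -- column (100 is an illustrative value, not a recommendation),
    -- then re-analyze so the new target takes effect.
    ALTER TABLE ftnfco00 ALTER COLUMN data_emissao SET STATISTICS 100;
    ANALYZE ftnfco00;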
Stephan Szabo wrote:

> Looking at the explain:

Veering aside a bit, since we usually pinpoint performance problems by looking at the differences between the planner's estimates and the actual execution stats in EXPLAIN ANALYZE, what's involved in parsing the EXPLAIN ANALYZE results and highlighting the places where they are way different? Bold, underline, or put some asterisks in front of those steps. That would make looking at big EXPLAIN ANALYZE trees much easier.

--
Linux homer 2.4.18-14 #1 Wed Sep 4 13:35:50 EDT 2002 i686 i686 i386 GNU/Linux
  2:30pm  up 243 days, 5:48, 8 users, load average: 5.52, 5.29, 5.10
On Tue, 26 Aug 2003, Ang Chin Han wrote:

> Stephan Szabo wrote:
>
> > Looking at the explain:
>
> Veering aside a bit, since we usually pinpoint performance problems by
> looking at the differences between the planner's estimates and the
> actual execution stats in EXPLAIN ANALYZE, what's involved in parsing
> the EXPLAIN ANALYZE results and highlighting the places where they are
> way different? Bold, underline, or put some asterisks in front of those
> steps.

The hardest part is determining where it matters, I think. You can use the row counts as the base for that, but going from 1 row to 50 is not necessarily going to be an issue, though it might be if a nested loop is chosen.
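To make that concrete with made-up numbers (illustrative only, not taken from the attached plan): if the planner expects 1 row from the outer side of a nested loop and each inner index probe costs about 3 page fetches, it prices the join at roughly 1 x 3 = 3 fetches; if the outer side actually produces 240000 rows, the same plan does about 240000 x 3 = 720000 fetches. The identical 1-vs-240000 misestimate under a hash or merge join would mostly just mean one larger scan, which is why the row-count error matters so much more when a nested loop gets chosen.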
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Tue, 26 Aug 2003, Ang Chin Han wrote:
>> Veering aside a bit, since we usually pinpoint performance problems by
>> looking at the differences between the planner's estimates and the
>> actual execution stats in EXPLAIN ANALYZE, what's involved in parsing
>> the EXPLAIN ANALYZE results and highlighting the places where they are
>> way different? Bold, underline, or put some asterisks in front of those
>> steps.

> The hardest part is determining where it matters, I think. You can use
> the row counts as the base for that, but going from 1 row to 50 is not
> necessarily going to be an issue, though it might be if a nested loop
> is chosen.

We've been chatting about this idea among the Red Hat group. The RHDB Visual Explain tool (get it at http://sources.redhat.com/rhdb/) already computes the percent of total runtime represented by each plan node. It seems like we could highlight nodes based on a large difference between estimated and actual percentage, or just highlight the nodes that are more than X percent of the runtime.

			regards, tom lane
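As a made-up illustration of that kind of flagging: if the top node of a plan reports an actual total time of 240 ms and one sort node underneath it reports 180 ms, that node accounts for 180 / 240 = 75% of the runtime; if its share of the total estimated cost was only, say, 10%, the gap between the two percentages marks it as exactly the sort of node worth highlighting.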
Tom Lane wrote:
> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > On Tue, 26 Aug 2003, Ang Chin Han wrote:
> >> Veering aside a bit, since we usually pinpoint performance problems by
> >> looking at the differences between the planner's estimates and the
> >> actual execution stats in EXPLAIN ANALYZE, what's involved in parsing
> >> the EXPLAIN ANALYZE results and highlighting the places where they are
> >> way different? Bold, underline, or put some asterisks in front of
> >> those steps.
>
> > The hardest part is determining where it matters, I think. You can use
> > the row counts as the base for that, but going from 1 row to 50 is not
> > necessarily going to be an issue, though it might be if a nested loop
> > is chosen.
>
> We've been chatting about this idea among the Red Hat group. The RHDB
> Visual Explain tool (get it at http://sources.redhat.com/rhdb/) already
> computes the percent of total runtime represented by each plan node.
> It seems like we could highlight nodes based on a large difference
> between estimated and actual percentage, or just highlight the nodes
> that are more than X percent of the runtime.

Is there a TODO here?  Perhaps:

	o Have EXPLAIN ANALYZE highlight poor optimizer estimates

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian wrote:
> Tom Lane wrote:
> > Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > > On Tue, 26 Aug 2003, Ang Chin Han wrote:
> > >> Veering aside a bit, since we usually pinpoint performance problems
> > >> by looking at the differences between the planner's estimates and
> > >> the actual execution stats in EXPLAIN ANALYZE, what's involved in
> > >> parsing the EXPLAIN ANALYZE results and highlighting the places
> > >> where they are way different? Bold, underline, or put some asterisks
> > >> in front of those steps.
> >
> > > The hardest part is determining where it matters, I think. You can
> > > use the row counts as the base for that, but going from 1 row to 50
> > > is not necessarily going to be an issue, though it might be if a
> > > nested loop is chosen.
> >
> > We've been chatting about this idea among the Red Hat group. The RHDB
> > Visual Explain tool (get it at http://sources.redhat.com/rhdb/) already
> > computes the percent of total runtime represented by each plan node.
> > It seems like we could highlight nodes based on a large difference
> > between estimated and actual percentage, or just highlight the nodes
> > that are more than X percent of the runtime.
>
> Is there a TODO here?  Perhaps:
>
> 	o Have EXPLAIN ANALYZE highlight poor optimizer estimates

No one commented, so I had to guess --- I added it.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073