Thread: Query too slow
Hi List,

I have posted this subject on the admin list, but I thought it might fit better on this list, as follows:

Hi List,

As I said before, I'm not a DBA "yet", but I'm learning... and I already have a PostgreSQL running, so I have to ask for some help... I got a SQL as follows:

SELECT /*+ */
       ftnfco00.estado_cliente ,
       ftcofi00.grupo_faturamento ,
       SUM( DECODE( ftcofi00.atual_fatura, '-', -(NVL(ftnfpr00.qtde_duzias,0)), '+', NVL(ftnfpr00.qtde_duzias,0), 0) ) ,
       SUM( DECODE( ftcofi00.atual_fatura, '-', -(NVL(ftnfpr00.vlr_liquido,0)), '+', NVL(ftnfpr00.vlr_liquido,0), 0) ) ,
       ftprod00.tipo_cadastro||ftprod00.codigo_produto||'||'||gsames00.ano_mes ,
       ftprod00.descricao_produto||'||'||gsames00.descricao ,
       DIVIDE( SUM( DECODE( ftcofi00.atual_fatura, '-', -(NVL(ftnfpr00.vlr_liquido,0)), '+', NVL(ftnfpr00.vlr_liquido,0), 0) * ftnfpr00.margem_comercial ) ,
               SUM( DECODE( ftcofi00.atual_fatura, '-', -(NVL(ftnfpr00.vlr_liquido,0)), '+', NVL(ftnfpr00.vlr_liquido,0), 0) ) ) ,
       SUM( DECODE( ftcofi00.nf_prodgratis, 'S', NVL(ftnfpr00.qtde_duzias,0), 0 ) ) ,
       SUM( DECODE( ftcofi00.nf_prodgratis, 'S', NVL(ftnfpr00.vlr_liquido,0), 0 ) )
  FROM ftprod00 , ftnfco00 , ftcgma00 , ftcgca00 , ftspro00 , ftclcr00 ,
       gsames00 , ftcofi00 , ftrepr00 , gsesta00 , ftsupv00 , ftgrep00 ,
       ftclgr00 , ftband00 , fttcli00 , ftredc00 , ftnfpr00
 WHERE ftnfco00.emp = 909
   AND ftnfpr00.fil IN ('101')
   AND ftnfco00.situacao_nf = 'N'
   AND TO_CHAR(ftnfco00.data_emissao,'YYYYMM') >= '200208'
   AND TO_CHAR(ftnfco00.data_emissao,'YYYYMM') <= '200304'
   AND ftcofi00.grupo_faturamento >= '01'
   AND (ftcofi00.atual_fatura IN ('+','-') OR ftcofi00.nf_prodgratis = 'S')
   AND ftcgma00.emp = ftprod00.emp
   AND ftcgma00.fil = ftprod00.fil
   AND ftcgma00.codigo = ftprod00.cla_marca
   AND ftcgca00.emp = ftprod00.emp
   AND ftcgca00.fil = ftprod00.fil
   AND ftcgca00.codigo = ftprod00.cla_categoria
   AND ftspro00.emp = ftprod00.emp
   AND ftspro00.fil = ftprod00.fil
   AND ftspro00.codigo = ftprod00.situacao
   AND ftclcr00.emp = ftnfco00.emp
   AND ftclcr00.fil = ftnfco00.empfil
   AND ftclcr00.tipo_cadastro = ftnfco00.tipo_cad_clicre
   AND ftclcr00.codigo = ftnfco00.cod_cliente
   AND gsames00.ano_mes = TO_CHAR(ftnfco00.data_emissao,'YYYYMM')
   AND ftcofi00.emp = ftnfco00.emp
   AND ftcofi00.fil = ftnfco00.empfil
   AND ftcofi00.codigo_fiscal = ftnfco00.cod_fiscal
   AND ftrepr00.emp = ftnfco00.emp
   AND ftrepr00.fil = ftnfco00.empfil
   AND ftrepr00.codigo_repr = ftnfco00.cod_repres
   AND gsesta00.estado_sigla = ftnfco00.estado_cliente
   AND ftsupv00.emp = ftrepr00.emp
   AND ftsupv00.fil = ftrepr00.fil
   AND ftsupv00.codigo_supervisor = ftrepr00.codigo_supervisor
   AND ftgrep00.emp = ftrepr00.emp
   AND ftgrep00.fil = ftrepr00.fil
   AND ftgrep00.codigo_grupo_rep = ftrepr00.codigo_grupo_rep
   AND ftclgr00.emp = ftclcr00.emp
   AND ftclgr00.fil = ftclcr00.fil
   AND ftclgr00.codigo = ftclcr00.codigo_grupo_cliente
   AND ftband00.emp = ftclcr00.emp
   AND ftband00.fil = ftclcr00.fil
   AND ftband00.codigo = ftclcr00.bandeira_cliente
   AND fttcli00.emp = ftclcr00.emp
   AND fttcli00.fil = ftclcr00.fil
   AND fttcli00.cod_tipocliente = ftclcr00.codigo_tipo_cliente
   AND ftredc00.emp = ftclcr00.emp
   AND ftredc00.fil = ftclcr00.fil
   AND ftredc00.tipo_contribuinte = ftclcr00.tipo_contribuinte
   AND ftredc00.codigo_rede = ftclcr00.codigo_rede
   AND gsesta00.estado_sigla = ftclcr00.emp_estado
   AND ftnfco00.emp = ftnfpr00.emp
   AND ftnfco00.fil = ftnfpr00.fil
   AND ftnfco00.nota_fiscal = ftnfpr00.nota_fiscal
   AND ftnfco00.serie = ftnfpr00.serie
   AND ftnfco00.data_emissao = ftnfpr00.data_emissao
   AND ftprod00.emp = ftnfpr00.emp
   AND ftprod00.fil = ftnfpr00.empfil
   AND ftprod00.tipo_cadastro = ftnfpr00.tipo_cad_promat
   AND ftprod00.codigo_produto = ftnfpr00.cod_produto
 GROUP BY ftnfco00.estado_cliente ,
          ftcofi00.grupo_faturamento ,
          ftprod00.tipo_cadastro||ftprod00.codigo_produto||'||'||gsames00.ano_mes ,
          ftprod00.descricao_produto||'||'||gsames00.descricao

I have created the DECODE, NVL and DIVIDE functions. The problem is that the WHERE condition makes this query too slow (about 4 min), while the same query on my Oracle database takes less than 40 seconds. I have tried to isolate the problem by taking off some fields, leaving just the first two fields in the query (ftnfco00.estado_cliente, ftcofi00.grupo_faturamento), and it still takes almost 4 min to return. Does anyone have a hint to give me to make it faster? Attached goes an EXPLAIN ANALYZE of this query.

Best regards,

Rhaoni Chiu Pereira
Sistêmica Computadores

Visit us on the Web: http://sistemica.info
Phone/Fax: +55 51 3328 1122
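For reference, a minimal sketch of what the NVL and DIVIDE compatibility helpers could look like as plain SQL wrappers; the numeric signatures and the divide-by-zero handling here are assumptions, since the actual definitions were not posted:

    CREATE OR REPLACE FUNCTION nvl(numeric, numeric) RETURNS numeric AS '
        -- Oracle-style NVL: return the second argument when the first is NULL.
        SELECT COALESCE($1, $2);
    ' LANGUAGE sql;

    CREATE OR REPLACE FUNCTION divide(numeric, numeric) RETURNS numeric AS '
        -- Return NULL instead of raising a division-by-zero error.
        SELECT CASE WHEN $2 = 0 THEN NULL ELSE $1 / $2 END;
    ' LANGUAGE sql;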
On Mon, 25 Aug 2003, Rhaoni Chiu Pereira wrote:

> Hi List,
>
> As I said before, I'm not a DBA "yet", but I'm learning... and I
> already have a PostgreSQL running, so I have to ask for some help...
> I got a SQL as follows:

...

Looking at the explain:

It's choosing lots of nested loops because it's expecting a small number of rows to be returned at each step, but in reality there are a lot of rows, so that may not really be a good choice.

For example, the scan of ftnfco00 is expected to return 295 rows but actually returns 9339, and it looks like it's not estimating the number of matches between the tables very well either, since the real count gets up to 240000 in a step where the estimated row count goes to 1.

What does EXPLAIN ANALYZE give after SET enable_nestloop = off;?
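A rough sketch of how to run that test in a psql session (the SET affects only the current session, so nothing in postgresql.conf needs to change):

    -- Discourage nested-loop joins for the current session only.
    SET enable_nestloop = off;

    -- Then prefix the posted query with EXPLAIN ANALYZE, e.g.
    --   EXPLAIN ANALYZE SELECT ftnfco00.estado_cliente, ... ;
    -- and compare the estimated row counts against the actual ones.

    -- Restore the default when done.
    SET enable_nestloop = on;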
On 25 Aug 2003 at 8:44, Stephan Szabo wrote:

> On Mon, 25 Aug 2003, Rhaoni Chiu Pereira wrote:
>
> > Hi List,
> >
> > As I said before, I'm not a DBA "yet", but I'm learning... and I
> > already have a PostgreSQL running, so I have to ask for some help...
> > I got a SQL as follows:
>
> ...
>
> Looking at the explain:
>
> It's choosing lots of nested loops because it's expecting a small number
> of rows to be returned at each step, but in reality there are a lot of
> rows, so that may not really be a good choice.
>
> For example, the scan of ftnfco00 is expected to return 295 rows but
> actually returns 9339, and it looks like it's not estimating the number
> of matches between the tables very well either, since the real count
> gets up to 240000 in a step where the estimated row count goes to 1.
>
> What does EXPLAIN ANALYZE give after SET enable_nestloop = off;?

In addition to that, if it is getting the stats wrong, does running VACUUM ANALYZE help? If the stats are updated, it should pick up proper plans, right?

Bye
 Shridhar

--
Flon's Law: There is not now, and never will be, a language in which it is the least bit difficult to write bad programs.
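A minimal sequence to try, with ftnfco00 and ftnfpr00 picked only because they are the tables discussed in the plan above, and with 100 as a purely illustrative statistics target:

    -- Refresh planner statistics for the whole database ...
    VACUUM ANALYZE;

    -- ... or only for the tables with the worst misestimates.
    VACUUM ANALYZE ftnfco00;
    VACUUM ANALYZE ftnfpr00;

    -- Optionally gather finer-grained statistics on a heavily filtered
    -- column (100 is an illustrative value, not a recommendation),
    -- then re-analyze so the new target takes effect.
    ALTER TABLE ftnfco00 ALTER COLUMN data_emissao SET STATISTICS 100;
    ANALYZE ftnfco00;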
Stephan Szabo wrote:

> Looking at the explain:

Veering aside a bit, since we usually pinpoint performance problems by looking at the differences between the planner's estimates and the actual execution stats in EXPLAIN ANALYZE, what's involved in parsing the EXPLAIN ANALYZE results and highlighting the places where they are way different? Bold, underline, or put some asterisks in front of those steps. That would make looking at big EXPLAIN ANALYZE trees much easier.

--
Linux homer 2.4.18-14 #1 Wed Sep 4 13:35:50 EDT 2002 i686 i686 i386 GNU/Linux
  2:30pm  up 243 days, 5:48, 8 users, load average: 5.52, 5.29, 5.10
On Tue, 26 Aug 2003, Ang Chin Han wrote:

> Stephan Szabo wrote:
>
> > Looking at the explain:
>
> Veering aside a bit, since we usually pinpoint performance problems by
> looking at the differences between the planner's estimates and the
> actual execution stats in EXPLAIN ANALYZE, what's involved in parsing
> the EXPLAIN ANALYZE results and highlighting the places where they are
> way different? Bold, underline, or put some asterisks in front of those
> steps.

The hardest part is determining where it matters, I think. You can use the row counts as the base for that, but going from 1 row to 50 is not necessarily going to be an issue, though it might be if a nested loop is chosen.
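To make that concrete with made-up numbers (illustrative only, not taken from the attached plan): if the planner expects 1 row from the outer side of a nested loop and each inner index probe costs about 3 page fetches, it prices the join at roughly 1 x 3 = 3 fetches; if the outer side actually produces 240000 rows, the same plan does about 240000 x 3 = 720000 fetches. The identical 1-vs-240000 misestimate under a hash or merge join would mostly just mean one larger scan, which is why the row-count error matters so much more when a nested loop gets chosen.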
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Tue, 26 Aug 2003, Ang Chin Han wrote:
>> Veering aside a bit, since we usually pinpoint performance problems by
>> looking at the differences between the planner's estimates and the
>> actual execution stats in EXPLAIN ANALYZE, what's involved in parsing
>> the EXPLAIN ANALYZE results and highlighting the places where they are
>> way different? Bold, underline, or put some asterisks in front of those
>> steps.

> The hardest part is determining where it matters, I think. You can use
> the row counts as the base for that, but going from 1 row to 50 is not
> necessarily going to be an issue, though it might be if a nested loop
> is chosen.

We've been chatting about this idea among the Red Hat group. The RHDB Visual Explain tool (get it at http://sources.redhat.com/rhdb/) already computes the percent of total runtime represented by each plan node. It seems like we could highlight nodes based on a large difference between estimated and actual percentage, or just highlight the nodes that are more than X percent of the runtime.

			regards, tom lane
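As a made-up illustration of that kind of flagging: if the top node of a plan reports an actual total time of 240 ms and one sort node underneath it reports 180 ms, that node accounts for 180 / 240 = 75% of the runtime; if its share of the total estimated cost was only, say, 10%, the gap between the two percentages marks it as exactly the sort of node worth highlighting.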
Tom Lane wrote:
> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > On Tue, 26 Aug 2003, Ang Chin Han wrote:
> >> Veering aside a bit, since we usually pinpoint performance problems by
> >> looking at the differences between the planner's estimates and the
> >> actual execution stats in EXPLAIN ANALYZE, what's involved in parsing
> >> the EXPLAIN ANALYZE results and highlighting the places where they are
> >> way different? Bold, underline, or put some asterisks in front of
> >> those steps.
>
> > The hardest part is determining where it matters, I think. You can use
> > the row counts as the base for that, but going from 1 row to 50 is not
> > necessarily going to be an issue, though it might be if a nested loop
> > is chosen.
>
> We've been chatting about this idea among the Red Hat group. The RHDB
> Visual Explain tool (get it at http://sources.redhat.com/rhdb/) already
> computes the percent of total runtime represented by each plan node.
> It seems like we could highlight nodes based on a large difference
> between estimated and actual percentage, or just highlight the nodes
> that are more than X percent of the runtime.

Is there a TODO here?  Perhaps:

	o Have EXPLAIN ANALYZE highlight poor optimizer estimates

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian wrote:
> Tom Lane wrote:
> > Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > > On Tue, 26 Aug 2003, Ang Chin Han wrote:
> > >> Veering aside a bit, since we usually pinpoint performance problems
> > >> by looking at the differences between the planner's estimates and
> > >> the actual execution stats in EXPLAIN ANALYZE, what's involved in
> > >> parsing the EXPLAIN ANALYZE results and highlighting the places
> > >> where they are way different? Bold, underline, or put some asterisks
> > >> in front of those steps.
> >
> > > The hardest part is determining where it matters, I think. You can
> > > use the row counts as the base for that, but going from 1 row to 50
> > > is not necessarily going to be an issue, though it might be if a
> > > nested loop is chosen.
> >
> > We've been chatting about this idea among the Red Hat group. The RHDB
> > Visual Explain tool (get it at http://sources.redhat.com/rhdb/) already
> > computes the percent of total runtime represented by each plan node.
> > It seems like we could highlight nodes based on a large difference
> > between estimated and actual percentage, or just highlight the nodes
> > that are more than X percent of the runtime.
>
> Is there a TODO here?  Perhaps:
>
> 	o Have EXPLAIN ANALYZE highlight poor optimizer estimates

No one commented, so I had to guess --- I added it.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073