Thread: Problem with collector statistic
Dear all,

Running a query on two identical databases, with the same indexes but slightly different data, I get very different performance. In one the index is used for the join; in the other it is a seqscan. The query, with its EXPLAIN ANALYZE output, follows.

Select a.CodPagador,b.Descricao
From Frete01 a Left Outer Join Frete02 b On (a.CodPagador = b.CodCliente)

EXPLAIN in the bad database:

Merge Left Join  (cost=46619.72..48113.50 rows=58995 width=39) (actual time=7435.922..8359.063 rows=59161 loops=1)
  Merge Cond: ("outer".codpagador = "inner".codcliente)
  ->  Sort  (cost=24649.80..24797.29 rows=58995 width=11) (actual time=4063.517..4171.374 rows=59161 loops=1)
        Sort Key: a.codpagador
        ->  Seq Scan on frete01 a  (cost=0.00..19974.95 rows=58995 width=11) (actual time=9.007..3223.000 rows=59161 loops=1)
  ->  Sort  (cost=21969.91..22274.37 rows=121784 width=39) (actual time=3372.375..3674.563 rows=169818 loops=1)
        Sort Key: b.codcliente
        ->  Seq Scan on frete02 b  (cost=0.00..8127.84 rows=121784 width=39) (actual time=17.330..1608.599 rows=121740 loops=1)
Total runtime: 8448.797 ms

EXPLAIN in the good database:

Limit  (cost=0.00..46856.62 rows=55204 width=39) (actual time=11.205..536.761 rows=55204 loops=1)
  ->  Nested Loop Left Join  (cost=0.00..295237.26 rows=55204 width=39) (actual time=11.201..508.936 rows=55204 loops=1)
        ->  Seq Scan on frete01 a  (cost=0.00..19974.95 rows=55204 width=11) (actual time=11.057..220.711 rows=55204 loops=1)
        ->  Index Scan using frete02_f021 on frete02 b  (cost=0.00..4.65 rows=1 width=39) (actual time=0.021..0.024 rows=1 loops=9363)
              Index Cond: ("outer".codpagador = b.codcliente)
Total runtime: 551.331 ms

Thanks in advance for your attention,

Marco Aurélio V. da Silva
Prodata Inf. e Cad. Ltda.
MSN: marco@prodatanet.com.br
Fone: (33) 3322-3082
"Marco Aurélio V. da Silva" <marcoprodata@gmail.com> writes:
> Running a query on two identical databases, with the same indexes but
> slightly different data, I get very different performance. In one the
> index is used for the join; in the other it is a seqscan.

The fact that one explain includes a Limit step, and the other does not,
says that you are not really issuing the same query in both cases.

			regards, tom lane
Marco Aurélio V. da Silva <marcoprodata@gmail.com> writes:
> Select a.CodPagador,b.Descricao
> From Frete01 a Left Outer Join Frete02 b On (a.CodPagador = b.CodCliente)
>
> EXPLAIN in the good database:
> Limit  (cost=0.00..46856.62 rows=55204 width=39) (actual time=11.205..536.761 rows=55204 loops=1)

That doesn't match the query you describe. Notably, there's no LIMIT in
the query (or in the other plan).

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!
Sorry,

I discovered that the problem appears to be something else. Testing against the same database, it is the query's LIMIT that matters: with LIMIT 9364 the problem appears, and with LIMIT 9363 it does not.

The bad query, with bad results:

explain analyse
Select a.CodPagador,b.Descricao
From Frete01 a Left Outer Join Frete02 b On (a.CodPagador = b.CodCliente)
limit 9364

Limit  (cost=46619.72..46856.82 rows=9364 width=39) (actual time=3442.510..3561.769 rows=9364 loops=1)
  ->  Merge Left Join  (cost=46619.72..48113.50 rows=58995 width=39) (actual time=3442.505..3535.236 rows=9364 loops=1)
        Merge Cond: ("outer".codpagador = "inner".codcliente)
        ->  Sort  (cost=24649.80..24797.29 rows=58995 width=11) (actual time=1347.896..1364.993 rows=9364 loops=1)
              Sort Key: a.codpagador
              ->  Seq Scan on frete01 a  (cost=0.00..19974.95 rows=58995 width=11) (actual time=9.001..558.582 rows=59161 loops=1)
        ->  Sort  (cost=21969.91..22274.37 rows=121784 width=39) (actual time=2094.581..2115.666 rows=11976 loops=1)
              Sort Key: b.codcliente
              ->  Seq Scan on frete02 b  (cost=0.00..8127.84 rows=121784 width=39) (actual time=2.149..383.775 rows=121740 loops=1)
Total runtime: 3580.588 ms

The good query, with good results:

explain analyse
Select a.CodPagador,b.Descricao
From Frete01 a Left Outer Join Frete02 b On (a.CodPagador = b.CodCliente)
limit 9363

Limit  (cost=0.00..46856.62 rows=9363 width=39) (actual time=50.652..1864.420 rows=9363 loops=1)
  ->  Nested Loop Left Join  (cost=0.00..295237.26 rows=58995 width=39) (actual time=50.647..1836.360 rows=9363 loops=1)
        ->  Seq Scan on frete01 a  (cost=0.00..19974.95 rows=58995 width=11) (actual time=8.948..215.558 rows=9363 loops=1)
        ->  Index Scan using frete02_f021 on frete02 b  (cost=0.00..4.65 rows=1 width=39) (actual time=0.163..0.166 rows=1 loops=9363)
              Index Cond: ("outer".codpagador = b.codcliente)
Total runtime: 1879.041 ms

Thanks for your attention.

Marco Aurélio V. da Silva
Prodata Inf. e Cad. Ltda.
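(A diagnostic sketch of how to confirm where the planner's tipping point lies: the enable_* settings are standard PostgreSQL planner GUCs, intended for testing only, and the query is the one from the plans above. Disabling the merge join at the "bad" LIMIT shows what the planner estimates for the nested-loop alternative.)

```sql
-- Force the planner away from the merge join at the problematic LIMIT,
-- so the cost and actual time of the nested-loop plan can be compared.
SET enable_mergejoin = off;

EXPLAIN ANALYSE
SELECT a.CodPagador, b.Descricao
FROM Frete01 a
LEFT OUTER JOIN Frete02 b ON (a.CodPagador = b.CodCliente)
LIMIT 9364;

-- Restore the default for normal operation.
RESET enable_mergejoin;
```

If the forced nested-loop plan runs much faster than its estimated cost suggests, the cost parameters, not the plan search, are what need tuning.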
MSN: marco@prodatanet.com.br
Fone: (33) 3322-3082

----- Original Message ----- 
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Marco Aurélio V. da Silva" <marcoprodata@gmail.com>
Cc: <pgsql-general@postgresql.org>
Sent: Wednesday, December 19, 2007 2:09 PM
Subject: Re: [GENERAL] Problem with collector statistic

> "Marco Aurélio V. da Silva" <marcoprodata@gmail.com> writes:
>> Running a query on two identical databases, with the same indexes but
>> slightly different data, I get very different performance. In one the
>> index is used for the join; in the other it is a seqscan.
>
> The fact that one explain includes a Limit step, and the other does not,
> says that you are not really issuing the same query in both cases.
>
> regards, tom lane
On Dec 19, 2007 2:06 PM, Marco Aurélio V. da Silva <marcoprodata@gmail.com> wrote:
> Sorry,
>
> I discovered that the problem appears to be something else. Testing against
> the same database, it is the query's LIMIT that matters: with LIMIT 9364
> the problem appears, and with LIMIT 9363 it does not.

Sounds like pgsql is switching to a sequential scan a little too
quickly for your dataset / server / etc.  I'd say look at increasing
effective_cache_size and lowering random_page_cost.
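(A sketch of the tuning suggested above. effective_cache_size and random_page_cost are standard PostgreSQL cost parameters; the values here are illustrative, not recommendations — size them for your own RAM and disks.)

```sql
-- Try the settings in a session first, to see the effect on the plan
-- before changing anything permanently.
SET effective_cache_size = '1GB';   -- roughly the OS/shared cache available to Postgres
SET random_page_cost = 2.0;         -- lower than the default of 4.0

EXPLAIN ANALYSE
SELECT a.CodPagador, b.Descricao
FROM Frete01 a
LEFT OUTER JOIN Frete02 b ON (a.CodPagador = b.CodCliente)
LIMIT 9364;

-- If the planner now keeps the index-scan nested loop, make the change
-- permanent in postgresql.conf:
--   effective_cache_size = 1GB
--   random_page_cost = 2.0
```

A lower random_page_cost tells the planner random index reads are cheap relative to sequential reads (true when the working set is cached), and a larger effective_cache_size makes repeated index probes look cheaper, both of which push the tipping point toward the index-scan plan.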
Dear Scott,

Thanks, this solved my problem. I was using a very low value for effective_cache_size.

Thanks for your attention,

Marco Aurélio V. da Silva
Prodata Inf. e Cad. Ltda.
MSN: marco@prodatanet.com.br
Fone: (33) 3322-3082

----- Original Message ----- 
From: "Scott Marlowe" <scott.marlowe@gmail.com>
To: "Marco Aurélio V. da Silva" <marcoprodata@gmail.com>
Cc: "Tom Lane" <tgl@sss.pgh.pa.us>; <pgsql-general@postgresql.org>
Sent: Wednesday, December 19, 2007 4:16 PM
Subject: Re: [GENERAL] Problem with collector statistic

On Dec 19, 2007 2:06 PM, Marco Aurélio V. da Silva <marcoprodata@gmail.com> wrote:
> Sorry,
>
> I discovered that the problem appears to be something else. Testing against
> the same database, it is the query's LIMIT that matters: with LIMIT 9364
> the problem appears, and with LIMIT 9363 it does not.

Sounds like pgsql is switching to a sequential scan a little too
quickly for your dataset / server / etc.  I'd say look at increasing
effective_cache_size and lowering random_page_cost.