Re: Bad performace of a query - Mailing list pgsql-general

From Kaloyan Iliev
Subject Re: Bad performace of a query
Date
Msg-id 45E53962.9070906@mbox.digsys.bg
Whole thread Raw
In response to Bad performace of a query  ("Rafa Comino" <rafacomino@gmail.com>)
List pgsql-general
<meta content="text/html;charset=windows-1251"
 http-equiv="Content-Type">


<body alink="#ee0000" bgcolor="#ffffff" link="#0000ee" text="#000000"
 vlink="#551a8b">


Hi,

The index doesn't cost you so much, seq SEQ Scan actully does:
 Seq Scan on isbns_a_descubrir 
(cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1
loops=2025)

This seq scan is called once for every row of librosdisponibilidadtemp
which passes the WHERE condition.
So Here "Index Scan using
librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp 
(cost=
1.01..133557993.56 rows=2068 width=21) (actual
time=5722.607..790552.588 rows=9 loops=1)" it
says how much it will cost you to calculate the upper seq scan and the
seq scan on (Seq Scan on
raizpaises) and the index scan on libros.

I suggest you to create index on table isbns_a_descubrir over
column isbn. This will hurry the query.
And use join instead of IN for table raizpaises. This should also save
some time.

Regards,
   Kaloyan Iliev

<blockquote
 cite="midbd8b58a40702270528i7c395029v4a8bf3eb442bc8c9@mail.gmail.com"
 type="cite">ave this query:

SELECT DISTINCT isbn, CURRENT_TIMESTAMP, 1
              FROM librosdisponibilidadtemp
             WHERE proceso = ai_proceso
               AND gen_isbn_pais(isbn) IN (SELECT pais FROM raizpaises)

               AND NOT EXISTS
                   ( SELECT isbn
                       FROM libros
                      WHERE isbn = librosdisponibilidadtemp.isbn)
               AND NOT EXISTS
                   ( SELECT isbn
                       FROM isbns_a_descubrir
                      WHERE isbn = librosdisponibilidadtemp.isbn);

and the plan execution is
Unique  (cost=133558107.45..133558128.13 rows=414 width=21) (actual
time=
790552.899..790553.098 rows=9 loops=1)
  ->  Sort  (cost=133558107.45..133558112.62 rows=2068 width=21)
(actual time=790552.882..790552.944 rows=9 loops=1)
        Sort Key: isbn, now(), 1
        ->  Index Scan using librosdisponibilidadtemp_idx_proceso on
librosdisponibilidadtemp  (cost=
1.01..133557993.56 rows=2068 width=21) (actual
time=5722.607..790552.588 rows=9 loops=1)
              Index Cond: (proceso = 28465)
              Filter: ((hashed subplan) AND (NOT (subplan)) AND (NOT
(subplan)))
              SubPlan
                ->  Seq Scan on isbns_a_descubrir 
(cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1
loops=2025)
                      Filter: ((isbn)::bpchar = $1)
                ->  Index Scan using "libros_idx_ISBN" on libros 
(cost=0.00..5.95 rows=1 width=21) (actual time=12.938..12.938 rows=1
loops=50512)
                      Index Cond: (isbn = $1)
                ->  Seq Scan on raizpaises  (cost=
0.00..1.01 rows=1 width=10) (actual time=0.764..0.871 rows=1 loops=1)
Total runtime: 790553.561 ms

The libros table has 1200000 regs.
The isbns_a_descubrir table has 300000 regs.
The librosdisponibilidadtemp table has 50000 regs.

does anybody can explain me, why using index
ibrosdisponibilidadtemp_idx_proceso is so slow and the others
conditions are good enough
Thanks everybody

pgsql-general by date:

Previous
From: "Andy Dale"
Date:
Subject: Re: PostgreSQL 8.2.3, Jboss 4.0.3 and postgresql-8.2-504.jdbc4
Next
From: Brent Wood
Date:
Subject: Re: performance of partitioning?