Thread: Bad performace of a query

Bad performace of a query

From

"Rafa Comino"

Date:

27 February 2007, 09:29:13

I have this query:

SELECT DISTINCT isbn, CURRENT_TIMESTAMP, 1
              FROM librosdisponibilidadtemp
             WHERE proceso = ai_proceso
               AND gen_isbn_pais(isbn) IN (SELECT pais FROM raizpaises)
               AND NOT EXISTS
                   ( SELECT isbn
                       FROM libros
                      WHERE isbn = librosdisponibilidadtemp.isbn)
               AND NOT EXISTS
                   ( SELECT isbn
                       FROM isbns_a_descubrir
                      WHERE isbn = librosdisponibilidadtemp.isbn);

and the plan execution is
Unique (cost=133558107.45..133558128.13 rows=414 width=21) (actual time= 790552.899..790553.098 rows=9 loops=1)
-> Sort (cost=133558107.45..133558112.62 rows=2068 width=21) (actual time=790552.882..790552.944 rows=9 loops=1)
        Sort Key: isbn, now(), 1
        -> Index Scan using librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp (cost= 1.01..133557993.56 rows=2068 width=21) (actual time=5722.607..790552.588 rows=9 loops=1)
              Index Cond: (proceso = 28465)
              Filter: ((hashed subplan) AND (NOT (subplan)) AND (NOT (subplan)))
              SubPlan
                -> Seq Scan on isbns_a_descubrir (cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1 loops=2025)
                      Filter: ((isbn)::bpchar = $1)
                -> Index Scan using "libros_idx_ISBN" on libros (cost=0.00..5.95 rows=1 width=21) (actual time=12.938..12.938 rows=1 loops=50512)
                      Index Cond: (isbn = $1)
                -> Seq Scan on raizpaises (cost= 0.00..1.01 rows=1 width=10) (actual time=0.764..0.871 rows=1 loops=1)
Total runtime: 790553.561 ms

The libros table has 1200000 regs.
The isbns_a_descubrir table has 300000 regs.
The librosdisponibilidadtemp table has 50000 regs.

does anybody can explain me, why using index ibrosdisponibilidadtemp_idx_proceso is so slow and the others conditions are good enough
Thanks everybody

Re: Bad performace of a query

From

Kaloyan Iliev

Date:

28 February 2007, 04:48:52

<meta content="text/html;charset=windows-1251"
 http-equiv="Content-Type">


<body alink="#ee0000" bgcolor="#ffffff" link="#0000ee" text="#000000"
 vlink="#551a8b">




Hi,

The index doesn't cost you so much, seq SEQ Scan actully does:
 Seq Scan on isbns_a_descubrir 
(cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1
loops=2025)

This seq scan is called once for every row of librosdisponibilidadtemp
which passes the WHERE condition.
So Here "Index Scan using
librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp 
(cost=
1.01..133557993.56 rows=2068 width=21) (actual
time=5722.607..790552.588 rows=9 loops=1)" it
says how much it will cost you to calculate the upper seq scan and the
seq scan on (Seq Scan on
raizpaises) and the index scan on libros.

I suggest you to create index on table isbns_a_descubrir over
column isbn. This will hurry the query.
And use join instead of IN for table raizpaises. This should also save
some time.

Regards,
   Kaloyan Iliev

<blockquote
 cite="midbd8b58a40702270528i7c395029v4a8bf3eb442bc8c9@mail.gmail.com"
 type="cite">ave this query:

SELECT DISTINCT isbn, CURRENT_TIMESTAMP, 1
              FROM librosdisponibilidadtemp
             WHERE proceso = ai_proceso
               AND gen_isbn_pais(isbn) IN (SELECT pais FROM raizpaises)

               AND NOT EXISTS
                   ( SELECT isbn
                       FROM libros
                      WHERE isbn = librosdisponibilidadtemp.isbn)
               AND NOT EXISTS
                   ( SELECT isbn
                       FROM isbns_a_descubrir
                      WHERE isbn = librosdisponibilidadtemp.isbn);

and the plan execution is
Unique  (cost=133558107.45..133558128.13 rows=414 width=21) (actual
time=
790552.899..790553.098 rows=9 loops=1)
  ->  Sort  (cost=133558107.45..133558112.62 rows=2068 width=21)
(actual time=790552.882..790552.944 rows=9 loops=1)
        Sort Key: isbn, now(), 1
        ->  Index Scan using librosdisponibilidadtemp_idx_proceso on
librosdisponibilidadtemp  (cost=
1.01..133557993.56 rows=2068 width=21) (actual
time=5722.607..790552.588 rows=9 loops=1)
              Index Cond: (proceso = 28465)
              Filter: ((hashed subplan) AND (NOT (subplan)) AND (NOT
(subplan)))
              SubPlan
                ->  Seq Scan on isbns_a_descubrir 
(cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1
loops=2025)
                      Filter: ((isbn)::bpchar = $1)
                ->  Index Scan using "libros_idx_ISBN" on libros 
(cost=0.00..5.95 rows=1 width=21) (actual time=12.938..12.938 rows=1
loops=50512)
                      Index Cond: (isbn = $1)
                ->  Seq Scan on raizpaises  (cost=
0.00..1.01 rows=1 width=10) (actual time=0.764..0.871 rows=1 loops=1)
Total runtime: 790553.561 ms

The libros table has 1200000 regs.
The isbns_a_descubrir table has 300000 regs.
The librosdisponibilidadtemp table has 50000 regs.

does anybody can explain me, why using index
ibrosdisponibilidadtemp_idx_proceso is so slow and the others
conditions are good enough
Thanks everybody

Re: Bad performace of a query

From

Kaloyan Iliev

Date:

01 March 2007, 11:49:23

<meta content="text/html;charset=windows-1251"
 http-equiv="Content-Type">


<body alink="#ee0000" bgcolor="#ffffff" link="#0000ee" text="#000000"
 vlink="#551a8b">
Hi,

The index doesn't cost you so much, seq SEQ Scan actully does:
 Seq Scan on isbns_a_descubrir 
(cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1
loops=2025)

This seq scan is called once for every row of librosdisponibilidadtemp
which passes the WHERE condition.
So Here "Index Scan using
librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp 
(cost=
1.01..133557993.56 rows=2068 width=21) (actual
time=5722.607..790552.588 rows=9 loops=1)" it
says how much it will cost you to calculate the upper seq scan and the
seq scan on (Seq Scan on
raizpaises) and the index scan on libros.

I suggest you to create index on table isbns_a_descubrir over
column isbn. This will hurry the query.
And use join instead of IN for table raizpaises. This should also save
some time.

Regards,
   Kaloyan Iliev

<blockquote
 cite="midbd8b58a40702270528i7c395029v4a8bf3eb442bc8c9@mail.gmail.com"
 type="cite">ave this query:

SELECT DISTINCT isbn, CURRENT_TIMESTAMP, 1
              FROM librosdisponibilidadtemp
             WHERE proceso = ai_proceso
               AND gen_isbn_pais(isbn) IN (SELECT pais FROM raizpaises)

               AND NOT EXISTS
                   ( SELECT isbn
                       FROM libros
                      WHERE isbn = librosdisponibilidadtemp.isbn)
               AND NOT EXISTS
                   ( SELECT isbn

                       FROM isbns_a_descubrir
                      WHERE isbn = librosdisponibilidadtemp.isbn);

and the plan execution is
Unique  (cost=133558107.45..133558128.13 rows=414 width=21) (actual
time=
790552.899..790553.098 rows=9 loops=1)
  ->  Sort  (cost=133558107.45..133558112.62 rows=2068 width=21)
(actual time=790552.882..790552.944 rows=9 loops=1)
        Sort Key: isbn, now(), 1
        ->  Index Scan using librosdisponibilidadtemp_idx_proceso on
librosdisponibilidadtemp  (cost=
1.01..133557993.56 rows=2068 width=21) (actual
time=5722.607..790552.588 rows=9 loops=1)
              Index Cond: (proceso = 28465)
              Filter: ((hashed subplan) AND (NOT (subplan)) AND (NOT
(subplan)))
              SubPlan
                ->  Seq Scan on isbns_a_descubrir 
(cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1
loops=2025)
                      Filter: ((isbn)::bpchar = $1)
                ->  Index Scan using "libros_idx_ISBN" on libros 
(cost=0.00..5.95 rows=1 width=21) (actual time=12.938..12.938 rows=1
loops=50512)
                      Index Cond: (isbn = $1)
                ->  Seq Scan on raizpaises  (cost=
0.00..1.01 rows=1 width=10) (actual time=0.764..0.871 rows=1 loops=1)
Total runtime: 790553.561 ms

The libros table has 1200000 regs.
The isbns_a_descubrir table has 300000 regs.
The librosdisponibilidadtemp table has 50000 regs.

does anybody can explain me, why using index
ibrosdisponibilidadtemp_idx_proceso is so slow and the others
conditions are good enough
Thanks everybody

Re: Bad performace of a query

From

Kaloyan Iliev

Date:

01 March 2007, 11:56:17

<meta content="text/html;charset=windows-1251"
 http-equiv="Content-Type">


<body alink="#ee0000" bgcolor="#ffffff" link="#0000ee" text="#000000"
 vlink="#551a8b">


Hi,

The index doesn't cost you so much, seq SEQ Scan actully does:
 Seq Scan on isbns_a_descubrir 
(cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1
loops=2025)

This seq scan is called once for every row of librosdisponibilidadtemp
which passes the WHERE condition.
So Here "Index Scan using
librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp 
(cost=
1.01..133557993.56 rows=2068 width=21) (actual
time=5722.607..790552.588 rows=9 loops=1)" it
says how much it will cost you to calculate the upper seq scan and the
seq scan on (Seq Scan on
raizpaises) and the index scan on libros.

I suggest you to create index on table isbns_a_descubrir over
column isbn. This will hurry the query.
And use join instead of IN for table raizpaises. This should also save
some time.

Regards,
   Kaloyan Iliev

<blockquote
 cite="midbd8b58a40702270528i7c395029v4a8bf3eb442bc8c9@mail.gmail.com"
 type="cite">ave this query:

SELECT DISTINCT isbn, CURRENT_TIMESTAMP, 1
              FROM librosdisponibilidadtemp
             WHERE proceso = ai_proceso
               AND gen_isbn_pais(isbn) IN (SELECT pais FROM raizpaises)

               AND NOT EXISTS
                   ( SELECT isbn
                       FROM libros
                      WHERE isbn = librosdisponibilidadtemp.isbn)
               AND NOT EXISTS
                   ( SELECT isbn
                       FROM isbns_a_descubrir
                      WHERE isbn = librosdisponibilidadtemp.isbn);

and the plan execution is
Unique  (cost=133558107.45..133558128.13 rows=414 width=21) (actual
time=
790552.899..790553.098 rows=9 loops=1)
  ->  Sort  (cost=133558107.45..133558112.62 rows=2068 width=21)
(actual time=790552.882..790552.944 rows=9 loops=1)
        Sort Key: isbn, now(), 1
        ->  Index Scan using librosdisponibilidadtemp_idx_proceso on
librosdisponibilidadtemp  (cost=
1.01..133557993.56 rows=2068 width=21) (actual
time=5722.607..790552.588 rows=9 loops=1)
              Index Cond: (proceso = 28465)
              Filter: ((hashed subplan) AND (NOT (subplan)) AND (NOT
(subplan)))
              SubPlan
                ->  Seq Scan on isbns_a_descubrir 
(cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1
loops=2025)
                      Filter: ((isbn)::bpchar = $1)
                ->  Index Scan using "libros_idx_ISBN" on libros 
(cost=0.00..5.95 rows=1 width=21) (actual time=12.938..12.938 rows=1
loops=50512)
                      Index Cond: (isbn = $1)
                ->  Seq Scan on raizpaises  (cost=
0.00..1.01 rows=1 width=10) (actual time=0.764..0.871 rows=1 loops=1)
Total runtime: 790553.561 ms

The libros table has 1200000 regs.
The isbns_a_descubrir table has 300000 regs.
The librosdisponibilidadtemp table has 50000 regs.

does anybody can explain me, why using index
ibrosdisponibilidadtemp_idx_proceso is so slow and the others
conditions are good enough
Thanks everybody