Thread: Bad performace of a query
I have this query:
SELECT DISTINCT isbn, CURRENT_TIMESTAMP, 1
FROM librosdisponibilidadtemp
WHERE proceso = ai_proceso
AND gen_isbn_pais(isbn) IN (SELECT pais FROM raizpaises)
AND NOT EXISTS
( SELECT isbn
FROM libros
WHERE isbn = librosdisponibilidadtemp.isbn)
AND NOT EXISTS
( SELECT isbn
FROM isbns_a_descubrir
WHERE isbn = librosdisponibilidadtemp.isbn);
and the plan execution is
Unique (cost=133558107.45..133558128.13 rows=414 width=21) (actual time= 790552.899..790553.098 rows=9 loops=1)
-> Sort (cost=133558107.45..133558112.62 rows=2068 width=21) (actual time=790552.882..790552.944 rows=9 loops=1)
Sort Key: isbn, now(), 1
-> Index Scan using librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp (cost= 1.01..133557993.56 rows=2068 width=21) (actual time=5722.607..790552.588 rows=9 loops=1)
Index Cond: (proceso = 28465)
Filter: ((hashed subplan) AND (NOT (subplan)) AND (NOT (subplan)))
SubPlan
-> Seq Scan on isbns_a_descubrir (cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1 loops=2025)
Filter: ((isbn)::bpchar = $1)
-> Index Scan using "libros_idx_ISBN" on libros (cost=0.00..5.95 rows=1 width=21) (actual time=12.938..12.938 rows=1 loops=50512)
Index Cond: (isbn = $1)
-> Seq Scan on raizpaises (cost= 0.00..1.01 rows=1 width=10) (actual time=0.764..0.871 rows=1 loops=1)
Total runtime: 790553.561 ms
The libros table has 1200000 regs.
The isbns_a_descubrir table has 300000 regs.
The librosdisponibilidadtemp table has 50000 regs.
does anybody can explain me, why using index ibrosdisponibilidadtemp_idx_proceso is so slow and the others conditions are good enough
Thanks everybody
SELECT DISTINCT isbn, CURRENT_TIMESTAMP, 1
FROM librosdisponibilidadtemp
WHERE proceso = ai_proceso
AND gen_isbn_pais(isbn) IN (SELECT pais FROM raizpaises)
AND NOT EXISTS
( SELECT isbn
FROM libros
WHERE isbn = librosdisponibilidadtemp.isbn)
AND NOT EXISTS
( SELECT isbn
FROM isbns_a_descubrir
WHERE isbn = librosdisponibilidadtemp.isbn);
and the plan execution is
Unique (cost=133558107.45..133558128.13 rows=414 width=21) (actual time= 790552.899..790553.098 rows=9 loops=1)
-> Sort (cost=133558107.45..133558112.62 rows=2068 width=21) (actual time=790552.882..790552.944 rows=9 loops=1)
Sort Key: isbn, now(), 1
-> Index Scan using librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp (cost= 1.01..133557993.56 rows=2068 width=21) (actual time=5722.607..790552.588 rows=9 loops=1)
Index Cond: (proceso = 28465)
Filter: ((hashed subplan) AND (NOT (subplan)) AND (NOT (subplan)))
SubPlan
-> Seq Scan on isbns_a_descubrir (cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1 loops=2025)
Filter: ((isbn)::bpchar = $1)
-> Index Scan using "libros_idx_ISBN" on libros (cost=0.00..5.95 rows=1 width=21) (actual time=12.938..12.938 rows=1 loops=50512)
Index Cond: (isbn = $1)
-> Seq Scan on raizpaises (cost= 0.00..1.01 rows=1 width=10) (actual time=0.764..0.871 rows=1 loops=1)
Total runtime: 790553.561 ms
The libros table has 1200000 regs.
The isbns_a_descubrir table has 300000 regs.
The librosdisponibilidadtemp table has 50000 regs.
does anybody can explain me, why using index ibrosdisponibilidadtemp_idx_proceso is so slow and the others conditions are good enough
Thanks everybody
<meta content="text/html;charset=windows-1251" http-equiv="Content-Type"> <body alink="#ee0000" bgcolor="#ffffff" link="#0000ee" text="#000000" vlink="#551a8b"> Hi, The index doesn't cost you so much, seq SEQ Scan actully does: Seq Scan on isbns_a_descubrir (cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1 loops=2025) This seq scan is called once for every row of librosdisponibilidadtemp which passes the WHERE condition. So Here "Index Scan using librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp (cost= 1.01..133557993.56 rows=2068 width=21) (actual time=5722.607..790552.588 rows=9 loops=1)" it says how much it will cost you to calculate the upper seq scan and the seq scan on (Seq Scan on raizpaises) and the index scan on libros. I suggest you to create index on table isbns_a_descubrir over column isbn. This will hurry the query. And use join instead of IN for table raizpaises. This should also save some time. Regards, Kaloyan Iliev <blockquote cite="midbd8b58a40702270528i7c395029v4a8bf3eb442bc8c9@mail.gmail.com" type="cite">ave this query: SELECT DISTINCT isbn, CURRENT_TIMESTAMP, 1 FROM librosdisponibilidadtemp WHERE proceso = ai_proceso AND gen_isbn_pais(isbn) IN (SELECT pais FROM raizpaises) AND NOT EXISTS ( SELECT isbn FROM libros WHERE isbn = librosdisponibilidadtemp.isbn) AND NOT EXISTS ( SELECT isbn FROM isbns_a_descubrir WHERE isbn = librosdisponibilidadtemp.isbn); and the plan execution is Unique (cost=133558107.45..133558128.13 rows=414 width=21) (actual time= 790552.899..790553.098 rows=9 loops=1) -> Sort (cost=133558107.45..133558112.62 rows=2068 width=21) (actual time=790552.882..790552.944 rows=9 loops=1) Sort Key: isbn, now(), 1 -> Index Scan using librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp (cost= 1.01..133557993.56 rows=2068 width=21) (actual time=5722.607..790552.588 rows=9 loops=1) Index Cond: (proceso = 28465) Filter: ((hashed subplan) AND (NOT (subplan)) AND (NOT (subplan))) SubPlan -> Seq Scan on isbns_a_descubrir (cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1 loops=2025) Filter: ((isbn)::bpchar = $1) -> Index Scan using "libros_idx_ISBN" on libros (cost=0.00..5.95 rows=1 width=21) (actual time=12.938..12.938 rows=1 loops=50512) Index Cond: (isbn = $1) -> Seq Scan on raizpaises (cost= 0.00..1.01 rows=1 width=10) (actual time=0.764..0.871 rows=1 loops=1) Total runtime: 790553.561 ms The libros table has 1200000 regs. The isbns_a_descubrir table has 300000 regs. The librosdisponibilidadtemp table has 50000 regs. does anybody can explain me, why using index ibrosdisponibilidadtemp_idx_proceso is so slow and the others conditions are good enough Thanks everybody
<meta content="text/html;charset=windows-1251" http-equiv="Content-Type"> <body alink="#ee0000" bgcolor="#ffffff" link="#0000ee" text="#000000" vlink="#551a8b"> Hi, The index doesn't cost you so much, seq SEQ Scan actully does: Seq Scan on isbns_a_descubrir (cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1 loops=2025) This seq scan is called once for every row of librosdisponibilidadtemp which passes the WHERE condition. So Here "Index Scan using librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp (cost= 1.01..133557993.56 rows=2068 width=21) (actual time=5722.607..790552.588 rows=9 loops=1)" it says how much it will cost you to calculate the upper seq scan and the seq scan on (Seq Scan on raizpaises) and the index scan on libros. I suggest you to create index on table isbns_a_descubrir over column isbn. This will hurry the query. And use join instead of IN for table raizpaises. This should also save some time. Regards, Kaloyan Iliev <blockquote cite="midbd8b58a40702270528i7c395029v4a8bf3eb442bc8c9@mail.gmail.com" type="cite">ave this query: SELECT DISTINCT isbn, CURRENT_TIMESTAMP, 1 FROM librosdisponibilidadtemp WHERE proceso = ai_proceso AND gen_isbn_pais(isbn) IN (SELECT pais FROM raizpaises) AND NOT EXISTS ( SELECT isbn FROM libros WHERE isbn = librosdisponibilidadtemp.isbn) AND NOT EXISTS ( SELECT isbn FROM isbns_a_descubrir WHERE isbn = librosdisponibilidadtemp.isbn); and the plan execution is Unique (cost=133558107.45..133558128.13 rows=414 width=21) (actual time= 790552.899..790553.098 rows=9 loops=1) -> Sort (cost=133558107.45..133558112.62 rows=2068 width=21) (actual time=790552.882..790552.944 rows=9 loops=1) Sort Key: isbn, now(), 1 -> Index Scan using librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp (cost= 1.01..133557993.56 rows=2068 width=21) (actual time=5722.607..790552.588 rows=9 loops=1) Index Cond: (proceso = 28465) Filter: ((hashed subplan) AND (NOT (subplan)) AND (NOT (subplan))) SubPlan -> Seq Scan on isbns_a_descubrir (cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1 loops=2025) Filter: ((isbn)::bpchar = $1) -> Index Scan using "libros_idx_ISBN" on libros (cost=0.00..5.95 rows=1 width=21) (actual time=12.938..12.938 rows=1 loops=50512) Index Cond: (isbn = $1) -> Seq Scan on raizpaises (cost= 0.00..1.01 rows=1 width=10) (actual time=0.764..0.871 rows=1 loops=1) Total runtime: 790553.561 ms The libros table has 1200000 regs. The isbns_a_descubrir table has 300000 regs. The librosdisponibilidadtemp table has 50000 regs. does anybody can explain me, why using index ibrosdisponibilidadtemp_idx_proceso is so slow and the others conditions are good enough Thanks everybody
<meta content="text/html;charset=windows-1251" http-equiv="Content-Type"> <body alink="#ee0000" bgcolor="#ffffff" link="#0000ee" text="#000000" vlink="#551a8b"> Hi, The index doesn't cost you so much, seq SEQ Scan actully does: Seq Scan on isbns_a_descubrir (cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1 loops=2025) This seq scan is called once for every row of librosdisponibilidadtemp which passes the WHERE condition. So Here "Index Scan using librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp (cost= 1.01..133557993.56 rows=2068 width=21) (actual time=5722.607..790552.588 rows=9 loops=1)" it says how much it will cost you to calculate the upper seq scan and the seq scan on (Seq Scan on raizpaises) and the index scan on libros. I suggest you to create index on table isbns_a_descubrir over column isbn. This will hurry the query. And use join instead of IN for table raizpaises. This should also save some time. Regards, Kaloyan Iliev <blockquote cite="midbd8b58a40702270528i7c395029v4a8bf3eb442bc8c9@mail.gmail.com" type="cite">ave this query: SELECT DISTINCT isbn, CURRENT_TIMESTAMP, 1 FROM librosdisponibilidadtemp WHERE proceso = ai_proceso AND gen_isbn_pais(isbn) IN (SELECT pais FROM raizpaises) AND NOT EXISTS ( SELECT isbn FROM libros WHERE isbn = librosdisponibilidadtemp.isbn) AND NOT EXISTS ( SELECT isbn FROM isbns_a_descubrir WHERE isbn = librosdisponibilidadtemp.isbn); and the plan execution is Unique (cost=133558107.45..133558128.13 rows=414 width=21) (actual time= 790552.899..790553.098 rows=9 loops=1) -> Sort (cost=133558107.45..133558112.62 rows=2068 width=21) (actual time=790552.882..790552.944 rows=9 loops=1) Sort Key: isbn, now(), 1 -> Index Scan using librosdisponibilidadtemp_idx_proceso on librosdisponibilidadtemp (cost= 1.01..133557993.56 rows=2068 width=21) (actual time=5722.607..790552.588 rows=9 loops=1) Index Cond: (proceso = 28465) Filter: ((hashed subplan) AND (NOT (subplan)) AND (NOT (subplan))) SubPlan -> Seq Scan on isbns_a_descubrir (cost=0.00..8067.91 rows=1 width=21) (actual time=30.044..30.044 rows=1 loops=2025) Filter: ((isbn)::bpchar = $1) -> Index Scan using "libros_idx_ISBN" on libros (cost=0.00..5.95 rows=1 width=21) (actual time=12.938..12.938 rows=1 loops=50512) Index Cond: (isbn = $1) -> Seq Scan on raizpaises (cost= 0.00..1.01 rows=1 width=10) (actual time=0.764..0.871 rows=1 loops=1) Total runtime: 790553.561 ms The libros table has 1200000 regs. The isbns_a_descubrir table has 300000 regs. The librosdisponibilidadtemp table has 50000 regs. does anybody can explain me, why using index ibrosdisponibilidadtemp_idx_proceso is so slow and the others conditions are good enough Thanks everybody