Thread: Improving speed of query

Improving speed of query

From
Leonardo M. Ramé
Date:
Hi, I'm using a query to fill a paginated table. The task involves
filtering, sorting, limit, offset and count of all rows (to determine
the number of pages).

My current query is this:

select count(*) over() as totalrows,
   case when (d.filepath is not null) then '1' else '0' end as HasDocument,
   e.idtask, e.site, e.pacs, e.studyuid, e.accessionnumber,
   e.patientemail, e.refphysicianemail, e.sent, e.password, e.created,
   e.institutionname, e.patientname, e.studydate, e.studytime,
   e.proceduredescription, e.performingphysician, e.referringphysician,
   e.informantphysician, e.forcesend, e.sentdate, e.md5identifier,
   e.read, e.patientid
from emailtasks e
join sites s on s.identifier = e.site
left join documents_current d on d.idtask=e.idtask
where s.idsite = 34
order by e.idtask desc
limit 10 offset 0;

I've made several indexes, and they really fast. The problem here is the
window function count(*) to get the total number of rows.

Here's the explain analyze result:

  Limit  (cost=0.84..57.98 rows=10 width=310) (actual
time=36075.589..36079.371 rows=10 loops=1)
    ->  WindowAgg  (cost=0.84..84302.61 rows=14754 width=310) (actual
time=36075.581..36079.356 rows=10 loops=1)
          ->  Nested Loop Left Join  (cost=0.84..84118.19 rows=14754
width=310) (actual time=0.085..30639.311 rows=258839 loops=1)
                ->  Nested Loop  (cost=0.42..39977.25 rows=10170
width=260) (actual time=0.071..10308.789 rows=146782 loops=1)
                      Join Filter: (e.site = s.identifier)
                      Rows Removed by Join Filter: 66794
                      ->  Index Scan using idx_emailtasks_idtask on
emailtasks e  (cost=0.42..36772.35 rows=213576 width=260) (actual
time=0.013..9929.527 rows=213576 loops=1)
                      ->  Materialize  (cost=0.00..1.27 rows=1 width=16)
(actual time=0.000..0.001 rows=1 loops=213576)
                            ->  Seq Scan on sites s (cost=0.00..1.26
rows=1 width=16) (actual time=0.024..0.026 rows=1 loops=1)
                                  Filter: (idsite = 34)
                                  Rows Removed by Filter: 20
                ->  Index Scan using idx_documents_current_idtask on
documents_current d (cost=0.42..4.32 rows=2 width=54) (actual
time=0.092..0.136 rows=2 loops=146782)
                      Index Cond: (idtask = e.idtask)
  Total runtime: 36106.813 ms
(14 rows)

What strategy do you recommend for speeding up this query?.

Regards,
--
Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877


Re: Improving speed of query

From
Nicolas Paris
Date:
Hi,

You could run 2 queries separatly and asynchrouneously
1)  the limit 10 
2)  the count

While the limit 10 query would be showned instanteneously, the web table would way for the count to build the pagination 

Le lun. 26 sept. 2016 à 20:59, Leonardo M. Ramé <l.rame@griensu.com> a écrit :
Hi, I'm using a query to fill a paginated table. The task involves
filtering, sorting, limit, offset and count of all rows (to determine
the number of pages).

My current query is this:

select count(*) over() as totalrows,
   case when (d.filepath is not null) then '1' else '0' end as HasDocument,
   e.idtask, e.site, e.pacs, e.studyuid, e.accessionnumber,
   e.patientemail, e.refphysicianemail, e.sent, e.password, e.created,
   e.institutionname, e.patientname, e.studydate, e.studytime,
   e.proceduredescription, e.performingphysician, e.referringphysician,
   e.informantphysician, e.forcesend, e.sentdate, e.md5identifier,
   e.read, e.patientid
from emailtasks e
join sites s on s.identifier = e.site
left join documents_current d on d.idtask=e.idtask
where s.idsite = 34
order by e.idtask desc
limit 10 offset 0;

I've made several indexes, and they really fast. The problem here is the
window function count(*) to get the total number of rows.

Here's the explain analyze result:

  Limit  (cost=0.84..57.98 rows=10 width=310) (actual
time=36075.589..36079.371 rows=10 loops=1)
    ->  WindowAgg  (cost=0.84..84302.61 rows=14754 width=310) (actual
time=36075.581..36079.356 rows=10 loops=1)
          ->  Nested Loop Left Join  (cost=0.84..84118.19 rows=14754
width=310) (actual time=0.085..30639.311 rows=258839 loops=1)
                ->  Nested Loop  (cost=0.42..39977.25 rows=10170
width=260) (actual time=0.071..10308.789 rows=146782 loops=1)
                      Join Filter: (e.site = s.identifier)
                      Rows Removed by Join Filter: 66794
                      ->  Index Scan using idx_emailtasks_idtask on
emailtasks e  (cost=0.42..36772.35 rows=213576 width=260) (actual
time=0.013..9929.527 rows=213576 loops=1)
                      ->  Materialize  (cost=0.00..1.27 rows=1 width=16)
(actual time=0.000..0.001 rows=1 loops=213576)
                            ->  Seq Scan on sites s (cost=0.00..1.26
rows=1 width=16) (actual time=0.024..0.026 rows=1 loops=1)
                                  Filter: (idsite = 34)
                                  Rows Removed by Filter: 20
                ->  Index Scan using idx_documents_current_idtask on
documents_current d (cost=0.42..4.32 rows=2 width=54) (actual
time=0.092..0.136 rows=2 loops=146782)
                      Index Cond: (idtask = e.idtask)
  Total runtime: 36106.813 ms
(14 rows)

What strategy do you recommend for speeding up this query?.

Regards,
--
Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general