Improving speed of query - Mailing list pgsql-general

From Leonardo M. Ramé
Subject Improving speed of query
Date
Msg-id 2824957f-c37e-f1a5-965d-de4cb73c53ad@griensu.com
Whole thread Raw
Responses Re: Improving speed of query  (Nicolas Paris <niparisco@gmail.com>)
List pgsql-general
Hi, I'm using a query to fill a paginated table. The task involves
filtering, sorting, limit, offset and count of all rows (to determine
the number of pages).

My current query is this:

select count(*) over() as totalrows,
   case when (d.filepath is not null) then '1' else '0' end as HasDocument,
   e.idtask, e.site, e.pacs, e.studyuid, e.accessionnumber,
   e.patientemail, e.refphysicianemail, e.sent, e.password, e.created,
   e.institutionname, e.patientname, e.studydate, e.studytime,
   e.proceduredescription, e.performingphysician, e.referringphysician,
   e.informantphysician, e.forcesend, e.sentdate, e.md5identifier,
   e.read, e.patientid
from emailtasks e
join sites s on s.identifier = e.site
left join documents_current d on d.idtask=e.idtask
where s.idsite = 34
order by e.idtask desc
limit 10 offset 0;

I've made several indexes, and they really fast. The problem here is the
window function count(*) to get the total number of rows.

Here's the explain analyze result:

  Limit  (cost=0.84..57.98 rows=10 width=310) (actual
time=36075.589..36079.371 rows=10 loops=1)
    ->  WindowAgg  (cost=0.84..84302.61 rows=14754 width=310) (actual
time=36075.581..36079.356 rows=10 loops=1)
          ->  Nested Loop Left Join  (cost=0.84..84118.19 rows=14754
width=310) (actual time=0.085..30639.311 rows=258839 loops=1)
                ->  Nested Loop  (cost=0.42..39977.25 rows=10170
width=260) (actual time=0.071..10308.789 rows=146782 loops=1)
                      Join Filter: (e.site = s.identifier)
                      Rows Removed by Join Filter: 66794
                      ->  Index Scan using idx_emailtasks_idtask on
emailtasks e  (cost=0.42..36772.35 rows=213576 width=260) (actual
time=0.013..9929.527 rows=213576 loops=1)
                      ->  Materialize  (cost=0.00..1.27 rows=1 width=16)
(actual time=0.000..0.001 rows=1 loops=213576)
                            ->  Seq Scan on sites s (cost=0.00..1.26
rows=1 width=16) (actual time=0.024..0.026 rows=1 loops=1)
                                  Filter: (idsite = 34)
                                  Rows Removed by Filter: 20
                ->  Index Scan using idx_documents_current_idtask on
documents_current d (cost=0.42..4.32 rows=2 width=54) (actual
time=0.092..0.136 rows=2 loops=146782)
                      Index Cond: (idtask = e.idtask)
  Total runtime: 36106.813 ms
(14 rows)

What strategy do you recommend for speeding up this query?.

Regards,
--
Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877


pgsql-general by date:

Previous
From: Adam Brusselback
Date:
Subject: Re: Incrementally refreshed materialized view
Next
From: Kevin Grittner
Date:
Subject: Re: Incrementally refreshed materialized view