complex query performance assistance request - Mailing list pgsql-performance
From | John Mendenhall |
---|---|
Subject | complex query performance assistance request |
Date | |
Msg-id | 20050821034841.GA28968@calvin.surfutopia.net Whole thread Raw |
Responses |
Re: complex query performance assistance request
|
List | pgsql-performance |
I need to improve the performance for the following query. Soon after I reboot my server, the following query takes 20 seconds the first time I run it. When I run it after that, it takes approximately 2 seconds. I understand the caching taking place (at the os or db level, it doesn't matter here). Here are the results of the explain analyze run: ----- LOG: duration: 6259.632 ms statement: explain analyze SELECT c.id AS contact_id, sr.id AS sales_rep_id, LTRIM(RTRIM(sr.firstname || ' ' || sr.lastname)) AS sales_rep_name, p.id AS partner_id, p.company AS partner_company, coalesce(LTRIM(RTRIM(c.company)), LTRIM(RTRIM(c.firstname || ' ' || c.lastname))) AS contact_company, LTRIM(RTRIM(c.city || ' ' || c.state || ' ' || c.postalcode || ' ' || c.country)) AS contact_location, c.phone AS contact_phone, c.email AS contact_email, co.name AS contact_country, TO_CHAR(c.request_status_last_modified, 'mm/dd/yy hh12:mi pm') AS request_status_last_modified, TO_CHAR(c.request_status_last_modified, 'yyyymmddhh24miss') AS rqst_stat_last_mdfd_sortable, c.token_id, c.master_key_token AS token FROM sales_reps sr JOIN partners p ON (sr.id = p.sales_rep_id) JOIN contacts c ON (p.id = c.partner_id) JOIN countries co ON (LOWER(c.country) = LOWER(co.code)) JOIN partner_classification pc ON (p.classification_id = pc.id AND pc.classification != 'Sales Rep') WHERE c.lead_deleted IS NULL AND EXISTS ( SELECT lr.id FROM lead_requests lr, lead_request_status lrs WHERE c.id = lr.contact_id AND lr.status_id = lrs.id AND lrs.is_closed = 0 ) ORDER BY contact_company, contact_id; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------- Sort (cost=39093.16..39102.80 rows=3856 width=238) (actual time=6220.481..6221.188 rows=1071 loops=1) Sort Key: COALESCE(ltrim(rtrim((c.company)::text)), ltrim(rtrim((((c.firstname)::text || ' '::text) || (c.lastname)::text)))),c.id -> Merge Join (cost=38580.89..38863.48 rows=3856 width=238) (actual time=6015.751..6184.199 rows=1071 loops=1) Merge Cond: ("outer"."?column3?" = "inner"."?column19?") -> Sort (cost=14.00..14.61 rows=242 width=19) (actual time=9.250..9.500 rows=240 loops=1) Sort Key: lower((co.code)::text) -> Seq Scan on countries co (cost=0.00..4.42 rows=242 width=19) (actual time=0.132..4.498 rows=242 loops=1) -> Sort (cost=38566.89..38574.86 rows=3186 width=225) (actual time=6005.644..6006.954 rows=1071 loops=1) Sort Key: lower((c.country)::text) -> Merge Join (cost=75.65..38381.50 rows=3186 width=225) (actual time=58.086..5979.287 rows=1071 loops=1) Merge Cond: ("outer".partner_id = "inner".id) -> Index Scan using contacts_partner_id_idx on contacts c (cost=0.00..160907.39 rows=20106 width=171)(actual time=2.569..5816.985 rows=1547 loops=1) Filter: ((lead_deleted IS NULL) AND (subplan)) SubPlan -> Nested Loop (cost=1.16..6.56 rows=2 width=10) (actual time=0.119..0.119 rows=0 loops=40261) Join Filter: ("outer".status_id = "inner".id) -> Index Scan using lead_requests_contact_id_idx on lead_requests lr (cost=0.00..4.86rows=3 width=20) (actual time=0.079..0.083 rows=0 loops=40261) Index Cond: ($0 = contact_id) -> Materialize (cost=1.16..1.24 rows=8 width=10) (actual time=0.002..0.011 rows=6 loops=12592) -> Seq Scan on lead_request_status lrs (cost=0.00..1.16 rows=8 width=10) (actualtime=0.083..0.270 rows=7 loops=1) Filter: (is_closed = 0::numeric) -> Sort (cost=75.65..76.37 rows=290 width=64) (actual time=55.073..56.990 rows=1334 loops=1) Sort Key: p.id -> Merge Join (cost=59.24..63.79 rows=290 width=64) (actual time=31.720..41.096 rows=395 loops=1) Merge Cond: ("outer".id = "inner".sales_rep_id) -> Sort (cost=2.42..2.52 rows=39 width=31) (actual time=1.565..1.616 rows=39 loops=1) Sort Key: sr.id -> Seq Scan on sales_reps sr (cost=0.00..1.39 rows=39 width=31) (actual time=0.043..0.581rows=39 loops=1) -> Sort (cost=56.82..57.55 rows=290 width=43) (actual time=29.921..30.310 rows=395 loops=1) Sort Key: p.sales_rep_id -> Nested Loop (cost=24.35..44.96 rows=290 width=43) (actual time=0.169..22.566rows=395 loops=1) Join Filter: ("inner".classification_id = "outer".id) -> Seq Scan on partner_classification pc (cost=0.00..1.04 rows=2 width=10)(actual time=0.059..0.102 rows=2 loops=1) Filter: ((classification)::text <> 'Sales Rep'::text) -> Materialize (cost=24.35..28.70 rows=435 width=53) (actual time=0.023..5.880rows=435 loops=2) -> Seq Scan on partners p (cost=0.00..24.35 rows=435 width=53) (actualtime=0.034..8.937 rows=435 loops=1) Total runtime: 6225.791 ms (37 rows) ----- My first question is, what is the Materialize query plan element? It happens twice, and usually when I see it, my query is slow. My second and more important question is, does anyone have any ideas or suggestions as to how I can increase the speed for this query? Things I have already done are, modify the joins and conditions so it starts with smaller tables, thus the join set is smaller, modify the configuration of the server to ensure index scans are used as they should be, ran vacuumdb and analyze on the database. Thank you very much in advance for any pointers for additional places I can look. Thanks. JohnM -- John Mendenhall john@surfutopia.net surf utopia internet services
pgsql-performance by date: