Re: complex query performance assistance request - Mailing list pgsql-performance
From | John Mendenhall |
---|---|
Subject | Re: complex query performance assistance request |
Date | |
Msg-id | 20050823190525.GA27623@calvin.surfutopia.net Whole thread Raw |
In response to | Re: complex query performance assistance request (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-performance |
Tom, > > Would it be best to attempt to rewrite it for IN? > > Or, should we try to tie it in with a join? > > Couldn't say without a deeper understanding of what you're trying to > accomplish. Here are the results of each SQL rewrite. The first pass, I rewrote it as c.id IN (): ----- LOG: duration: 2669.682 ms statement: explain analyze SELECT c.id AS contact_id, sr.id AS sales_rep_id, p.id AS partner_id, coalesce(LTRIM(RTRIM(c.company)), LTRIM(RTRIM(c.firstname || ' ' || c.lastname))) AS contact_company, co.name AS contact_country, c.master_key_token FROM sales_reps sr JOIN partners p ON (sr.id = p.sales_rep_id) JOIN contacts c ON (p.id = c.partner_id) JOIN countries co ON (LOWER(c.country) = LOWER(co.code)) JOIN partner_classification pc ON (p.classification_id = pc.id AND pc.classification != 'Sales Rep') WHERE c.lead_deleted IS NULL AND c.id IN ( SELECT lr.contact_id FROM lead_requests lr, lead_request_status lrs WHERE lr.status_id = lrs.id AND lrs.is_closed = 0 ) ORDER BY contact_company, contact_id QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- Sort (cost=4413.35..4416.16 rows=1123 width=102) (actual time=2617.069..2617.719 rows=1071 loops=1) Sort Key: COALESCE(ltrim(rtrim((c.company)::text)), ltrim(rtrim((((c.firstname)::text || ' '::text) || (c.lastname)::text)))),c.id -> Merge Join (cost=4311.31..4356.45 rows=1123 width=102) (actual time=2549.717..2589.398 rows=1071 loops=1) Merge Cond: ("outer"."?column3?" = "inner"."?column9?") -> Sort (cost=14.00..14.61 rows=242 width=19) (actual time=9.765..9.966 rows=240 loops=1) Sort Key: lower((co.code)::text) -> Seq Scan on countries co (cost=0.00..4.42 rows=242 width=19) (actual time=0.142..5.118 rows=242 loops=1) -> Sort (cost=4297.31..4299.63 rows=928 width=95) (actual time=2539.685..2540.913 rows=1071 loops=1) Sort Key: lower((c.country)::text) -> Merge IN Join (cost=4163.02..4251.57 rows=928 width=95) (actual time=2377.539..2524.844 rows=1071 loops=1) Merge Cond: ("outer".id = "inner".contact_id) -> Sort (cost=1835.53..1851.27 rows=6296 width=95) (actual time=1843.866..1853.193 rows=6349 loops=1) Sort Key: c.id -> Merge Join (cost=75.65..1438.24 rows=6296 width=95) (actual time=51.713..1505.633 rows=6349loops=1) Merge Cond: ("outer".partner_id = "inner".id) -> Index Scan using contacts_partner_id_idx on contacts c (cost=0.00..5303.84 rows=40243width=85) (actual time=0.077..584.736 rows=40267 loops=1) Filter: (lead_deleted IS NULL) -> Sort (cost=75.65..76.37 rows=290 width=20) (actual time=51.508..62.288 rows=6462 loops=1) Sort Key: p.id -> Merge Join (cost=59.24..63.79 rows=290 width=20) (actual time=30.152..38.281rows=395 loops=1) Merge Cond: ("outer".id = "inner".sales_rep_id) -> Sort (cost=2.42..2.52 rows=39 width=10) (actual time=1.390..1.505 rows=39loops=1) Sort Key: sr.id -> Seq Scan on sales_reps sr (cost=0.00..1.39 rows=39 width=10) (actualtime=0.026..0.380 rows=39 loops=1) -> Sort (cost=56.82..57.55 rows=290 width=20) (actual time=28.558..29.120rows=395 loops=1) Sort Key: p.sales_rep_id -> Nested Loop (cost=24.35..44.96 rows=290 width=20) (actual time=0.191..21.408rows=395 loops=1) Join Filter: ("inner".classification_id = "outer".id) -> Seq Scan on partner_classification pc (cost=0.00..1.04 rows=2width=10) (actual time=0.068..0.121 rows=2 loops=1) Filter: ((classification)::text <> 'Sales Rep'::text) -> Materialize (cost=24.35..28.70 rows=435 width=30) (actual time=0.029..5.380rows=435 loops=2) -> Seq Scan on partners p (cost=0.00..24.35 rows=435 width=30)(actual time=0.038..8.161 rows=435 loops=1) -> Sort (cost=2327.50..2351.43 rows=9573 width=11) (actual time=533.508..535.629 rows=1742 loops=1) Sort Key: lr.contact_id -> Merge Join (cost=1520.94..1694.49 rows=9573 width=11) (actual time=302.932..461.644 rows=1745loops=1) Merge Cond: ("outer".id = "inner".status_id) -> Sort (cost=1.28..1.30 rows=8 width=10) (actual time=0.392..0.404 rows=7 loops=1) Sort Key: lrs.id -> Seq Scan on lead_request_status lrs (cost=0.00..1.16 rows=8 width=10) (actualtime=0.117..0.280 rows=7 loops=1) Filter: (is_closed = 0::numeric) -> Sort (cost=1519.66..1558.55 rows=15556 width=21) (actual time=302.423..321.939 rows=15387loops=1) Sort Key: lr.status_id -> Seq Scan on lead_requests lr (cost=0.00..436.56 rows=15556 width=21) (actualtime=0.029..164.708 rows=15559 loops=1) Total runtime: 2632.987 ms (44 rows) ----- The second pass, I rewrote it to tie in with a JOIN, adding a DISTINCT at the top to get rid of the duplicates: ----- LOG: duration: 3285.645 ms statement: explain analyze SELECT DISTINCT c.id AS contact_id, sr.id AS sales_rep_id, p.id AS partner_id, coalesce(LTRIM(RTRIM(c.company)), LTRIM(RTRIM(c.firstname || ' ' || c.lastname))) AS contact_company, co.name AS contact_country, c.master_key_token FROM sales_reps sr JOIN partners p ON (sr.id = p.sales_rep_id) JOIN contacts c ON (p.id = c.partner_id) JOIN countries co ON (LOWER(c.country) = LOWER(co.code)) JOIN partner_classification pc ON (p.classification_id = pc.id AND pc.classification != 'Sales Rep') JOIN lead_requests lr ON (c.id = lr.contact_id) JOIN lead_request_status lrs ON (lr.status_id = lrs.id AND lrs.is_closed = 0) WHERE c.lead_deleted IS NULL ORDER BY contact_company, contact_id QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- Unique (cost=3039.78..3071.46 rows=1810 width=102) (actual time=3219.707..3228.637 rows=1071 loops=1) -> Sort (cost=3039.78..3044.31 rows=1810 width=102) (actual time=3219.695..3220.560 rows=1118 loops=1) Sort Key: COALESCE(ltrim(rtrim((c.company)::text)), ltrim(rtrim((((c.firstname)::text || ' '::text) || (c.lastname)::text)))),c.id, sr.id, p.id, co.name, c.master_key_token -> Merge Join (cost=2870.92..2941.85 rows=1810 width=102) (actual time=3156.788..3188.338 rows=1118 loops=1) Merge Cond: ("outer"."?column3?" = "inner"."?column9?") -> Sort (cost=14.00..14.61 rows=242 width=19) (actual time=9.196..9.445 rows=240 loops=1) Sort Key: lower((co.code)::text) -> Seq Scan on countries co (cost=0.00..4.42 rows=242 width=19) (actual time=0.128..3.914 rows=242loops=1) -> Sort (cost=2856.92..2860.66 rows=1496 width=95) (actual time=3147.340..3148.477 rows=1118 loops=1) Sort Key: lower((c.country)::text) -> Merge Join (cost=2750.88..2778.03 rows=1496 width=95) (actual time=3008.933..3132.122 rows=1118loops=1) Merge Cond: ("outer".id = "inner".status_id) -> Sort (cost=1.28..1.30 rows=8 width=10) (actual time=0.366..0.379 rows=7 loops=1) Sort Key: lrs.id -> Seq Scan on lead_request_status lrs (cost=0.00..1.16 rows=8 width=10) (actual time=0.094..0.254rows=7 loops=1) Filter: (is_closed = 0::numeric) -> Sort (cost=2749.60..2755.67 rows=2430 width=105) (actual time=3008.396..3023.502 rows=9992loops=1) Sort Key: lr.status_id -> Merge Join (cost=1835.53..2612.95 rows=2430 width=105) (actual time=1975.714..2912.632rows=10089 loops=1) Merge Cond: ("outer".contact_id = "inner".id) -> Index Scan using lead_requests_contact_id_idx on lead_requests lr (cost=0.00..683.87rows=15556 width=21) (actual time=0.073..247.148 rows=15556 loops=1) -> Sort (cost=1835.53..1851.27 rows=6296 width=95) (actual time=1975.273..1988.664rows=10089 loops=1) Sort Key: c.id -> Merge Join (cost=75.65..1438.24 rows=6296 width=95) (actual time=56.107..1625.186rows=6349 loops=1) Merge Cond: ("outer".partner_id = "inner".id) -> Index Scan using contacts_partner_id_idx on contacts c (cost=0.00..5303.84rows=40243 width=85) (actual time=0.047..580.311 rows=40267 loops=1) Filter: (lead_deleted IS NULL) -> Sort (cost=75.65..76.37 rows=290 width=20) (actual time=55.935..65.502rows=6462 loops=1) Sort Key: p.id -> Merge Join (cost=59.24..63.79 rows=290 width=20) (actual time=31.765..39.925rows=395 loops=1) Merge Cond: ("outer".id = "inner".sales_rep_id) -> Sort (cost=2.42..2.52 rows=39 width=10) (actual time=1.072..1.117rows=39 loops=1) Sort Key: sr.id -> Seq Scan on sales_reps sr (cost=0.00..1.39 rows=39width=10) (actual time=0.022..0.312 rows=39 loops=1) -> Sort (cost=56.82..57.55 rows=290 width=20) (actual time=30.489..30.893rows=395 loops=1) Sort Key: p.sales_rep_id -> Nested Loop (cost=24.35..44.96 rows=290 width=20)(actual time=0.159..23.356 rows=395 loops=1) Join Filter: ("inner".classification_id = "outer".id) -> Seq Scan on partner_classification pc (cost=0.00..1.04rows=2 width=10) (actual time=0.047..0.086 rows=2 loops=1) Filter: ((classification)::text <> 'SalesRep'::text) -> Materialize (cost=24.35..28.70 rows=435 width=30)(actual time=0.028..6.124 rows=435 loops=2) -> Seq Scan on partners p (cost=0.00..24.35rows=435 width=30) (actual time=0.039..9.383 rows=435 loops=1) Total runtime: 3241.139 ms (43 rows) ----- The DISTINCT ON condition was about the same amount of time, statistically. Removing the DISTINCT entirely only gave a very slight improvement in performance. So, the bottom line is, unless there are other ideas to improve the performance, I will most likely rewrite our application to use the c.id IN () option. Thank you very much for your input and suggestions. JohnM -- John Mendenhall john@surfutopia.net surf utopia internet services
pgsql-performance by date: