Re: complex query performance assistance request - Mailing list pgsql-performance

From John Mendenhall
Subject Re: complex query performance assistance request
Date
Msg-id 20050822182138.GA30818@calvin.surfutopia.net
Whole thread Raw
In response to complex query performance assistance request  (John Mendenhall <john@surfutopia.net>)
Responses Re: complex query performance assistance request
List pgsql-performance
On Sat, 20 Aug 2005, John Mendenhall wrote:

> I need to improve the performance for the following
> query.

I have run the same query in the same database under
different schemas.  Each schema is pretty much the same
tables and indices.  One has an extra backup table and
an extra index which are not used in either of the explain
analyze plans.

The first schema is a development schema, which I used
to performance tune the server so everything was great.

Here are the current results of the sql run in the development
environment:

-----
LOG:  duration: 852.275 ms  statement: explain analyze
SELECT
 c.id AS contact_id,
 sr.id AS sales_rep_id,
 p.id AS partner_id,
 coalesce(LTRIM(RTRIM(c.company)), LTRIM(RTRIM(c.firstname || ' ' || c.lastname))) AS contact_company,
 co.name AS contact_country,
 c.master_key_token
FROM
 sales_reps sr
 JOIN partners p ON (sr.id = p.sales_rep_id)
 JOIN contacts c ON (p.id = c.partner_id)
 JOIN countries co ON (LOWER(c.country) = LOWER(co.code))
 JOIN partner_classification pc ON (p.classification_id = pc.id AND pc.classification != 'Sales Rep')
WHERE
 c.lead_deleted IS NULL
 AND EXISTS
 (
  SELECT
   lr.id
  FROM
   lead_requests lr,
   lead_request_status lrs
  WHERE
   c.id = lr.contact_id AND
   lr.status_id = lrs.id  AND
   lrs.is_closed = 0
 )
ORDER BY
 contact_company, contact_id
                                                                                                   QUERY PLAN
        


--------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------
 Sort  (cost=18238.25..18238.27 rows=11 width=102) (actual time=823.721..823.915 rows=247 loops=1)
   Sort Key: COALESCE(ltrim(rtrim((c.company)::text)), ltrim(rtrim((((c.firstname)::text || ' '::text) ||
(c.lastname)::text)))),c.id 
   ->  Hash Join  (cost=18230.34..18238.06 rows=11 width=102) (actual time=808.042..818.427 rows=247 loops=1)
         Hash Cond: (lower(("outer".code)::text) = lower(("inner".country)::text))
         ->  Seq Scan on countries co  (cost=0.00..4.42 rows=242 width=19) (actual time=0.032..1.208 rows=242 loops=1)
         ->  Hash  (cost=18230.31..18230.31 rows=9 width=95) (actual time=807.554..807.554 rows=0 loops=1)
               ->  Merge Join  (cost=18229.98..18230.31 rows=9 width=95) (actual time=794.413..804.855 rows=247
loops=1)
                     Merge Cond: ("outer".sales_rep_id = "inner".id)
                     ->  Sort  (cost=18227.56..18227.59 rows=9 width=95) (actual time=793.132..793.502 rows=250
loops=1)
                           Sort Key: p.sales_rep_id
                           ->  Merge Join  (cost=18227.26..18227.42 rows=9 width=95) (actual time=782.832..789.205
rows=250loops=1) 
                                 Merge Cond: ("outer".id = "inner".classification_id)
                                 ->  Sort  (cost=1.05..1.05 rows=2 width=10) (actual time=0.189..0.194 rows=2 loops=1)
                                       Sort Key: pc.id
                                       ->  Seq Scan on partner_classification pc  (cost=0.00..1.04 rows=2 width=10)
(actualtime=0.089..0.127 rows=2 loops=1) 
                                             Filter: ((classification)::text <> 'Sales Rep'::text)
                                 ->  Sort  (cost=18226.21..18226.24 rows=13 width=105) (actual time=782.525..782.818
rows=251loops=1) 
                                       Sort Key: p.classification_id
                                       ->  Merge Join  (cost=0.00..18225.97 rows=13 width=105) (actual
time=54.135..776.299rows=449 loops=1) 
                                             Merge Cond: ("outer".id = "inner".partner_id)
                                             ->  Index Scan using partners_pkey on partners p  (cost=0.00..30.80
rows=395width=30) (actual time=0.073..6.873 rows=395 loops=1) 
                                             ->  Index Scan using contacts_partner_id_idx on contacts c
(cost=0.00..130157.20rows=93 width=85) (actual time=0.366..739.783 rows=453 loops=1) 
                                                   Filter: ((lead_deleted IS NULL) AND (subplan))
                                                   SubPlan
                                                     ->  Nested Loop  (cost=0.00..6.75 rows=2 width=10) (actual
time=0.103..0.103rows=0 loops=5576) 
                                                           Join Filter: ("outer".status_id = "inner".id)
                                                           ->  Index Scan using lead_requests_contact_id_idx on
lead_requestslr (cost=0.00..4.23 rows=2 width=20) (actual time=0.075..0.075 rows=0 loops=5576) 
                                                                 Index Cond: ($0 = contact_id)
                                                           ->  Seq Scan on lead_request_status lrs  (cost=0.00..1.16
rows=8width=10) (actual time=0.028..0.098 rows=4 loops=522) 
                                                                 Filter: (is_closed = 0::numeric)
                     ->  Sort  (cost=2.42..2.52 rows=39 width=10) (actual time=1.183..1.569 rows=268 loops=1)
                           Sort Key: sr.id
                           ->  Seq Scan on sales_reps sr  (cost=0.00..1.39 rows=39 width=10) (actual time=0.056..0.353
rows=39loops=1) 
 Total runtime: 826.425 ms
(34 rows)
-----

Here is the current run in the production environment,
which I need to figure out how to get to the performance
level of the development environment:

-----
LOG:  duration: 6447.934 ms  statement: explain analyze
SELECT
 c.id AS contact_id,
 sr.id AS sales_rep_id,
 p.id AS partner_id,
 coalesce(LTRIM(RTRIM(c.company)), LTRIM(RTRIM(c.firstname || ' ' || c.lastname))) AS contact_company,
 co.name AS contact_country,
 c.master_key_token
FROM
 sales_reps sr
 JOIN partners p ON (sr.id = p.sales_rep_id)
 JOIN contacts c ON (p.id = c.partner_id)
 JOIN countries co ON (LOWER(c.country) = LOWER(co.code))
 JOIN partner_classification pc ON (p.classification_id = pc.id AND pc.classification != 'Sales Rep')
WHERE
 c.lead_deleted IS NULL
 AND EXISTS
 (
  SELECT
   lr.id
  FROM
   lead_requests lr,
   lead_request_status lrs
  WHERE
   c.id = lr.contact_id AND
   lr.status_id = lrs.id  AND
   lrs.is_closed = 0
 )
ORDER BY
 contact_company, contact_id
                                                                                        QUERY PLAN
        


--------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------
 Sort  (cost=40838.98..40849.08 rows=4042 width=102) (actual time=6418.732..6419.536 rows=1071 loops=1)
   Sort Key: COALESCE(ltrim(rtrim((c.company)::text)), ltrim(rtrim((((c.firstname)::text || ' '::text) ||
(c.lastname)::text)))),c.id 
   ->  Merge Join  (cost=40442.25..40596.85 rows=4042 width=102) (actual time=6357.161..6389.616 rows=1071 loops=1)
         Merge Cond: ("outer"."?column3?" = "inner"."?column9?")
         ->  Sort  (cost=14.00..14.61 rows=242 width=19) (actual time=9.753..10.018 rows=240 loops=1)
               Sort Key: lower((co.code)::text)
               ->  Seq Scan on countries co  (cost=0.00..4.42 rows=242 width=19) (actual time=0.126..3.950 rows=242
loops=1)
         ->  Sort  (cost=40428.24..40436.59 rows=3340 width=95) (actual time=6347.154..6348.429 rows=1071 loops=1)
               Sort Key: lower((c.country)::text)
               ->  Merge Join  (cost=75.65..40232.76 rows=3340 width=95) (actual time=60.308..6331.266 rows=1071
loops=1)
                     Merge Cond: ("outer".partner_id = "inner".id)
                     ->  Index Scan using contacts_partner_id_idx on contacts c  (cost=0.00..161018.18 rows=20120
width=85)(actual time=2.769..6188.886 rows=1548 loops=1) 
                           Filter: ((lead_deleted IS NULL) AND (subplan))
                           SubPlan
                             ->  Nested Loop  (cost=1.16..6.57 rows=2 width=10) (actual time=0.129..0.129 rows=0
loops=40262)
                                   Join Filter: ("outer".status_id = "inner".id)
                                   ->  Index Scan using lead_requests_contact_id_idx on lead_requests lr
(cost=0.00..4.86rows=3 width=20) (actual time=0.086..0.092 rows=0 loops=40262) 
                                         Index Cond: ($0 = contact_id)
                                   ->  Materialize  (cost=1.16..1.24 rows=8 width=10) (actual time=0.002..0.013 rows=6
loops=12593)
                                         ->  Seq Scan on lead_request_status lrs  (cost=0.00..1.16 rows=8 width=10)
(actualtime=0.078..0.243 rows=7 loops=1) 
                                               Filter: (is_closed = 0::numeric)
                     ->  Sort  (cost=75.65..76.37 rows=290 width=20) (actual time=57.243..59.574 rows=1334 loops=1)
                           Sort Key: p.id
                           ->  Merge Join  (cost=59.24..63.79 rows=290 width=20) (actual time=33.975..42.215 rows=395
loops=1)
                                 Merge Cond: ("outer".id = "inner".sales_rep_id)
                                 ->  Sort  (cost=2.42..2.52 rows=39 width=10) (actual time=1.206..1.285 rows=39
loops=1)
                                       Sort Key: sr.id
                                       ->  Seq Scan on sales_reps sr  (cost=0.00..1.39 rows=39 width=10) (actual
time=0.028..0.365rows=39 loops=1) 
                                 ->  Sort  (cost=56.82..57.55 rows=290 width=20) (actual time=32.566..33.254 rows=395
loops=1)
                                       Sort Key: p.sales_rep_id
                                       ->  Nested Loop  (cost=24.35..44.96 rows=290 width=20) (actual
time=0.158..25.227rows=395 loops=1) 
                                             Join Filter: ("inner".classification_id = "outer".id)
                                             ->  Seq Scan on partner_classification pc  (cost=0.00..1.04 rows=2
width=10)(actual time=0.050..0.096 rows=2 loops=1) 
                                                   Filter: ((classification)::text <> 'Sales Rep'::text)
                                             ->  Materialize  (cost=24.35..28.70 rows=435 width=30) (actual
time=0.028..6.617rows=435 loops=2) 
                                                   ->  Seq Scan on partners p  (cost=0.00..24.35 rows=435 width=30)
(actualtime=0.042..9.941 rows=435 loops=1) 
 Total runtime: 6423.683 ms
(37 rows)
-----

The SQL is exactly the same.

The issue is the query plan is different, and thus,
not up to the performance we need.

We have 256meg in the machine.  Would it help if
we threw some more memory in?

Please let me know if you have *any* pointers as to
the reason for the difference.

Thank you very much in advance for any pointers or
suggestions.

JohnM

--
John Mendenhall
john@surfutopia.net
surf utopia
internet services

pgsql-performance by date:

Previous
From: "Merlin Moncure"
Date:
Subject: Re: Finding bottleneck
Next
From: Josh Berkus
Date:
Subject: Re: MemoryContextSwitchTo during table scan?