query optimization question - Mailing list pgsql-performance

From Jack Coates
Subject query optimization question
Date
Msg-id 1075338353.7494.71.camel@cletus.lyris.com
Whole thread Raw
Responses Re: query optimization question
List pgsql-performance
Hi all,

I've got a query that needs some help, please. Is there a way to avoid
all the looping? I've got freedom to work with the double-indented
sections below ) AND (, but the initial select distinct wrapper is much
more difficult to change. This is auto-generated code.

explain analyze SELECT DISTINCT members_.emailaddr_, members_.memberid_
FROM members_  WHERE (
    members_.List_='list1'
    AND members_.MemberType_='normal'
    AND members_.SubType_='mail'
    AND members_.emailaddr_ IS NOT NULL
    ) AND (
        ( select count(*) from lyrActiveRecips, members_ a, outmail_
        where lyrActiveRecips.UserName = a.UserNameLC_
        and lyrActiveRecips.Domain = a.Domain_
        and a.MemberID_ = members_.MemberID_
        and outmail_.MessageID_ = lyrActiveRecips.MailingID
        and outmail_.Type_ = 'list'
        and lyrActiveRecips.NextAttempt > '2004-01-20 00:00:00'
        )
          +
        ( select count(*) from lyrCompletedRecips, members_ a, outmail_
        where a.MemberID_ = lyrCompletedRecips.MemberID
        and a.UserNameLC_ = members_.UserNameLC_
        and a.Domain_ = members_.Domain_
        and outmail_.MessageID_ = lyrCompletedRecips.MailingID
        and outmail_.Type_ = 'list'
        and lyrCompletedRecips.FinalAttempt > '2004-01-20 00:00:00'
        and lyrCompletedRecips.CompletionStatusID = 300 )
          = 3
    )
;
                                                                             QUERY PLAN
                                             

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=537.06..537.07 rows=1 width=72) (actual
time=114460.908..114460.908 rows=0 loops=1)
   ->  Sort  (cost=537.06..537.06 rows=1 width=72) (actual
time=114460.905..114460.905 rows=0 loops=1)
         Sort Key: emailaddr_, memberid_
         ->  Index Scan using ix_members_list_notifyerr on members_
(cost=0.00..537.05 rows=1 width=72) (actual time=114460.893..114460.893
rows=0 loops=1)
               Index Cond: ((list_)::text = 'list1'::text)
               Filter: (((membertype_)::text = 'normal'::text) AND
((subtype_)::text = 'mail'::text) AND (emailaddr_ IS NOT NULL) AND
(((subplan) + (subplan)) = 3))
               SubPlan
                 ->  Aggregate  (cost=52.39..52.39 rows=1 width=0)
(actual time=0.089..0.090 rows=1 loops=818122)
                       ->  Hash Join  (cost=47.55..52.39 rows=1 width=0)
(actual time=0.086..0.086 rows=0 loops=818122)
                             Hash Cond: ("outer".memberid_ =
"inner".memberid)
                             ->  Index Scan using ix_members_emaillc on
members_ a  (cost=0.00..4.83 rows=1 width=4) (actual time=0.077..0.081
rows=1 loops=818122)
                                   Index Cond: (((domain_)::text =
($2)::text) AND ((usernamelc_)::text = ($1)::text))
                             ->  Hash  (cost=47.55..47.55 rows=1
width=4) (actual time=0.025..0.025 rows=0 loops=1)
                                   ->  Hash Join  (cost=25.00..47.55
rows=1 width=4) (actual time=0.023..0.023 rows=0 loops=1)
                                         Hash Cond: ("outer".messageid_
= "inner".mailingid)
                                         ->  Seq Scan on outmail_
(cost=0.00..22.50 rows=6 width=4) (actual time=0.001..0.001 rows=0
loops=1)
                                               Filter: ((type_)::text =
'list'::text)
                                         ->  Hash  (cost=25.00..25.00
rows=2 width=8) (actual time=0.003..0.003 rows=0 loops=1)
                                               ->  Seq Scan on
lyrcompletedrecips  (cost=0.00..25.00 rows=2 width=8) (actual
time=0.001..0.001 rows=0 loops=1)
                                                     Filter:
((finalattempt > '2004-01-20 00:00:00'::timestamp without time zone) AND
(completionstatusid = 300))
                 ->  Aggregate  (cost=51.59..51.59 rows=1 width=0)
(actual time=0.033..0.034 rows=1 loops=818122)
                       ->  Hash Join  (cost=27.35..51.59 rows=1 width=0)
(actual time=0.028..0.028 rows=0 loops=818122)
                             Hash Cond: ((("outer".username)::text =
("inner".usernamelc_)::text) AND (("outer"."domain")::text =
("inner".domain_)::text))
                             ->  Hash Join  (cost=22.52..46.72 rows=3
width=211) (actual time=0.003..0.003 rows=0 loops=818122)
                                   Hash Cond: ("outer".mailingid =
"inner".messageid_)
                                   ->  Seq Scan on lyractiverecips
(cost=0.00..22.50 rows=334 width=215) (actual time=0.001..0.001 rows=0
loops=818122)
                                         Filter: (nextattempt >
'2004-01-20 00:00:00'::timestamp without time zone)
                                   ->  Hash  (cost=22.50..22.50 rows=6
width=4) (actual time=0.003..0.003 rows=0 loops=1)
                                         ->  Seq Scan on outmail_
(cost=0.00..22.50 rows=6 width=4) (actual time=0.002..0.002 rows=0
loops=1)
                                               Filter: ((type_)::text =
'list'::text)
                             ->  Hash  (cost=4.82..4.82 rows=2
width=211) (actual time=0.017..0.017 rows=0 loops=818122)
                                   ->  Index Scan using pk_members_ on
members_ a  (cost=0.00..4.82 rows=2 width=211) (actual time=0.011..0.013
rows=1 loops=818122)
                                         Index Cond: (memberid_ = $0)
 Total runtime: 114474.407 ms
(34 rows)

that's with no data in lyractiverecips or lyrcompletedrecips. With data
in those tables, the query still hasn't completed after several hours on
two different machines.

thanks,
--
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, jack@lyris.com
"Interoperability is the keyword, uniformity is a dead end."
                --Olivier Fourdan



pgsql-performance by date:

Previous
From: Nick Barr
Date:
Subject: Re: Linux / Clariion
Next
From: Tom Lane
Date:
Subject: Re: query optimization question