Bug in the planner? - Mailing list pgsql-performance

From Claudio Freire
Subject Bug in the planner?
Date
Msg-id AANLkTi=Bjzq6TKeh9Ev001CSf1613Mn4VpL6ghVxcY9M@mail.gmail.com
Whole thread Raw
Responses Re: Bug in the planner?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
This is postgresql 9.0.3:


Query:

select
    sum(stat_responses) * 100.0 / sum(stat_invites) as stat_response_rate,
    sum(real_responses) * 100.0 / sum(real_invites) as real_response_rate
from (
    select
        ms.invites as stat_invites,
        (select count(*) from invites i join deliveries d on d.id = i.delivery_id
            where i.member_id = ms.member_id
            and d.recontact_number = 0
            and d.delivery_type = 1) as real_invites,
        ms.responses as stat_responses,
        (select count(*) from track_logs tl join tracks t on t.id = tl.track_id
            where t.member_id = ms.member_id
            and t.partner_id is null and t.recontact_number = 0 and
t.contact_method_id = 1
            and t.delivery_type = 1
            and tl.track_status_id = 10) as real_responses
    from member_statistics ms
    join livra_users lu on lu.id = ms.member_id
    where lu.country_id = 2 and lu.is_panelist and lu.confirmed and not
lu.unregistered
) as rtab;



                                                        QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=224382.22..225969.27 rows=1 width=12)
   ->  Hash Join  (cost=88355.09..221837.46 rows=254475 width=12)
         Hash Cond: (ms.member_id = lu.id)
         ->  Seq Scan on member_statistics ms  (cost=0.00..99539.50
rows=2511850 width=12)
         ->  Hash  (cost=85174.15..85174.15 rows=254475 width=4)
               ->  Bitmap Heap Scan on livra_users lu
(cost=14391.40..85174.15 rows=254475 width=4)
                     Recheck Cond: (country_id = 2)
                     Filter: (is_panelist AND confirmed AND (NOT unregistered))
                     ->  Bitmap Index Scan on ix_user_state
(cost=0.00..14327.78 rows=763100 width=0)
                           Index Cond: (country_id = 2)
   SubPlan 1
     ->  Aggregate  (cost=181.25..181.26 rows=1 width=0)
           ->  Nested Loop  (cost=0.00..181.19 rows=24 width=0)
                 ->  Index Scan using idx_tracks_partner_id_member_id
on tracks t  (cost=0.00..49.83 rows=9 width=8)
                       Index Cond: ((partner_id IS NULL) AND (member_id = $0))
                       Filter: ((recontact_number = 0) AND
(contact_method_id = 1) AND (delivery_type = 1))
                 ->  Index Scan using idx_track_logs_track_id on
track_logs tl  (cost=0.00..14.56 rows=3 width=8)
                       Index Cond: (tl.track_id = t.id)
                       Filter: (tl.track_status_id = 10)
   SubPlan 2
     ->  Aggregate  (cost=1405.75..1405.76 rows=1 width=0)
           ->  Nested Loop  (cost=0.00..1405.45 rows=119 width=0)
                 ->  Index Scan using
idx_invites_member_id_delivery_id on invites i  (cost=0.00..431.03
rows=119 width=4)
                       Index Cond: (member_id = $0)
                 ->  Index Scan using deliveries_pkey on deliveries d
(cost=0.00..8.18 rows=1 width=4)
                       Index Cond: (d.id = i.delivery_id)
                       Filter: ((d.recontact_number = 0) AND
(d.delivery_type = 1))
(27 rows)


If you inspect the plan, it's not computing the total expected cost correctly.

The top "Aggregate" node acts as a nested loop on SubPlan 1 & 2, but
it's only adding the cost of the subplans without regard as to how
many iterations it will perform (254475)

Explain analyze didn't end in an hour of runtime, running on a Core2
with 4G RAM.

It's not a big issue to me, I can work around it, but it should
perhaps be looked into.

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: unexpected stable function behavior
Next
From: Tom Lane
Date:
Subject: Re: Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1