Thread: Bug in the planner?

Bug in the planner?

From
Claudio Freire
Date:
This is postgresql 9.0.3:


Query:

select
    sum(stat_responses) * 100.0 / sum(stat_invites) as stat_response_rate,
    sum(real_responses) * 100.0 / sum(real_invites) as real_response_rate
from (
    select
        ms.invites as stat_invites,
        (select count(*) from invites i join deliveries d on d.id = i.delivery_id
            where i.member_id = ms.member_id
            and d.recontact_number = 0
            and d.delivery_type = 1) as real_invites,
        ms.responses as stat_responses,
        (select count(*) from track_logs tl join tracks t on t.id = tl.track_id
            where t.member_id = ms.member_id
            and t.partner_id is null and t.recontact_number = 0 and
t.contact_method_id = 1
            and t.delivery_type = 1
            and tl.track_status_id = 10) as real_responses
    from member_statistics ms
    join livra_users lu on lu.id = ms.member_id
    where lu.country_id = 2 and lu.is_panelist and lu.confirmed and not
lu.unregistered
) as rtab;



                                                        QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=224382.22..225969.27 rows=1 width=12)
   ->  Hash Join  (cost=88355.09..221837.46 rows=254475 width=12)
         Hash Cond: (ms.member_id = lu.id)
         ->  Seq Scan on member_statistics ms  (cost=0.00..99539.50
rows=2511850 width=12)
         ->  Hash  (cost=85174.15..85174.15 rows=254475 width=4)
               ->  Bitmap Heap Scan on livra_users lu
(cost=14391.40..85174.15 rows=254475 width=4)
                     Recheck Cond: (country_id = 2)
                     Filter: (is_panelist AND confirmed AND (NOT unregistered))
                     ->  Bitmap Index Scan on ix_user_state
(cost=0.00..14327.78 rows=763100 width=0)
                           Index Cond: (country_id = 2)
   SubPlan 1
     ->  Aggregate  (cost=181.25..181.26 rows=1 width=0)
           ->  Nested Loop  (cost=0.00..181.19 rows=24 width=0)
                 ->  Index Scan using idx_tracks_partner_id_member_id
on tracks t  (cost=0.00..49.83 rows=9 width=8)
                       Index Cond: ((partner_id IS NULL) AND (member_id = $0))
                       Filter: ((recontact_number = 0) AND
(contact_method_id = 1) AND (delivery_type = 1))
                 ->  Index Scan using idx_track_logs_track_id on
track_logs tl  (cost=0.00..14.56 rows=3 width=8)
                       Index Cond: (tl.track_id = t.id)
                       Filter: (tl.track_status_id = 10)
   SubPlan 2
     ->  Aggregate  (cost=1405.75..1405.76 rows=1 width=0)
           ->  Nested Loop  (cost=0.00..1405.45 rows=119 width=0)
                 ->  Index Scan using
idx_invites_member_id_delivery_id on invites i  (cost=0.00..431.03
rows=119 width=4)
                       Index Cond: (member_id = $0)
                 ->  Index Scan using deliveries_pkey on deliveries d
(cost=0.00..8.18 rows=1 width=4)
                       Index Cond: (d.id = i.delivery_id)
                       Filter: ((d.recontact_number = 0) AND
(d.delivery_type = 1))
(27 rows)


If you inspect the plan, it's not computing the total expected cost correctly.

The top "Aggregate" node acts as a nested loop on SubPlan 1 & 2, but
it's only adding the cost of the subplans without regard as to how
many iterations it will perform (254475)

Explain analyze didn't end in an hour of runtime, running on a Core2
with 4G RAM.

It's not a big issue to me, I can work around it, but it should
perhaps be looked into.

Re: Bug in the planner?

From
Tom Lane
Date:
Claudio Freire <klaussfreire@gmail.com> writes:
> This is postgresql 9.0.3:
> Query:

> select
>     sum(stat_responses) * 100.0 / sum(stat_invites) as stat_response_rate,
>     sum(real_responses) * 100.0 / sum(real_invites) as real_response_rate
> from (
>     select
>         ms.invites as stat_invites,
>         (select count(*) from invites i join deliveries d on d.id = i.delivery_id
>             where i.member_id = ms.member_id
>             and d.recontact_number = 0
>             and d.delivery_type = 1) as real_invites,
>         ms.responses as stat_responses,
>         (select count(*) from track_logs tl join tracks t on t.id = tl.track_id
>             where t.member_id = ms.member_id
>             and t.partner_id is null and t.recontact_number = 0 and
> t.contact_method_id = 1
>             and t.delivery_type = 1
>             and tl.track_status_id = 10) as real_responses
>     from member_statistics ms
>     join livra_users lu on lu.id = ms.member_id
>     where lu.country_id = 2 and lu.is_panelist and lu.confirmed and not
> lu.unregistered
> ) as rtab;

> The top "Aggregate" node acts as a nested loop on SubPlan 1 & 2, but
> it's only adding the cost of the subplans without regard as to how
> many iterations it will perform (254475)

Hmm, interesting.  The reason is that it's computing the cost of the
output SELECT list on the basis of the number of times that select list
will be evaluated, ie, once.  But the aggregate function argument
expressions will be evaluated more times than that.  Most of the time an
aggregate is applied to something trivial like a Var reference, so
nobody's noticed that the cost of its input expression is underestimated.

> Explain analyze didn't end in an hour of runtime, running on a Core2
> with 4G RAM.

A better estimate isn't going to make that go any faster :-(.

            regards, tom lane