Thread: 8.1 planner problem ?

8.1 planner problem ?

From
Gianluca Alberici
Date:
Hello,

I have ran into an interesting problem with 8.1 and i would like anybody
to explain me if there's a problem with the planner or there's a problem
with myself. In both cases a solution is welcome. The following query:

SELECT sum(qty) FROM
    _abi_main_pof_r ampr
    inner join _abi_main_items ami on ampr.itemid=ami.id
    inner join _abi_main_pof_t ampt on ampr.poftid=ampt.poftid
    inner join _abi_main_contacts amc on ampt.contactid=amc.codest
WHERE
    ampt.doctypeid in ('BCA','BSC')
    and amc.descr ilike '%SOCO%'
    and ampr.type=0;

is explain analyze'd as follows

Aggregate  (cost=1220.65..1220.66 rows=1 width=4) (actual
time=95937.824..95937.824 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..1220.65 rows=1 width=4) (actual
time=3695.250..95936.292 rows=1503 loops=1)
         ->  Nested Loop  (cost=0.00..1214.64 rows=1 width=8) (actual
time=3695.229..95924.697 rows=1503 loops=1)
               Join Filter: ("inner".poftid = "outer".poftid)
               ->  Nested Loop  (cost=0.00..79.16 rows=1 width=4)
(actual time=0.039..22.547 rows=2437 loops=1)
                     Join Filter: ("inner".contactid = "outer".codest)
                     ->  Seq Scan on _abi_main_contacts amc
(cost=0.00..1.29 rows=1 width=4) (actual time=0.029..0.034 rows=1 loops=1)
                           Filter: ((descr)::text ~~* '%SOCO%'::text)
                     ->  Seq Scan on _abi_main_pof_t ampt
(cost=0.00..77.53 rows=27 width=8) (actual time=0.006..15.820 rows=2702
loops=1)
                           Filter: (((doctypeid)::text = 'BCA'::text) OR
((doctypeid)::text = 'BSC'::text))
               ->  Seq Scan on _abi_main_pof_r ampr  (cost=0.00..1132.81
rows=214 width=12) (actual time=0.034..35.986 rows=8271 loops=2437)
                     Filter: ("type" = 0)
         ->  Index Scan using _abi_docks_items_pkey on _abi_main_items
ami  (cost=0.00..5.99 rows=1 width=4) (actual time=0.005..0.005 rows=1
loops=1503)
               Index Cond: ("outer".itemid = ami.id)
Total runtime: 95937.950 ms

...The same query, but with a condition change as ampr.type != 1 instead
of ampr.type=0

SELECT sum(qty) FROM
    _abi_main_pof_r ampr
    inner join _abi_main_items ami on ampr.itemid=ami.id
    inner join _abi_main_pof_t ampt on ampr.poftid=ampt.poftid
    inner join _abi_main_contacts amc on ampt.contactid=amc.codest
WHERE
    ampt.doctypeid in ('BCA','BSC')
    and amc.descr ilike '%SOCO%'
    and ampr.type != 1;


is explain analyze'd as follows:

Aggregate  (cost=1446.13..1446.14 rows=1 width=4) (actual
time=81.609..81.609 rows=1 loops=1)
   ->  Nested Loop  (cost=77.60..1446.12 rows=2 width=4) (actual
time=22.597..80.944 rows=1503 loops=1)
         ->  Nested Loop  (cost=77.60..1434.12 rows=2 width=8) (actual
time=22.577..72.785 rows=1503 loops=1)
               Join Filter: ("inner".contactid = "outer".codest)
               ->  Seq Scan on _abi_main_contacts amc  (cost=0.00..1.29
rows=1 width=4) (actual time=0.030..0.036 rows=1 loops=1)
                     Filter: ((descr)::text ~~* '%SOCO%'::text)
               ->  Hash Join  (cost=77.60..1427.52 rows=425 width=12)
(actual time=22.536..69.034 rows=8271 loops=1)
                     Hash Cond: ("outer".poftid = "inner".poftid)
                     ->  Seq Scan on _abi_main_pof_r ampr
(cost=0.00..1132.81 rows=42571 width=12) (actual time=0.035..37.045
rows=8271 loops=1)
                           Filter: ("type" <> 1)
                     ->  Hash  (cost=77.53..77.53 rows=27 width=8)
(actual time=22.471..22.471 rows=2702 loops=1)
                           ->  Seq Scan on _abi_main_pof_t ampt
(cost=0.00..77.53 rows=27 width=8) (actual time=0.006..20.482 rows=2702
loops=1)
                                 Filter: (((doctypeid)::text =
'BCA'::text) OR ((doctypeid)::text = 'BSC'::text))
         ->  Index Scan using _abi_docks_items_pkey on _abi_main_items
ami  (cost=0.00..5.99 rows=1 width=4) (actual time=0.003..0.004 rows=1
loops=1503)
               Index Cond: ("outer".itemid = ami.id)
Total runtime: 81.735 ms



The evidence is that with the first condition the query planner seems to
choose a 'lower cost' solution which lead to an execution time of 95
seconds (!!!) but in the second 'higher cost' solution the time is 85
milliseconds. The result set in this case is exactly the same. It is
evident to me that there must be a problem with this.

Anybody ?

TIA

Gianluca



Re: 8.1 planner problem ?

From
Heikki Linnakangas
Date:
Gianluca Alberici wrote:
> I have ran into an interesting problem with 8.1 and i would like anybody
> to explain me if there's a problem with the planner or there's a problem
> with myself. In both cases a solution is welcome. The following query:
>
> SELECT sum(qty) FROM
>     _abi_main_pof_r ampr
>     inner join _abi_main_items ami on ampr.itemid=ami.id
>     inner join _abi_main_pof_t ampt on ampr.poftid=ampt.poftid
>     inner join _abi_main_contacts amc on ampt.contactid=amc.codest
> WHERE
>     ampt.doctypeid in ('BCA','BSC')
>     and amc.descr ilike '%SOCO%'
>     and ampr.type=0;
>
> is explain analyze'd as follows
>
> Aggregate  (cost=1220.65..1220.66 rows=1 width=4) (actual
> time=95937.824..95937.824 rows=1 loops=1)
>   ->  Nested Loop  (cost=0.00..1220.65 rows=1 width=4) (actual
> time=3695.250..95936.292 rows=1503 loops=1)
>         ->  Nested Loop  (cost=0.00..1214.64 rows=1 width=8) (actual
> time=3695.229..95924.697 rows=1503 loops=1)
>               Join Filter: ("inner".poftid = "outer".poftid)
>               ->  Nested Loop  (cost=0.00..79.16 rows=1 width=4) (actual
> time=0.039..22.547 rows=2437 loops=1)
>                     Join Filter: ("inner".contactid = "outer".codest)
>                     ->  Seq Scan on _abi_main_contacts amc
> (cost=0.00..1.29 rows=1 width=4) (actual time=0.029..0.034 rows=1 loops=1)
>                           Filter: ((descr)::text ~~* '%SOCO%'::text)
>                     ->  Seq Scan on _abi_main_pof_t ampt
> (cost=0.00..77.53 rows=27 width=8) (actual time=0.006..15.820 rows=2702
> loops=1)
>                           Filter: (((doctypeid)::text = 'BCA'::text) OR
> ((doctypeid)::text = 'BSC'::text))
>               ->  Seq Scan on _abi_main_pof_r ampr  (cost=0.00..1132.81
> rows=214 width=12) (actual time=0.034..35.986 rows=8271 loops=2437)
>                     Filter: ("type" = 0)

The problem is in the estimate of that seq scan of _abi_main_pof_r
table. The planner estimates that there's 214 rows with type=0, but in
reality there's 8271. That misestimate throws off the rest of the plan.

> ...The same query, but with a condition change as ampr.type != 1 instead
> of ampr.type=0
>
> SELECT sum(qty) FROM
>     _abi_main_pof_r ampr
>     inner join _abi_main_items ami on ampr.itemid=ami.id
>     inner join _abi_main_pof_t ampt on ampr.poftid=ampt.poftid
>     inner join _abi_main_contacts amc on ampt.contactid=amc.codest
> WHERE
>     ampt.doctypeid in ('BCA','BSC')
>     and amc.descr ilike '%SOCO%'
>     and ampr.type != 1;
>
>
> is explain analyze'd as follows:
>
> Aggregate  (cost=1446.13..1446.14 rows=1 width=4) (actual
> time=81.609..81.609 rows=1 loops=1)
>   ->  Nested Loop  (cost=77.60..1446.12 rows=2 width=4) (actual
> time=22.597..80.944 rows=1503 loops=1)
>         ->  Nested Loop  (cost=77.60..1434.12 rows=2 width=8) (actual
> time=22.577..72.785 rows=1503 loops=1)
>               Join Filter: ("inner".contactid = "outer".codest)
>               ->  Seq Scan on _abi_main_contacts amc  (cost=0.00..1.29
> rows=1 width=4) (actual time=0.030..0.036 rows=1 loops=1)
>                     Filter: ((descr)::text ~~* '%SOCO%'::text)
>               ->  Hash Join  (cost=77.60..1427.52 rows=425 width=12)
> (actual time=22.536..69.034 rows=8271 loops=1)
>                     Hash Cond: ("outer".poftid = "inner".poftid)
>                     ->  Seq Scan on _abi_main_pof_r ampr
> (cost=0.00..1132.81 rows=42571 width=12) (actual time=0.035..37.045
> rows=8271 loops=1)
>                           Filter: ("type" <> 1)

Here you can see that the planner is estimating that there's 42571 rows
with type <> 1, which is also way off the reality in the other
direction, but this time the plan it chooses happens to be better.

Run ANALYZE. The planner needs to have statistics on how many rows of
each type there is to come up with the proper plan.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com