Re: Optimizer seems to be way off, why? - Mailing list pgsql-performance

From Richard Huxton
Subject Re: Optimizer seems to be way off, why?
Date
Msg-id 42DE8375.8020205@archonet.com
Whole thread Raw
In response to Optimizer seems to be way off, why?  (Dirk Lutzebäck <lutzeb@aeccom.com>)
Responses Re: Optimizer seems to be way off, why?  (Dirk Lutzebäck <lutzeb@aeccom.com>)
List pgsql-performance
Dirk Lutzebäck wrote:
> Hi,
>
> I do not under stand the following explain output (pgsql 8.0.3):
>
> explain analyze
> select b.e from b, d
> where b.r=516081780 and b.c=513652057 and b.e=d.e;
>
>                                                         QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------
>
> Nested Loop  (cost=0.00..1220.09 rows=1 width=4) (actual
> time=0.213..2926.845 rows=324503 loops=1)
>   ->  Index Scan using b_index on b  (cost=0.00..1199.12 rows=1 width=4)
> (actual time=0.104..17.418 rows=3293 loops=1)
>         Index Cond: (r = 516081780::oid)
>         Filter: (c = 513652057::oid)
>   ->  Index Scan using d_e_index on d  (cost=0.00..19.22 rows=140
> width=4) (actual time=0.009..0.380 rows=99 loops=3293)
>         Index Cond: ("outer".e = d.e)
> Total runtime: 3638.783 ms
> (7 rows)
>
> Why is the rows estimate for b_index and the nested loop 1? It is
> actually 3293 and 324503.

I'm guessing (and that's all it is) that b.r and b.c have a higher
correlation than the planner is expecting. That is, it expects the
b.c=... to reduce the number of matching rows much more than it is.

Try a query just on WHERE b.r=516081780 and see if it gets the estimate
right for that.

If it's a common query, it might be worth an index on (r,c)

--
   Richard Huxton
   Archonet Ltd


pgsql-performance by date:

Previous
From: Christopher Petrilli
Date:
Subject: Re: Impact of checkpoint_segments under continual load conditions
Next
From: Dirk Lutzebäck
Date:
Subject: Re: Optimizer seems to be way off, why?