Thread: Optimizer too eager to choose full table scans

Optimizer too eager to choose full table scans

From: Greg Stark
Date:
Here's the same query with slightly different parameters. The optimizer chooses
to use a full table scan for one set of parameters. Even though that set matches
only about 10% more records, the query takes four times as long to execute,
presumably because of the full table scan. (Yes, I ran these several times to
reduce disk caching effects.)

Is there a parameter I can adjust to tilt the scales back into balance here?
The planner seems overeager to use full table scans.


slo=> explain analyze select count(*) from ad_dept where dept_id between 730 and 738;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=5417.18..5417.18 rows=1 width=0) (actual time=875.74..875.74 rows=1 loops=1)
   ->  Seq Scan on ad_dept  (cost=0.00..5409.53 rows=3059 width=0) (actual time=0.18..869.45 rows=2767 loops=1)
         Filter: ((dept_id >= 730) AND (dept_id <= 738))
 Total runtime: 877.81 msec
(4 rows)

Time: 879.80 ms
slo=> explain analyze select count(*) from ad_dept where dept_id between 731 and 738;
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=5236.00..5236.00 rows=1 width=0) (actual time=232.25..232.25 rows=1 loops=1)
   ->  Index Scan using ad_dept_dept on ad_dept  (cost=0.00..5229.06 rows=2778 width=0) (actual time=0.40..214.76 rows=2477 loops=1)
         Index Cond: ((dept_id >= 731) AND (dept_id <= 738))
 Total runtime: 232.42 msec
(4 rows)

Time: 237.53 ms
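
A minimal way to compare the two plans on the same parameters is to discourage
sequential scans for the current session and rerun the slower query. This is
just a sketch using the standard enable_seqscan planner setting; the timings
will obviously depend on the data and cache state:

-- discourage sequential scans for this session only, then rerun the slow query
set enable_seqscan = off;
explain analyze select count(*) from ad_dept where dept_id between 730 and 738;
-- restore the default when done
set enable_seqscan = on;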


--
greg

Re: Optimizer too eager to choose full table scans

From: Joseph Shraibman
Date:
See RANDOM_PAGE_COST under
http://developer.postgresql.org/docs/postgres/runtime-config.html#RUNTIME-CONFIG-OPTIMIZER
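
For example, lowering it for the current session makes index scans look
relatively cheaper to the planner (the default is 4; this is only a sketch,
and the right value depends on how much of the table is typically cached):

-- try a lower value for this session and rerun the problem query
set random_page_cost = 2;
explain analyze select count(*) from ad_dept where dept_id between 730 and 738;
-- if the index plan is chosen and is faster, it can be made permanent
-- in postgresql.conf:  random_page_cost = 2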


Greg Stark wrote:
> Is there a parameter I can adjust to tilt the scales back into balance here?
> The planner seems overeager to use full table scans.