Folks,
I'm getting something really odd in 9.2.4, where the planner estimates
that the selectivity of a column equal to itself is always exactly 0.5%
(i.e. 0.005X). I can't figure out where this constant is coming from,
or why it's being applied.
Test case:
create table esttest (
id int not null primary key,
state1 int not null default 0,
state2 int not null default 0,
state3 int not null default 0
);
insert into esttest (id, state1, state2, state3)
select i,
(random()*3)::int,
(random())::int,
(random()*100)::int
from generate_series (1, 20000)
as gs(i);
vacuum analyze esttest;
explain analyze
select * from esttest
where state1 = state1;
explain analyze
select * from esttest
where state2 = state2;
explain analyze
select * from esttest
where state3 = state3;
Results of test case:
badestimate=# explain analyze
badestimate-# select * from esttest
badestimate-# where state1 = state1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on esttest (cost=0.00..359.00 rows=100 width=16) (actual
time=0.009..4.145 rows=20000 loops=1)
Filter: (state1 = state1)
Total runtime: 5.572 ms
(3 rows)
badestimate=#
badestimate=# explain analyze
badestimate-# select * from esttest
badestimate-# where state2 = state2;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on esttest (cost=0.00..359.00 rows=100 width=16) (actual
time=0.006..4.166 rows=20000 loops=1)
Filter: (state2 = state2)
Total runtime: 5.595 ms
(3 rows)
badestimate=#
badestimate=# explain analyze
badestimate-# select * from esttest
badestimate-# where state3 = state3;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on esttest (cost=0.00..359.00 rows=100 width=16) (actual
time=0.005..4.298 rows=20000 loops=1)
Filter: (state3 = state3)
Total runtime: 5.716 ms
(3 rows)
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com