Weird, bad 0.5% selectivity estimate for a column equal to itself - Mailing list pgsql-performance

From Josh Berkus
Subject Weird, bad 0.5% selectivity estimate for a column equal to itself
Date
Msg-id 51C4AEE5.8040704@agliodbs.com
Whole thread Raw
Responses Re: Weird, bad 0.5% selectivity estimate for a column equal to itself
List pgsql-performance
Folks,

I'm getting something really odd in 9.2.4, where the planner estimates
that the selectivity of a column equal to itself is always exactly 0.5%
(i.e. 0.005X).  I can't figure out where this constant is coming from,
or why it's being applied.

Test case:

create table esttest (
    id int not null primary key,
    state1 int not null default 0,
    state2 int not null default 0,
    state3 int not null default 0
);

insert into esttest (id, state1, state2, state3)
select i,
    (random()*3)::int,
    (random())::int,
    (random()*100)::int
from generate_series (1, 20000)
    as gs(i);

vacuum analyze esttest;

explain analyze
select * from esttest
where state1 = state1;

explain analyze
select * from esttest
where state2 = state2;

explain analyze
select * from esttest
where state3 = state3;

Results of test case:

badestimate=# explain analyze
badestimate-# select * from esttest
badestimate-# where state1 = state1;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Seq Scan on esttest  (cost=0.00..359.00 rows=100 width=16) (actual
time=0.009..4.145 rows=20000 loops=1)
   Filter: (state1 = state1)
 Total runtime: 5.572 ms
(3 rows)

badestimate=#
badestimate=# explain analyze
badestimate-# select * from esttest
badestimate-# where state2 = state2;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Seq Scan on esttest  (cost=0.00..359.00 rows=100 width=16) (actual
time=0.006..4.166 rows=20000 loops=1)
   Filter: (state2 = state2)
 Total runtime: 5.595 ms
(3 rows)

badestimate=#
badestimate=# explain analyze
badestimate-# select * from esttest
badestimate-# where state3 = state3;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Seq Scan on esttest  (cost=0.00..359.00 rows=100 width=16) (actual
time=0.005..4.298 rows=20000 loops=1)
   Filter: (state3 = state3)
 Total runtime: 5.716 ms
(3 rows)




--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


pgsql-performance by date:

Previous
From: Maciek Sakrejda
Date:
Subject: Re: Query tuning: partitioning, DISTINCT ON, and indexing
Next
From: Tom Lane
Date:
Subject: Re: Weird, bad 0.5% selectivity estimate for a column equal to itself