Thread: cpu_tuple_cost WRONG -> After an "analyze" or "vacuum full" indexes are not used anymore!!!

Hi all,
I did a little experiment ( Postgres 7.2.1 ):

I inserted some rows (28896) in an table with random value beetwen 0 and 100
and after same explain I noticed that the index created is not used after
a vacuum analyze ( some times after a vacuum full ) I raise the
cpu_tuple_cost
and the behaviour seem correct, this what I did:


kalman=# create table to_del (col1 serial primary key, col2 integer);
NOTICE:  CREATE TABLE will create implicit sequence 'to_del_col1_seq' for
SERIAL column 'to_del.col1'
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'to_del_pkey'
for table 'to_del'
CREATE
kalman=# insert into to_del(col2) (select (random()*100)::integer);
NOTICE:  to_del_col1_seq.nextval: sequence was re-created
INSERT 5008207 1
kalman=# insert into to_del(col2) (select (random()*100)::integer from
to_del);
INSERT 5008208 1
kalman=# insert into to_del(col2) (select (random()*100)::integer from
to_del t1, to_del t2);
INSERT 0 4
kalman=# insert into to_del(col2) (select (random()*100)::integer from
to_del t1, to_del t2);
INSERT 0 36
kalman=# insert into to_del(col2) (select (random()*100)::integer from
to_del t1, to_del t2);
INSERT 0 1764
kalman=# insert into to_del(col2) (select (random()*100)::integer from
to_del t1);
INSERT 0 1806
kalman=# insert into to_del(col2) (select (random()*100)::integer from
to_del t1);
INSERT 0 3612
kalman=# insert into to_del(col2) (select (random()*100)::integer from
to_del t1);
INSERT 0 7224
kalman=# insert into to_del(col2) (select (random()*100)::integer from
to_del t1);
INSERT 0 14448
kalman=# select count(*) from to_del;
 count
-------
 28896
(1 row)

kalman=# select count(*) from to_del where col2 = 30;
 count
-------
   283
(1 row)

kalman=# explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Seq Scan on to_del  (cost=0.00..22.50 rows=5 width=8)

EXPLAIN
kalman=# create index idx_col2 on to_del(col2);
CREATE
kalman=# explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Index Scan using idx_col2 on to_del  (cost=0.00..399.75 rows=144 width=8)

EXPLAIN
kalman=# analyze;
ANALYZE
kalman=# explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Seq Scan on to_del  (cost=0.00..520.06 rows=277 width=8)

EXPLAIN
kalman=# vacuum full;
VACUUM
kalman=# explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Seq Scan on to_del  (cost=0.00..518.20 rows=275 width=8)

EXPLAIN
kalman=# analyze;
ANALYZE
kalman=# explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Seq Scan on to_del  (cost=0.00..520.06 rows=407 width=8)

EXPLAIN

So it's really strange. I continued with this consideration:

For a seq scan the cost is 520.06

kalman=# set enable_seqscan=off;
SET VARIABLE
kalman=# explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Index Scan using idx_col2 on to_del  (cost=0.00..634.38 rows=407 width=8)

so for an index scan is 634.38

kalman=# show cpu_tuple_cost;
NOTICE:  cpu_tuple_cost is 0.01
SHOW VARIABLE

kalman=# show cpu_index_tuple_cost ;
NOTICE:  cpu_index_tuple_cost is 0.001
SHOW VARIABLE

So I tried to raise the cost of cpu_tuple_cost:

kalman=#  set cpu_tuple_cost = 0.1;
SET VARIABLE

and now all seems right:

kalman=# explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Index Scan using idx_col2 on to_del  (cost=0.00..670.97 rows=407 width=8)

now I'll check if the scan is used again:

kalman=# update to_del set col2 = 30;
UPDATE 28896

kalman=# explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Index Scan using idx_col2 on to_del  (cost=0.00..670.97 rows=407 width=8)

and after an analyze, correctly I obtain:

kalman=# analyze;
ANALYZE
kalman=# explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Seq Scan on to_del  (cost=0.00..858.40 rows=5321 width=8)

EXPLAIN



Some time the optimizer is right to use the seq  scan instead of
an index I tried to insert 10^6 rows and the time performance
are completly different: 6 seconds with cpu_tuple_cost = 0.01
and 0.6 seconds with cpu_tuple_cost = 0.1


I'm going to set the parameter cpu_tuple_cost = 0.1 to
my server, I'm wrong ?




PS: in an Postgres 7.1.3 box the behaviour is correct:
the index is used also after an analyze;



Ciao
Gaetano

--
#exclude <windows>
#include <CSRSS>
printf("\t\t\b\b\b\b\b\b");.
printf("\t\t\b\b\b\b\b\b");


"Gaetano Mendola" <mendola@bigfoot.com> writes:
> I'm going to set the parameter cpu_tuple_cost = 0.1 to
> my server, I'm wrong ?

This seems very excessively high.  1.0 corresponds to one disk read,
which is in the several-milliseconds range on most modern systems.
You're essentially claiming that your CPU requires about a millisecond
for per-tuple overhead, which is way off base (unless you're using
an 8086 or some such...)  My feeling is that the default value of 0.01
is already on the high side, and getting more so as CPUs get faster
compared to disks.

It's always possible to force the optimizer to choose "the right answer"
in one particular case by pushing out some parameter setting far beyond
the reasonable range, but I think you'll find that this answer breaks
more stuff than it fixes.

A saner way of tweaking the index-vs-seqscan costs is to reduce
random_page_cost a little --- the default is 4.0 which may be on the
high side.  (But don't push it below 1.0.)

            regards, tom lane

From: "Tom Lane" <tgl@sss.pgh.pa.us>
> A saner way of tweaking the index-vs-seqscan costs is to reduce
> random_page_cost a little --- the default is 4.0 which may be on the
> high side.  (But don't push it below 1.0.)


So I tried:

# explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Seq Scan on to_del  (cost=0.00..8281.25 rows=4443 width=8)


for obtain an index scan I set random_page_cost = 2.8;
with 2.9 the choose was still the sequenzial scan.

# set random_page_cost = 2.8;
SET VARIABLE
#  explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Index Scan using idx_col2 on to_del  (cost=0.00..7953.95 rows=4443 width=8)

EXPLAIN


4.0 -> 2.8 is not so a little or not ?

Ciao
Gaetano


From: "Tom Lane" <tgl@sss.pgh.pa.us>
> A saner way of tweaking the index-vs-seqscan costs is to reduce
> random_page_cost a little --- the default is 4.0 which may be on the
> high side.  (But don't push it below 1.0.)


So I tried:

# explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Seq Scan on to_del  (cost=0.00..8281.25 rows=4443 width=8)


for obtain an index scan I set random_page_cost = 2.8;
with 2.9 the choose was still the sequenzial scan.

# set random_page_cost = 2.8;
SET VARIABLE
#  explain select * from to_del where col2 = 30;
NOTICE:  QUERY PLAN:

Index Scan using idx_col2 on to_del  (cost=0.00..7953.95 rows=4443 width=8)

EXPLAIN


4.0 -> 2.8 is not so a little or not ?

Ciao
Gaetano