Thread: cpu_tuple_cost WRONG -> After an "analyze" or "vacuum full" indexes are not used anymore!!!
cpu_tuple_cost WRONG -> After an "analyze" or "vacuum full" indexes are not used anymore!!!
From
"Gaetano Mendola"
Date:
Hi all, I did a little experiment ( Postgres 7.2.1 ): I inserted some rows (28896) in an table with random value beetwen 0 and 100 and after same explain I noticed that the index created is not used after a vacuum analyze ( some times after a vacuum full ) I raise the cpu_tuple_cost and the behaviour seem correct, this what I did: kalman=# create table to_del (col1 serial primary key, col2 integer); NOTICE: CREATE TABLE will create implicit sequence 'to_del_col1_seq' for SERIAL column 'to_del.col1' NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index 'to_del_pkey' for table 'to_del' CREATE kalman=# insert into to_del(col2) (select (random()*100)::integer); NOTICE: to_del_col1_seq.nextval: sequence was re-created INSERT 5008207 1 kalman=# insert into to_del(col2) (select (random()*100)::integer from to_del); INSERT 5008208 1 kalman=# insert into to_del(col2) (select (random()*100)::integer from to_del t1, to_del t2); INSERT 0 4 kalman=# insert into to_del(col2) (select (random()*100)::integer from to_del t1, to_del t2); INSERT 0 36 kalman=# insert into to_del(col2) (select (random()*100)::integer from to_del t1, to_del t2); INSERT 0 1764 kalman=# insert into to_del(col2) (select (random()*100)::integer from to_del t1); INSERT 0 1806 kalman=# insert into to_del(col2) (select (random()*100)::integer from to_del t1); INSERT 0 3612 kalman=# insert into to_del(col2) (select (random()*100)::integer from to_del t1); INSERT 0 7224 kalman=# insert into to_del(col2) (select (random()*100)::integer from to_del t1); INSERT 0 14448 kalman=# select count(*) from to_del; count ------- 28896 (1 row) kalman=# select count(*) from to_del where col2 = 30; count ------- 283 (1 row) kalman=# explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Seq Scan on to_del (cost=0.00..22.50 rows=5 width=8) EXPLAIN kalman=# create index idx_col2 on to_del(col2); CREATE kalman=# explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Index Scan using idx_col2 on to_del (cost=0.00..399.75 rows=144 width=8) EXPLAIN kalman=# analyze; ANALYZE kalman=# explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Seq Scan on to_del (cost=0.00..520.06 rows=277 width=8) EXPLAIN kalman=# vacuum full; VACUUM kalman=# explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Seq Scan on to_del (cost=0.00..518.20 rows=275 width=8) EXPLAIN kalman=# analyze; ANALYZE kalman=# explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Seq Scan on to_del (cost=0.00..520.06 rows=407 width=8) EXPLAIN So it's really strange. I continued with this consideration: For a seq scan the cost is 520.06 kalman=# set enable_seqscan=off; SET VARIABLE kalman=# explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Index Scan using idx_col2 on to_del (cost=0.00..634.38 rows=407 width=8) so for an index scan is 634.38 kalman=# show cpu_tuple_cost; NOTICE: cpu_tuple_cost is 0.01 SHOW VARIABLE kalman=# show cpu_index_tuple_cost ; NOTICE: cpu_index_tuple_cost is 0.001 SHOW VARIABLE So I tried to raise the cost of cpu_tuple_cost: kalman=# set cpu_tuple_cost = 0.1; SET VARIABLE and now all seems right: kalman=# explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Index Scan using idx_col2 on to_del (cost=0.00..670.97 rows=407 width=8) now I'll check if the scan is used again: kalman=# update to_del set col2 = 30; UPDATE 28896 kalman=# explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Index Scan using idx_col2 on to_del (cost=0.00..670.97 rows=407 width=8) and after an analyze, correctly I obtain: kalman=# analyze; ANALYZE kalman=# explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Seq Scan on to_del (cost=0.00..858.40 rows=5321 width=8) EXPLAIN Some time the optimizer is right to use the seq scan instead of an index I tried to insert 10^6 rows and the time performance are completly different: 6 seconds with cpu_tuple_cost = 0.01 and 0.6 seconds with cpu_tuple_cost = 0.1 I'm going to set the parameter cpu_tuple_cost = 0.1 to my server, I'm wrong ? PS: in an Postgres 7.1.3 box the behaviour is correct: the index is used also after an analyze; Ciao Gaetano -- #exclude <windows> #include <CSRSS> printf("\t\t\b\b\b\b\b\b");. printf("\t\t\b\b\b\b\b\b");
Re: cpu_tuple_cost WRONG -> After an "analyze" or "vacuum full" indexes are not used anymore!!!
From
Tom Lane
Date:
"Gaetano Mendola" <mendola@bigfoot.com> writes: > I'm going to set the parameter cpu_tuple_cost = 0.1 to > my server, I'm wrong ? This seems very excessively high. 1.0 corresponds to one disk read, which is in the several-milliseconds range on most modern systems. You're essentially claiming that your CPU requires about a millisecond for per-tuple overhead, which is way off base (unless you're using an 8086 or some such...) My feeling is that the default value of 0.01 is already on the high side, and getting more so as CPUs get faster compared to disks. It's always possible to force the optimizer to choose "the right answer" in one particular case by pushing out some parameter setting far beyond the reasonable range, but I think you'll find that this answer breaks more stuff than it fixes. A saner way of tweaking the index-vs-seqscan costs is to reduce random_page_cost a little --- the default is 4.0 which may be on the high side. (But don't push it below 1.0.) regards, tom lane
Re: cpu_tuple_cost WRONG -> After an "analyze" or "vacuum full" indexes are not used anymore!!!
From
"Gaetano Mendola"
Date:
From: "Tom Lane" <tgl@sss.pgh.pa.us> > A saner way of tweaking the index-vs-seqscan costs is to reduce > random_page_cost a little --- the default is 4.0 which may be on the > high side. (But don't push it below 1.0.) So I tried: # explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Seq Scan on to_del (cost=0.00..8281.25 rows=4443 width=8) for obtain an index scan I set random_page_cost = 2.8; with 2.9 the choose was still the sequenzial scan. # set random_page_cost = 2.8; SET VARIABLE # explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Index Scan using idx_col2 on to_del (cost=0.00..7953.95 rows=4443 width=8) EXPLAIN 4.0 -> 2.8 is not so a little or not ? Ciao Gaetano
Re: cpu_tuple_cost WRONG -> After an "analyze" or "vacuum full" indexes are not used anymore!!!
From
"Gaetano Mendola"
Date:
From: "Tom Lane" <tgl@sss.pgh.pa.us> > A saner way of tweaking the index-vs-seqscan costs is to reduce > random_page_cost a little --- the default is 4.0 which may be on the > high side. (But don't push it below 1.0.) So I tried: # explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Seq Scan on to_del (cost=0.00..8281.25 rows=4443 width=8) for obtain an index scan I set random_page_cost = 2.8; with 2.9 the choose was still the sequenzial scan. # set random_page_cost = 2.8; SET VARIABLE # explain select * from to_del where col2 = 30; NOTICE: QUERY PLAN: Index Scan using idx_col2 on to_del (cost=0.00..7953.95 rows=4443 width=8) EXPLAIN 4.0 -> 2.8 is not so a little or not ? Ciao Gaetano