> If there are many identical values in att0, are you sure a sequential
> scan isn't more efficient? Also, are you sure the index isn't working
> well? It seems to me since you have the table clustered, it might be
> fairly efficient as-is (it would get a huge benefit from the spatial
> locality of the tuples in the table). Index size alone shouldn't destroy
> your performance, since the idea of an index lookup is that it only has
> to read O(log n) pages from the disk per lookup.
In the next example, have in mind that:
select relname, relpages, reltuples from pg_class;
relname | relpages | reltuples
--------------------------------+----------+-------------
...
tc2000000000 | 142858 | 1.00001e+06
inst_id_idx | 2745 | 1e+06
...
and that i run postgresql, on a UltraSPARC[tm] III 600MHz, ram: 512MB
OS : sol 9
att0: varchar(1000)
att1: int4
and that 0<=att1>=900000000 for every tuple of tabe and index.
query:
select att0 from tc2000000000 where att1=900000000 AND att1>=0
plan:Index Scan using inst_id_idx on tc2000000000 (cost=0.00..161603.06
rows=1000006 width=1004) (actual time=41.21..101917.36 rows=1000000 loops=1) Index Cond: ((att1 <= 900000000) AND
(att1>= 0))Total runtime: 103135.03 msec
query:
select att0 from tc2000000000
plan:Seq Scan on tc2000000000 (cost=100000000.00..100152858.06 rows=1000006
width=1004) (actual time=0.21..42584.87 rows=1000000 loops=1)Total runtime: 43770.73 msec
Can you explain me this big difference? Perhaps postgresql caches in
memory a big part (or the whole) of index?
And by the way why postgresql doesn't select sequential scan? (I have
done vacuum analyze).