Indexes not always used after inserts/updates/vacuum analyze - Mailing list pgsql-bugs
From | Michael G. Martin |
---|---|
Subject | Indexes not always used after inserts/updates/vacuum analyze |
Date | |
Msg-id | 3C7D9BB9.4060006@vpmonline.com Whole thread Raw |
Responses |
Re: Indexes not always used after inserts/updates/vacuum analyze
|
List | pgsql-bugs |
I recently upgraded to 7.2 from 7.1. Prior to 7,2, I was shutting down the database, droping indexes, vacuuming analayze, re-building all the indexes on a nightly basis ( all automated of course ;) ). Things ran fine. After upgrading to 7.2, I replaced all that with a nightly on-line /usr/local/pgsql/bin/vacuumdb -v -z on the database. The problems I am seeing is this. As new data is added and modified, indexes are not being used with some queries, but work fine on others. Even a vacuum full analyze did not fix the problem. I had to drop and re-build the index for the query to use the index. Now, I see the problem starting again. The table has about 20-30 million rows ( 4-5Gigs in size), so seq scan kills any access. This email is lengthy, but I wanted to detail this well. Here is the table definition: Table "symbol_data" Column | Type | Modifiers --------------------+-----------------------+----------- symbol_name | character varying(10) | not null date | date | not null open | numeric(15,3) | high | numeric(15,3) | low | numeric(15,3) | d_close | numeric(15,3) | volume | numeric(15,0) | earnings | numeric(15,3) | dividend | numeric(15,3) | source | character varying(50) | daily_mp_12a_long | character(3) | weekly_mp_12a_long | character(3) | daily_mp_32a_long | character(3) | weekly_mp_32a_long | character(3) | Indexes: symbol_data_date_indx Unique keys: symbol_data_pkey ------------------------------------------------------ Index "symbol_data_date_indx" Column | Type --------+------ date | date btree ------------------------------------------------------ Index "symbol_data_pkey" Column | Type -------------+----------------------- symbol_name | character varying(10) date | date unique btree ---------------------------------------------------------- Here is what I would expect which usually happens: explain select * from symbol_data where symbol_name='IBM'; Index Scan using symbol_data_pkey on symbol_data (cost=0.00..512.99 rows=128 width=129) Here is one that fails: explain select * from symbol_data where symbol_name='ELTE'; Seq Scan on symbol_data (cost=0.00..707415.32 rows=438015 width=129) Now I thought maybe it had something to do with the concatenated primary key, but: explain select * from symbol_data where symbol_name='IBM' and date between '1990-01-01' and '2002-01-01'; Index Scan using symbol_data_pkey on symbol_data (cost=0.00..402.55 rows=100 width=129) explain select * from symbol_data where symbol_name='ELTE' and date between '1990-01-01' and '2002-01-01'; Seq Scan on symbol_data (cost=0.00..810075.06 rows=342903 width=129) Now, changing the date range will eventually use the index: explain select * from symbol_data where symbol_name='ELTE' and date between '2002-01-01' and '2002-02-01'; NOTICE: QUERY PLAN: Index Scan using symbol_data_pkey on symbol_data (cost=0.00..10815.42 rows=2706 width=129) ...now I do a vacuum analyze: VACUUM verbose ANALYZE symbol_data; NOTICE: --Relation symbol_data-- NOTICE: Index symbol_data_date_indx: Pages 49709; Tuples 20536054: Deleted 4221. CPU 4.35s/16.30u sec elapsed 45.33 sec. NOTICE: Index symbol_data_pkey: Pages 74029; Tuples 20536054: Deleted 4221. CPU 6.44s/15.15u sec elapsed 31.00 sec. NOTICE: Removed 4221 tuples in 70 pages. CPU 0.00s/0.04u sec elapsed 0.08 sec. NOTICE: Pages 450911: Changed 0, Empty 0; Tup 20536054: Vac 4221, Keep 0, UnUsed 1858963. Total CPU 49.20s/36.31u sec elapsed 149.00 sec. NOTICE: Analyzing symbol_data VACUUM explain select * from symbol_data where symbol_name='ELTE' and date between '1990-01-01' and '2002-01-01'; Seq Scan on symbol_data (cost=0.00..810291.94 rows=292916 width=129) No change on the name. A new index: create index test on symbol_data (symbol_name); explain select * from symbol_data where symbol_name='ELTE'; NOTICE: QUERY PLAN: Seq Scan on symbol_data (cost=0.00..707611.68 rows=383340 width=129) EXPLAIN vpm=> VACUUM verbose ANALYZE symbol_data; NOTICE: --Relation symbol_data-- NOTICE: Pages 450911: Changed 0, Empty 0; Tup 20536054: Vac 0, Keep 0, UnUsed 1863184. Total CPU 38.99s/4.50u sec elapsed 67.95 sec. NOTICE: Analyzing symbol_data VACUUM vpm=> explain select * from symbol_data where symbol_name='ELTE'; NOTICE: QUERY PLAN: Seq Scan on symbol_data (cost=0.00..707611.68 rows=355958 width=129) Doesnt works. I think the only way to make this ever work is to drop the indexes, vacuum full, and rebuild. Any thoughts? This is a very dynamic table, but I was hoping the online vacuum in 7.2 would fix the problems. I'm in the process of splitting this table up into smaller pieces which will make life easier anyway, but I think there is something going on here. Thanks, Michael
pgsql-bugs by date: