Interesting behaviour ! - Mailing list pgsql-hackers
From | Constantin Teodorescu |
---|---|
Subject | Interesting behaviour ! |
Date | |
Msg-id | 378C394C.3487DBAC@flex.ro Whole thread Raw |
Responses |
Re: [HACKERS] Interesting behaviour !
|
List | pgsql-hackers |
Please Cc: to teo@flex.ro I have a table called "note" that looks like this : create table "note" (id serial,perioada int2,schema int2,explicatie text,...); The "note" table has 22.000 records and the record length is about 75 bytes (is has also a "text" field"). Because I am frequently accesing the table with queries like "... where perioada=12" I was tempted to make also indexes on "perioada" and "schema" field. The tables have the following sizes (their file sizes into /usr/local/pgsql/data/base....) note 2.890 Kb note_id 385 Kb note_perioada 409 Kb note_schema 466 Kb I ran previusly "vacuum analyze" on that database ensuring that statistical tables have been updated. Trying some selects with explain I got the following results: contabil=> explain select * from note where id=15; NOTICE: QUERY PLAN: Index Scan using note_id on note (cost=2.05 rows=2 width=87) contabil=> explain select * from note where perioada=15; NOTICE: QUERY PLAN: Seq Scan on note (cost=1099.99 rows=1600 width=87) contabil=> explain select * from note where schema=15; NOTICE: QUERY PLAN: Seq Scan on note (cost=1099.99 rows=432 width=87) That means that searching on "perioada" field don't use "note_perioada" index!!! I know that the query optimisation take care of record lengths, table sizes, index sizes, but I thought that in this case it will use "note_perioada" index. The distribution of "perioada" values within "note" records is like that : contabil=> select perioada,count(*) from note group by perioada; perioada|count --------+----- 4| 2 7| 66 8| 108 9| 135 10| 151 11| 146 12| 4468 13| 3045 14| 3377 15| 3207 16| 3100 17| 3039 18| 1789 19| 1 22| 2 (15 rows) So, I think that PostgreSQL is doing right when he chooses not to use "note_perioada" index for that type of query by comparing different costs (althought it still remains strange at the first look). Is there any chance to speed up that type of simple query (select * from note where perioada=N) ? I dropped the index and try with a "hash" index on the same "perioada" field. The same result. In this case, it seems that the "note_perioada" index will never be used. That means it can be safely dropped without affecting the application performance, isn't it? It is expected that the database will grow in the same manner, with approx. the same nr. of records per "perioada" field every month. Best regards, Please Cc: to teo@flex.ro =============================== Constantin Teodorescu FLEX Consulting Braila, ROMANIA
pgsql-hackers by date: