Thread: index and seq scan
Hello List, i have the following table with an index on appid: dbl=# \d test Table "test" Attribute | Type | Modifier -----------+--------------------------+------------------------------------------------- id | integer | not null default nextval('"test_id_seq"'::text) time | timestamp with time zone | appid | integer | Indices: appid_idx, test_id_key dbl=# \d appid_idx Index "appid_idx" Attribute | Type -----------+--------- appid | integer btree Ii want this index to be used in my query, but only the seq. scan is used and i don't know why. When changing the WHERE expression to 'appid < 10', the index is used (see EXPLAIN command below). Could someone please explain this behavior to me and how i can use the index in the first query? dbl=# explain select * from test where appid > 10; NOTICE: QUERY PLAN: Seq Scan on test (cost=0.00..1530.84 rows=81050 width=16) EXPLAIN db=# explain select * from test where appid > 10; NOTICE: QUERY PLAN: Index Scan using appid_idx on test (cost=0.00..70.20 rows=57 width=16) EXPLAIN regards Tina
In the first query explain thinks it is going to match 81050 rows and the planner thinks that this is a large enough number that a sequential scan would be a win. In the second query (which I am guessing you mis-pasted as it still says 'appid > 10') explain thinks it will only be returning 57 rows, and so it uses a index scan. If these values don't look like they are close to reality, then you need to vacuum analyze your table. Otherwise you need to see if the first query would actually be faster using an index scan. Type: SET enable_seqscan TO off; at the psql prompt, and then run your query again. I know from my own experience that sometimes PostgreSQL way too conservative about using index scans. Fortunately it has gotten quite a bit better, and 7.2 promises to be better still. And if worse comes to worse, it is always possible to force PostgreSQL to use an index scan. Jason Tina Messmann <tina.messmann@xinux.de> writes: > Hello List, > > i have the following table with an index on appid: > > dbl=# \d test > Table "test" > Attribute | Type | Modifier > > -----------+--------------------------+------------------------------------------------- > id | integer | not null default > nextval('"test_id_seq"'::text) > time | timestamp with time zone | > appid | integer | > Indices: appid_idx, > test_id_key > > dbl=# \d appid_idx > Index "appid_idx" > Attribute | Type > -----------+--------- > appid | integer > btree > > Ii want this index to be used in my query, but only the seq. scan is > used and i don't know why. > When changing the WHERE expression to 'appid < 10', the index is used > (see EXPLAIN command below). > Could someone please explain this behavior to me and how i can use the > index in the first query? > > dbl=# explain select * from test where appid > 10; > NOTICE: QUERY PLAN: > Seq Scan on test (cost=0.00..1530.84 rows=81050 width=16) > EXPLAIN > > db=# explain select * from test where appid > 10; > NOTICE: QUERY PLAN: > Index Scan using appid_idx on test (cost=0.00..70.20 rows=57 width=16) > EXPLAIN > > regards > Tina > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html
On Thu, 10 Jan 2002, Tina Messmann wrote: > Hello List, > > i have the following table with an index on appid: > > dbl=# \d test > Table "test" > Attribute | Type | Modifier > > -----------+--------------------------+------------------------------------------------- > id | integer | not null default > nextval('"test_id_seq"'::text) > time | timestamp with time zone | > appid | integer | > Indices: appid_idx, > test_id_key > > dbl=# \d appid_idx > Index "appid_idx" > Attribute | Type > -----------+--------- > appid | integer > btree > > Ii want this index to be used in my query, but only the seq. scan is > used and i don't know why. > When changing the WHERE expression to 'appid < 10', the index is used > (see EXPLAIN command below). > Could someone please explain this behavior to me and how i can use the > index in the first query? Have you run vacuum analyze on this table? It looks to me that the estimates seem to say almost all of the rows have appid>10. In general if the estimates are correct (over 99% of the table returned) the index scan is likely to be more expensive in io than the sequence scan. If you run the query with enable_seqscan set to false and without, what are the runtimes? > > dbl=# explain select * from test where appid > 10; > NOTICE: QUERY PLAN: > Seq Scan on test (cost=0.00..1530.84 rows=81050 width=16) > EXPLAIN > > db=# explain select * from test where appid > 10; > NOTICE: QUERY PLAN: > Index Scan using appid_idx on test (cost=0.00..70.20 rows=57 width=16) > EXPLAIN > > regards > Tina > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >
Tina Messmann <tina.messmann@xinux.de> writes: > [ appid < 10 uses an index, appid > 10 does not ] This is not wrong. An indexscan only wins over a sequential scan if the indexscan can scan a fairly small percentage of the table --- the critical percentage varies, but let's say it's 5%. If appid < 10 selects less than 5% of the table, then appid > 10 must select more than 95% of the table. Accordingly, choosing a seqscan for the second query is the right thing to do. regards, tom lane
Hello, uups, the second query should be 'appid < 10', sorry for this. vaccum analyze was performed on the table just before the queries were executed. The estimations are correct: db=# select count(*) from test where appid > 10; count ------- 81023 (1 row) The runtime is not the problem in this case (with seq scan about 1sec, after enabling seq scan and using index scan about 1sec), i was just wondering why Postgres is using seq scan on 'appid > 10' and index scan on 'appid < 10'. Thank you all for the answers. regards Tina