Thread: is this a bug or do I not understand the query planner?
two almost identical queries: one searches for read in ('N','n') and the other searches for read in ('Y','y'). the (explain) SQL statement says that one uses the index on the (read) field and the other does a sequential table scan. Why!!!???? I can think of no logical reason for this behavior. usenet=# \d+ article Table "public.article" Column | Type | Modifiers | Description ---------+------------------------+------------------------------+------------- msg | integer | not null | thedate | date | not null | subject | character varying(300) | | lines | integer | not null default 0 | read | character(1) | not null default 'N'::bpchar | ng | integer | not null default 0 | author | integer | not null default 0 | Indexes: "article_pkey" PRIMARY KEY, btree (msg) "article_read" btree ("read") Has OIDs: yes usenet=# explain select * from article where read in ('Y','y'); QUERY PLAN -------------------------------------------------------------------------------------------- Index Scan using article_read, article_read on article (cost=0.00..4.03 rows=1 width=107) Index Cond: (("read" = 'Y'::bpchar) OR ("read" = 'y'::bpchar)) (2 rows) usenet=# explain select * from article where read in ('N','n'); QUERY PLAN ------------------------------------------------------------------- Seq Scan on article (cost=0.00..68661.02 rows=2018135 width=107) Filter: (("read" = 'N'::bpchar) OR ("read" = 'n'::bpchar)) (2 rows) __________________________________ Yahoo! FareChase: Search multiple travel sites in one click. http://farechase.yahoo.com
"Rob Prowel" <tempest766@yahoo.com> wrote > two almost identical queries: one searches for > > read in ('N','n') > > and the other searches for > > read in ('Y','y'). > > the (explain) SQL statement says that one uses the > index on the (read) field and the other does a > sequential table scan. Why!!!???? With big chance, it is not a bug. As your explain indicates, Yy query is estimated only return 1 row while Nn query is estimated to return 2018135 rows. So for the latter, compared to your table size(I guess), an seqscan is more suitable. Try "explain analyze" to see if the query optimizer gets a right guess. If not, run "vacuum full analyze", then check again. If still not, you may want to post the results here. Regards, Qingqing
Rob Prowel wrote: > two almost identical queries: one searches for > read in ('N','n') > and the other searches for > read in ('Y','y'). > > the (explain) SQL statement says that one uses the > index on the (read) field and the other does a > sequential table scan. Why!!!???? I can think of no > logical reason for this behavior. Imagine a table with one million 'Y' and one 'N'. Searching for the first should not use an index, searching for the second should. A similar case probably applies here. I hope that is logical enough for you. :) In general, if you disagree with a plan choice, you should provide evidence that the chosen plan is in practice worse then the one you would have preferred. -- Peter Eisentraut http://developer.postgresql.org/~petere/