Thread: is this a bug or do I not understand the query planner?

is this a bug or do I not understand the query planner?

From
Rob Prowel
Date:
two almost identical queries: one searches for

read in ('N','n')

and the other searches for

read in ('Y','y').

the (explain) SQL statement says that one uses the
index on the (read) field and the other does a
sequential table scan.  Why!!!????  I can think of no
logical reason for this behavior.



usenet=# \d+ article
                            Table "public.article"
 Column  |          Type          |          Modifiers
      | Description
---------+------------------------+------------------------------+-------------
 msg     | integer                | not null
          |
 thedate | date                   | not null
          |
 subject | character varying(300) |
          |
 lines   | integer                | not null default 0
          |
 read    | character(1)           | not null default
'N'::bpchar |
 ng      | integer                | not null default 0
          |
 author  | integer                | not null default 0
          |
Indexes:
    "article_pkey" PRIMARY KEY, btree (msg)
    "article_read" btree ("read")
Has OIDs: yes

usenet=# explain select * from article where read in
('Y','y');
                                         QUERY PLAN

--------------------------------------------------------------------------------------------
 Index Scan using article_read, article_read on
article  (cost=0.00..4.03 rows=1 width=107)
   Index Cond: (("read" = 'Y'::bpchar) OR ("read" =
'y'::bpchar))
(2 rows)

usenet=# explain select * from article where read in
('N','n');
                            QUERY PLAN

-------------------------------------------------------------------
 Seq Scan on article  (cost=0.00..68661.02
rows=2018135 width=107)
   Filter: (("read" = 'N'::bpchar) OR ("read" =
'n'::bpchar))
(2 rows)





__________________________________
Yahoo! FareChase: Search multiple travel sites in one click.
http://farechase.yahoo.com

Re: is this a bug or do I not understand the query planner?

From
"Qingqing Zhou"
Date:
"Rob Prowel" <tempest766@yahoo.com> wrote
> two almost identical queries: one searches for
>
> read in ('N','n')
>
> and the other searches for
>
> read in ('Y','y').
>
> the (explain) SQL statement says that one uses the
> index on the (read) field and the other does a
> sequential table scan.  Why!!!????

With big chance, it is not a bug. As your explain indicates, Yy query is
estimated only return 1 row while Nn query is estimated to return 2018135
rows. So for the latter, compared to your table size(I guess), an seqscan is
more suitable.

Try "explain analyze" to see if the query optimizer gets a right guess. If
not, run "vacuum full analyze", then check again. If still not, you may want
to post the results here.

Regards,
Qingqing

Re: is this a bug or do I not understand the query planner?

From
Peter Eisentraut
Date:
Rob Prowel wrote:
> two almost identical queries: one searches for
> read in ('N','n')
> and the other searches for
> read in ('Y','y').
>
> the (explain) SQL statement says that one uses the
> index on the (read) field and the other does a
> sequential table scan.  Why!!!????  I can think of no
> logical reason for this behavior.

Imagine a table with one million 'Y' and one 'N'.  Searching for the
first should not use an index, searching for the second should.  A
similar case probably applies here.  I hope that is logical enough for
you. :)

In general, if you disagree with a plan choice, you should provide
evidence that the chosen plan is in practice worse then the one you
would have preferred.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/