Thread: Index doesn't appear to be working.

Index doesn't appear to be working.

From
"John Oakes"
Date:
I have a table with a column titled 'passfail' that only contains either a P
for pass or an F for fail.  The table name is 'one'.  I created the index on
the table with:

CREATE INDEX one_passfail_idx ON one USING btree (passfail);

I then do:

VACUUM ANALYZE one;

Then I do an explain on this query:

SELECT * FROM one where passfail = 'P';

and it tells me:

Seq Scan on one (cost=0.00..263.02 rows=5613 width=56)

Shouldn't it tell me

Index Scan using one_passfail_idx on one?

Why isn't it using the index?  This query returns about 5,600 of 10,000
records.  It is faster to just do a seq scan because it returns such a high
percentage or records?  Thanks, I appreciate the help!

John Oakes





Re: Index doesn't appear to be working.

From
Stephan Szabo
Date:
On Sat, 2 Mar 2002, John Oakes wrote:

> I have a table with a column titled 'passfail' that only contains either a P
> for pass or an F for fail.  The table name is 'one'.  I created the index on
> the table with:
>
> CREATE INDEX one_passfail_idx ON one USING btree (passfail);
>
> I then do:
>
> VACUUM ANALYZE one;
>
> Then I do an explain on this query:
>
> SELECT * FROM one where passfail = 'P';
>
> and it tells me:
>
> Seq Scan on one (cost=0.00..263.02 rows=5613 width=56)
>
> Shouldn't it tell me
>
> Index Scan using one_passfail_idx on one?
>
> Why isn't it using the index?  This query returns about 5,600 of 10,000
> records.  It is faster to just do a seq scan because it returns such a high
> percentage or records?  Thanks, I appreciate the help!

Usually the seq scan will be faster in this kind of situation.  Because
the system needs to load the matching rows anyway, it's likely to result
in reading all the blocks of the table and paying a penalty for seeking
around the file.




Re: Index doesn't appear to be working.

From
Bruce Momjian
Date:
> > Why isn't it using the index?  This query returns about 5,600 of 10,000
> > records.  It is faster to just do a seq scan because it returns such a high
> > percentage or records?  Thanks, I appreciate the help!
> 
> Usually the seq scan will be faster in this kind of situation.  Because
> the system needs to load the matching rows anyway, it's likely to result
> in reading all the blocks of the table and paying a penalty for seeking
> around the file.

We are getting this "no index use" question over and over again, and I
was wondering why the FAQ item was not answering this question for
people.  It turns out the wording was not very clear and I have updated
this FAQ item to more clearly answer the question.  Seems I kept adding
to that item without restructuring the information:

   <H4><A name="4.8">4.8</A>) My queries are slow or don't make use of   the indexes. Why?</H4>
   Indexes are not automatically used by every query. Indexes are only   used if the table is larger than a minimum
size,and the index   selects only a small percentage of the rows in the table. This is   because the random disk access
causedby an index scan is sometimes   slower than a straight read through the table, or sequential scan.      <P>To
determineif an index should be used, PostgreSQL must have   statistics about the table. These statistics are collected
using  <SMALL>VACUUM ANALYZE</SMALL>, or simply <SMALL>ANALYZE</SMALL>.   Using statistics, the optimizer knows how
manyrows are in the   table, and can better determine if indexes should be used.   Statistics are also valuable in
determiningoptimal join order and   join methods. Statistics collection should be performed periodically   as the
contentsof the table change.
 
   <P>Indexes are normally not used for <SMALL>ORDER BY</SMALL> or to   perform joins. A sequential scan followed by an
explicitsort is   usually faster than an index scan of a large table.</P> However,   <SMALL>LIMIT</SMALL> combined with
<SMALL>ORDERBY</SMALL> often   will use an index because only a small portion of the table is   returned.
 
   <P>When using wild-card operators such as <SMALL>LIKE</SMALL> or   <I>~</I>, indexes can only be used if the
beginningof the search is   anchored to the start of the string. Therefore, to use indexes,   <SMALL>LIKE</SMALL>
patternsmust not start with <I>%</I>, and   <I>~</I>(regular expression) patterns must start with <I>^</I>.</P>
 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Index doesn't appear to be working.

From
"Wei Weng"
Date:
Is there any plan of making Postgresql collecting statistics a little more
automatically?


-----Original Message-----
From: pgsql-sql-owner@postgresql.org
[mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Bruce Momjian
Sent: Sunday, March 03, 2002 11:03 AM
To: Stephan Szabo
Cc: John Oakes; pgsql-sql@postgresql.org
Subject: Re: [SQL] Index doesn't appear to be working.


> > Why isn't it using the index?  This query returns about 5,600 of 10,000
> > records.  It is faster to just do a seq scan because it returns such a
high
> > percentage or records?  Thanks, I appreciate the help!
>
> Usually the seq scan will be faster in this kind of situation.  Because
> the system needs to load the matching rows anyway, it's likely to result
> in reading all the blocks of the table and paying a penalty for seeking
> around the file.

We are getting this "no index use" question over and over again, and I
was wondering why the FAQ item was not answering this question for
people.  It turns out the wording was not very clear and I have updated
this FAQ item to more clearly answer the question.  Seems I kept adding
to that item without restructuring the information:

   <H4><A name="4.8">4.8</A>) My queries are slow or don't make use of   the indexes. Why?</H4>
   Indexes are not automatically used by every query. Indexes are only   used if the table is larger than a minimum
size,and the index   selects only a small percentage of the rows in the table. This is   because the random disk access
causedby an index scan is sometimes   slower than a straight read through the table, or sequential scan.
 
   <P>To determine if an index should be used, PostgreSQL must have   statistics about the table. These statistics are
collectedusing   <SMALL>VACUUM ANALYZE</SMALL>, or simply <SMALL>ANALYZE</SMALL>.   Using statistics, the optimizer
knowshow many rows are in the   table, and can better determine if indexes should be used.   Statistics are also
valuablein determining optimal join order and   join methods. Statistics collection should be performed periodically
asthe contents of the table change.
 
   <P>Indexes are normally not used for <SMALL>ORDER BY</SMALL> or to   perform joins. A sequential scan followed by an
explicitsort is   usually faster than an index scan of a large table.</P> However,   <SMALL>LIMIT</SMALL> combined with
<SMALL>ORDERBY</SMALL> often   will use an index because only a small portion of the table is   returned.
 
   <P>When using wild-card operators such as <SMALL>LIKE</SMALL> or   <I>~</I>, indexes can only be used if the
beginningof the search is   anchored to the start of the string. Therefore, to use indexes,   <SMALL>LIKE</SMALL>
patternsmust not start with <I>%</I>, and   <I>~</I>(regular expression) patterns must start with <I>^</I>.</P>
 

-- Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command   (send "unregister YourEmailAddressHere" to
majordomo@postgresql.org)



Re: Index doesn't appear to be working.

From
Bruce Momjian
Date:
> Is there any plan of making Postgresql collecting statistics a little
> more automatically?

Well, most of the commerical OS's require some manual intervention.  I
don't know how we could do any better than we are doing now, but we are
open to suggestions.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Index doesn't appear to be working.

From
Frank Bax
Date:
At 11:02 AM 03/03/02 -0500, Bruce Momjian wrote:
>     Indexes are not automatically used by every query. Indexes are only
>     used if the table is larger than a minimum size, and the index
>     selects only a small percentage of the rows in the table.


Shouldn't it say:        "and the *query* selects only a small percentage..."



Re: Index doesn't appear to be working.

From
Bruce Momjian
Date:
Frank Bax wrote:
> At 11:02 AM 03/03/02 -0500, Bruce Momjian wrote:
> >     Indexes are not automatically used by every query. Indexes are only
> >     used if the table is larger than a minimum size, and the index
> >     selects only a small percentage of the rows in the table.
> 
> 
> Shouldn't it say:
>          "and the *query* selects only a small percentage..."

Yes, I meant "index returns only a small percentage", but query selects
is better.  Thanks.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026