Thread: Indexes works only on miss

Indexes works only on miss

From
Sebastjan Trepca
Date:
Hi,

I really don't understand this behaviour. I have a table with column "owner" on which I created an index with btree method. The table contains around 3k rows.

Now I run it using EXPLAIN command.

This query has some results:

explain SELECT "Name" FROM test WHERE "Owner"='root'

"Seq Scan on test  (cost=0.00..119.11 rows=263 width=11)"
"  Filter: (("Owner")::text = 'root'::text)"

Query without results:

explain SELECT "Name" FROM test WHERE "Owner"='blah'

"Index Scan using idx_test_owner on test  (cost=0.00..96.56 rows=28 width=11)"
"  Index Cond: (("Owner")::text = 'blah'::text)"

Why is this happening? Is it because of the memory? I'm running on default db settings, version 8.0 and SUSE 10.

Thanks for help, Sebastjan


Re: Indexes works only on miss

From
Tom Lane
Date:
Sebastjan Trepca <trepca@gmail.com> writes:
> "Seq Scan on test  (cost=0.00..119.11 rows=263 width=11)"
> "  Filter: (("Owner")::text = 'root'::text)"

> "Index Scan using idx_test_owner on test  (cost=0.00..96.56 rows=28
> width=11)"
> "  Index Cond: (("Owner")::text = 'blah'::text)"

> Why is this happening?

This is the planner doing what it is supposed to do, ie, decide whether
the index is selective enough to be worth using.

            regards, tom lane

Re: Indexes works only on miss

From
Michael Fuhr
Date:
On Fri, Jan 06, 2006 at 05:42:41PM +0100, Sebastjan Trepca wrote:
> I really don't understand this behaviour. I have a table with column "owner"
> on which I created an index with btree method. The table contains around 3k
> rows.
>
> Now I run it using EXPLAIN command.

Please post the EXPLAIN ANALYZE output -- that'll show how accurate
the planner's estimates are.

Has the table been vacuumed and analyzed lately?

>This query has some results:
>
> explain SELECT "Name" FROM test WHERE "Owner"='root'
>
> "Seq Scan on test  (cost=0.00..119.11 rows=263 width=11)"
> "  Filter: (("Owner")::text = 'root'::text)"

The planner estimates that this query will return 263 rows; apparently
that's enough of the table that the planner thinks a sequence scan
would be faster than an index scan.  An index scan has to hit the
index *and* the table, so it's a fallacy to assume that an index
scan will always be faster.  You can play with enable_seqscan to
see if an index scan would indeed be faster.  For example:

SET enable_seqscan TO off;
EXPLAIN ANALYZE SELECT ...
SET enable_seqscan TO on;
EXPLAIN ANALYZE SELECT ...

Be aware of disk caching when comparing execution times for different
queries: one query might be slower than another not because of a
less efficient plan, but rather because it had to fetch data from
disk and the "faster" query then took advantage of that cached data.
Run each query several times to allow for this.

> Query without results:
>
> explain SELECT "Name" FROM test WHERE "Owner"='blah'
>
> "Index Scan using idx_test_owner on test  (cost=0.00..96.56 rows=28
> width=11)"
> "  Index Cond: (("Owner")::text = 'blah'::text)"

The planner estimates that this query will return 28 rows, which
makes it more likely that an index scan would be faster because
that's a much smaller percentage of the table.

> Why is this happening? Is it because of the memory? I'm running on default
> db settings, version 8.0 and SUSE 10.

You can use various tuning guides to help adjust your settings.
Here are a couple of links:

http://www.powerpostgresql.com/PerfList
http://www.revsys.com/writings/postgresql-performance.html

Configuration settings can lead the planner to favor index scans,
but as I mentioned earlier, and index scan isn't always faster than
a sequential scan.

--
Michael Fuhr

Re: Indexes works only on miss

From
Sebastjan Trepca
Date:
Thank you for exhaustive explanation, this is the output with analyze :

"Seq Scan on test  (cost=0.00..120.67 rows=627 width=11) (actual time=0.018..5.467 rows=621 loops=1)"
"  Filter: (("Owner")::text = 'root'::text)"
"Total runtime: 7.288 ms"

"Index Scan using idx_test_owner on test  (cost=0.00..96.75 rows=28 width=11) (actual time=0.025..0.025 rows=0 loops=1)"
"  Index Cond: (("Owner")::text = 'blah'::text)"
"Total runtime: 0.064 ms"

I tried a search term in query with less rows and it used index scan too. I gues I didn't understand how indexes work.

Thanks for help, Sebastjan


On 1/6/06, Michael Fuhr <mike@fuhr.org> wrote:
On Fri, Jan 06, 2006 at 05:42:41PM +0100, Sebastjan Trepca wrote:
> I really don't understand this behaviour. I have a table with column "owner"
> on which I created an index with btree method. The table contains around 3k
> rows.
>
> Now I run it using EXPLAIN command.

Please post the EXPLAIN ANALYZE output -- that'll show how accurate
the planner's estimates are.

Has the table been vacuumed and analyzed lately?

>This query has some results:
>
> explain SELECT "Name" FROM test WHERE "Owner"='root'
>
> "Seq Scan on test  (cost=0.00..119.11 rows=263 width=11)"
> "  Filter: (("Owner")::text = 'root'::text)"

The planner estimates that this query will return 263 rows; apparently
that's enough of the table that the planner thinks a sequence scan
would be faster than an index scan.  An index scan has to hit the
index *and* the table, so it's a fallacy to assume that an index
scan will always be faster.  You can play with enable_seqscan to
see if an index scan would indeed be faster.  For example:

SET enable_seqscan TO off;
EXPLAIN ANALYZE SELECT ...
SET enable_seqscan TO on;
EXPLAIN ANALYZE SELECT ...

Be aware of disk caching when comparing execution times for different
queries: one query might be slower than another not because of a
less efficient plan, but rather because it had to fetch data from
disk and the "faster" query then took advantage of that cached data.
Run each query several times to allow for this.

> Query without results:
>
> explain SELECT "Name" FROM test WHERE "Owner"='blah'
>
> "Index Scan using idx_test_owner on test  (cost=0.00..96.56 rows=28
> width=11)"
> "  Index Cond: (("Owner")::text = 'blah'::text)"

The planner estimates that this query will return 28 rows, which
makes it more likely that an index scan would be faster because
that's a much smaller percentage of the table.

> Why is this happening? Is it because of the memory? I'm running on default
> db settings, version 8.0 and SUSE 10.

You can use various tuning guides to help adjust your settings.
Here are a couple of links:

http://www.powerpostgresql.com/PerfList
http://www.revsys.com/writings/postgresql-performance.html

Configuration settings can lead the planner to favor index scans,
but as I mentioned earlier, and index scan isn't always faster than
a sequential scan.

--
Michael Fuhr

Re: Indexes works only on miss

From
"Jim C. Nasby"
Date:
You might need to increase the statistics target (for that table or for
the cluster), and/or decrease random_page_cost (most folks find
something between 2 and 3 to perform the best).

On Fri, Jan 06, 2006 at 07:07:54PM +0100, Sebastjan Trepca wrote:
> Thank you for exhaustive explanation, this is the output with analyze :
>
> "Seq Scan on test  (cost=0.00..120.67 rows=627 width=11) (actual time=
> 0.018..5.467 rows=621 loops=1)"
> "  Filter: (("Owner")::text = 'root'::text)"
> "Total runtime: 7.288 ms"
>
> "Index Scan using idx_test_owner on test  (cost=0.00..96.75 rows=28
> width=11) (actual time=0.025..0.025 rows=0 loops=1)"
> "  Index Cond: (("Owner")::text = 'blah'::text)"
> "Total runtime: 0.064 ms"
>
> I tried a search term in query with less rows and it used index scan too. I
> gues I didn't understand how indexes work.
>
> Thanks for help, Sebastjan
>
>
> On 1/6/06, Michael Fuhr <mike@fuhr.org> wrote:
> >
> > On Fri, Jan 06, 2006 at 05:42:41PM +0100, Sebastjan Trepca wrote:
> > > I really don't understand this behaviour. I have a table with column
> > "owner"
> > > on which I created an index with btree method. The table contains around
> > 3k
> > > rows.
> > >
> > > Now I run it using EXPLAIN command.
> >
> > Please post the EXPLAIN ANALYZE output -- that'll show how accurate
> > the planner's estimates are.
> >
> > Has the table been vacuumed and analyzed lately?
> >
> > >This query has some results:
> > >
> > > explain SELECT "Name" FROM test WHERE "Owner"='root'
> > >
> > > "Seq Scan on test  (cost=0.00..119.11 rows=263 width=11)"
> > > "  Filter: (("Owner")::text = 'root'::text)"
> >
> > The planner estimates that this query will return 263 rows; apparently
> > that's enough of the table that the planner thinks a sequence scan
> > would be faster than an index scan.  An index scan has to hit the
> > index *and* the table, so it's a fallacy to assume that an index
> > scan will always be faster.  You can play with enable_seqscan to
> > see if an index scan would indeed be faster.  For example:
> >
> > SET enable_seqscan TO off;
> > EXPLAIN ANALYZE SELECT ...
> > SET enable_seqscan TO on;
> > EXPLAIN ANALYZE SELECT ...
> >
> > Be aware of disk caching when comparing execution times for different
> > queries: one query might be slower than another not because of a
> > less efficient plan, but rather because it had to fetch data from
> > disk and the "faster" query then took advantage of that cached data.
> > Run each query several times to allow for this.
> >
> > > Query without results:
> > >
> > > explain SELECT "Name" FROM test WHERE "Owner"='blah'
> > >
> > > "Index Scan using idx_test_owner on test  (cost=0.00..96.56 rows=28
> > > width=11)"
> > > "  Index Cond: (("Owner")::text = 'blah'::text)"
> >
> > The planner estimates that this query will return 28 rows, which
> > makes it more likely that an index scan would be faster because
> > that's a much smaller percentage of the table.
> >
> > > Why is this happening? Is it because of the memory? I'm running on
> > default
> > > db settings, version 8.0 and SUSE 10.
> >
> > You can use various tuning guides to help adjust your settings.
> > Here are a couple of links:
> >
> > http://www.powerpostgresql.com/PerfList
> > http://www.revsys.com/writings/postgresql-performance.html
> >
> > Configuration settings can lead the planner to favor index scans,
> > but as I mentioned earlier, and index scan isn't always faster than
> > a sequential scan.
> >
> > --
> > Michael Fuhr
> >

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461