Thread: Re: [ADMIN] Index not used. WHY?

Re: [ADMIN] Index not used. WHY?

From

Stephan Szabo

Date:

04 December 2003, 11:21:29

On Thu, 4 Dec 2003, Andrei Bintintan wrote:

> Hi,
>
> I have the following table:
> CREATE TABLE public.rights (
> id int4 DEFAULT nextval('"rights_id_seq"'::text) NOT NULL,
> id_user int4 NOT NULL,
> id_modull int4 NOT NULL,
> CONSTRAINT rights_pkey PRIMARY KEY (id)
> )
>
> and I created the following indexes:
>
> CREATE INDEX right_id_modull_idx ON rights USING btree (id_modull);
> CREATE INDEX right_id_user_idx ON rights USING btree (id_user);
>
> Now the problem:
>
> EXPLAIN SELECT * FROM rights r WHERE r.id_modull =15
> returnes:
> Seq Scan on rights r (cost=0.00..12.30 rows=42 width=12)
> Filter: (id_modull = 15)
>
> EXPLAIN SELECT * FROM rights r WHERE r.id_user =15
> returnes:
> Index Scan using right_id_user_idx on rights r (cost=0.00..8.35 rows=11 width=12)
> Index Cond: (id_user = 15)
>
> Question: Why the right_id_modull_idx is NOT USED at the 1st query and
> the second query the right_id_user_idx index is used.

As a note, pgsql-performance is a better list for these questions.

So, standard questions:

How many rows are in the table, what does EXPLAIN ANALYZE show for the
queries, if you force index usage (set enable_seqscan=off) on the first
what does EXPLAIN ANALYZE show then, have you used ANALYZE/VACUUM ANALYZE
recently?

Re: [ADMIN] Index not used. WHY?

From

"Andrei Bintintan"

Date:

05 December 2003, 04:09:16

There are around 700 rows in this table.
If I set enable_seqscan=off then the index is used and I also used Vacuum
Analyze recently.

I find it strange because the number of values of id_user and id_modull are
somehow in the same distribution and when I search the table the id_user
index is used but the id_modull index is not used.

Does somehow postgre know that a seq scan runs faster in this case as a
index scan? Should I erase this index?
I have to say that the data's in this table are not changed offen, but there
are a LOT of joins made with this table.

Best regards.
Andy.


----- Original Message -----
From: "Stephan Szabo" <sszabo@megazone.bigpanda.com>
To: "Andrei Bintintan" <klodoma@ar-sd.net>
Cc: <pgsql-admin@postgresql.org>; <pgsql-performance@postgresql.org>
Sent: Thursday, December 04, 2003 5:19 PM
Subject: Re: [ADMIN] Index not used. WHY?


>
> On Thu, 4 Dec 2003, Andrei Bintintan wrote:
>
> > Hi,
> >
> > I have the following table:
> > CREATE TABLE public.rights (
> > id int4 DEFAULT nextval('"rights_id_seq"'::text) NOT NULL,
> > id_user int4 NOT NULL,
> > id_modull int4 NOT NULL,
> > CONSTRAINT rights_pkey PRIMARY KEY (id)
> > )
> >
> > and I created the following indexes:
> >
> > CREATE INDEX right_id_modull_idx ON rights USING btree (id_modull);
> > CREATE INDEX right_id_user_idx ON rights USING btree (id_user);
> >
> > Now the problem:
> >
> > EXPLAIN SELECT * FROM rights r WHERE r.id_modull =15
> > returnes:
> > Seq Scan on rights r (cost=0.00..12.30 rows=42 width=12)
> > Filter: (id_modull = 15)
> >
> > EXPLAIN SELECT * FROM rights r WHERE r.id_user =15
> > returnes:
> > Index Scan using right_id_user_idx on rights r (cost=0.00..8.35 rows=11
width=12)
> > Index Cond: (id_user = 15)
> >
> > Question: Why the right_id_modull_idx is NOT USED at the 1st query and
> > the second query the right_id_user_idx index is used.
>
> As a note, pgsql-performance is a better list for these questions.
>
> So, standard questions:
>
> How many rows are in the table, what does EXPLAIN ANALYZE show for the
> queries, if you force index usage (set enable_seqscan=off) on the first
> what does EXPLAIN ANALYZE show then, have you used ANALYZE/VACUUM ANALYZE
> recently?
>

Re: [ADMIN] Index not used. WHY?

From

Shridhar Daithankar

Date:

05 December 2003, 04:40:38

Andrei Bintintan wrote:

> There are around 700 rows in this table.
> If I set enable_seqscan=off then the index is used and I also used Vacuum
> Analyze recently.

For 700 rows I think seq. would work best.
>
> I find it strange because the number of values of id_user and id_modull are
> somehow in the same distribution and when I search the table the id_user
> index is used but the id_modull index is not used.
>
> Does somehow postgre know that a seq scan runs faster in this case as a
> index scan? Should I erase this index?
> I have to say that the data's in this table are not changed offen, but there
> are a LOT of joins made with this table.

If table is cached then it does not matter. Unless it grows substantially, say
to around hundred thousand rows(Note your table is small), idex wouldn't be that
useful.

  Shridhar

Re: [ADMIN] Index not used. WHY?

From

Stephan Szabo

Date:

05 December 2003, 11:35:34

On Fri, 5 Dec 2003, Andrei Bintintan wrote:

> There are around 700 rows in this table.
> If I set enable_seqscan=off then the index is used and I also used Vacuum
> Analyze recently.
>
> I find it strange because the number of values of id_user and id_modull are
> somehow in the same distribution and when I search the table the id_user
> index is used but the id_modull index is not used.

It was guessing that one would return 11 rows and the other 42 which is
why one used the index and the other wouldn't.  If those numbers aren't
realistic, you may want to raise the statistics target for the columns
(see ALTER TABLE) and re-run analyze.

> Does somehow postgre know that a seq scan runs faster in this case as a
> index scan? Should I erase this index?

It's making an educated guess.  When you're doing an index scan, it needs
to read through the index and then get matching rows from the table.
However, because those reads from the table are in a potentially random
order, there's usually a higher cost associated with those reads than if
the table was read in order (barring cases where you know your database
should always stay cached in disk cache, etc...).  If there's say 50 pages
in the entire table, a sequence scan does 50 sequential page reads and is
checking all those tuples.  If you're getting say 42 rows through an
index, you're first reading through the index, and then getting <n> pages
in a random order from the table where <n> depends on the distribution of
values throughout the table.  There's a variable in the configuration,
random_page_cost which controls the ratio of cost between a sequential
read and a random one (defaulting to 4).