Thread: Questions regarding indexes

Questions regarding indexes

From
Bruno Lavoie
Date:
Hello,

are these statements true:
  • «You should always index fks. The only exception is when the matching unique or primary key is never updated or deleted» ?
  • «Small tables do not require indexes» ?
  • «Create an index if you frequently want to retrieve less than about ~15% of the rows in a large table»
  • «Index only columns with unique values, or only with a few duplicates»
Is it necessary to run ANALYZE on the table after any index creation to gather statistics about the index/column?

Thanks
Bruno Lavoie


Re: Questions regarding indexes

From
"Scott Marlowe"
Date:
On Fri, Jan 16, 2009 at 8:47 AM, Bruno Lavoie <bruno.lavoie@gmail.com> wrote:
> Hello,
>
> are these statements true:
>
> «You should always index fks. The only exception is when the matching unique
> or primary key is never updated or deleted» ?

No.  If the table that fks to another table has 10 rows and will never
have more than a few hundred, why index it?

> «Small tables do not require indexes» ?

Usually true, except for unique indexes, which provide more
functionality than just indexing.

Re: Questions regarding indexes

From
"Scott Marlowe"
Date:
On Fri, Jan 16, 2009 at 8:47 AM, Bruno Lavoie <bruno.lavoie@gmail.com> wrote:
> Hello,
>
> are these statements true:

Got interrupted by a coworker...  The other two questions:

> «Create an index if you frequently want to retrieve less than about ~15% of
> the rows in a large table»

PostgreSQL tends to switch to seq scans faster than other dbs due to
the fact that tuple visibility isn't stored in the indexes.  The
switchover point tends to be lower than 15% where it changes from
index to seq scan.  Typically it's in the 2 to 5% range, depending
very much on how you've configured postgresql.conf.

> «Index only columns with unique values, or only with a few duplicates»

You can also look at partial and / or functional indexes for some
apps.  For instance, if you have a table with a boolean field, and
it's 99.9% true, you can index only the false entries.  If you get a
lot of another field when the bool is false, then you can create an
index on that field. i.e.:

create index xyz on abc (intfield) where boolfield is false;

> Is it necessary to run ANALYZE on the table after any index creation to
> gather statistics about the index/column?

The order doesn't matter.  Analyze doesn't know anything about the
indexes, it knows about the fields / tables.  I.e. if you run analyze,
then create the index, you get the same basic result as if you create
the index then run analyze.  The time to run analyze is when a good
percentage of the table values have been changed.  Autovacuum will do
this for you, btw, but it might take a few minutes to kick in and see
the change in your query plans.

Re: Questions regarding indexes

From
Tom Lane
Date:
"Scott Marlowe" <scott.marlowe@gmail.com> writes:
> The order doesn't matter.  Analyze doesn't know anything about the
> indexes, it knows about the fields / tables.  I.e. if you run analyze,
> then create the index, you get the same basic result as if you create
> the index then run analyze.

There is an exception to that statement, which is indexes on functions
(expressions).  Existence of such an index prompts analyze to gather
stats about the indexed expression.  For plain column indexes, you're
right, it doesn't matter.

            regards, tom lane