Thread: Indeces vs small tables

Indeces vs small tables

From
Francisco Reyes
Date:
For small tables, less than 100 rows,  does it pay off to have an index?

I have 3 tables:
system table
system    serial
name    varchar(15)

heartbeat table
system     integer references system table
operation integer references operation table
time timestamp

operations table
operation serial
name    varchar(15)


I will be loading some log data to the heartbeat table, but the program
that produce the data will be creating something along the lines:
system1, update
system2, delete

I would use the "name" field on the system table to find the system ID and
then insert that on the heartbeat table. Likewise I would use the name
field on the operations table to find the operation ID and insert that on
the heartbeat table.

For every log entry 1 search will be performed on system and another on
operations. The question is whether an index on those two fields would
help.

Doing some explain select tests always returned a "sequential scan"
regardless of whether I had an index or not. So is it unnecessary to
create indeces for such small tables?




Re: Indeces vs small tables

From
Tom Lane
Date:
Francisco Reyes <lists@natserv.com> writes:
> For small tables, less than 100 rows,  does it pay off to have an index?

The planner is of the opinion that it doesn't ;-)

The planner's cost model says that disk fetches cost way more than
comparison operations, so an index is unlikely to be considered
profitable unless scanning it saves more fetches than it costs.
If the table is only a couple of disk pages in size, there's no
way that doing the extra I/O to read an index (also a couple of
pages) can pay for itself.

On a really heavily used table, this cost model might break down
because the pages would all remain in shared memory anyway.  But
that's how the planner will bet, so you'd have to go out of your
way to persuade it to use the index.

My take on it is that for such a small table, it hardly matters
which plan is chosen...

            regards, tom lane

Re: Indeces vs small tables

From
Francisco Reyes
Date:
On Sun, 5 Aug 2001, Tom Lane wrote:

> Francisco Reyes <lists@natserv.com> writes:
> > For small tables, less than 100 rows,  does it pay off to have an index?
...
> The planner's cost model says that disk fetches cost way more than
> comparison operations, so an index is unlikely to be considered
> profitable unless scanning it saves more fetches than it costs.

> On a really heavily used table, this cost model might break down
> because the pages would all remain in shared memory anyway.
>
> My take on it is that for such a small table, it hardly matters
> which plan is chosen...
>Regards, tom lane

Thanks for the feedback. I am currently designing the tables so only have
a few records. This may be the primary reason for the sequential scan.

Allan Engelhardt, did a quick test and it seems the optimizer did choose
the index. See his message below (minus part of my message removed).

The approach I am planning to take is to create the indices. After a few
days of production operationa and doing regular vacuum analyze then I will do
an explain query to see what the optimizer decides.

-------
Date: Sun, 05 Aug 2001 21:23:22 +0100
From: Allan Engelhardt <allane@cybaea.com>
To: Francisco Reyes <lists@natserv.com>
Newsgroups: comp.databases.postgresql.novice
Subject: Re: Indeces vs small tables

Francisco Reyes wrote:

> For small tables, less than 100 rows,  does it pay off to have an index?
> [...]

It does seem to make a difference on my installation (similar to your
example, I think?):

test=# create table system as select relname from pg_class;
SELECT
 test=# alter table system add column system serial; -- doesn't really do much....
ALTER
test=# explain select * from system where relname = 'foo';
NOTICE:  QUERY PLAN:

Seq Scan on system  (cost=0.00..22.50 rows=10 width=36)

EXPLAIN
test=# create unique index system_foo on system(relname);
CREATE
test=# explain select * from system where relname = 'foo';
NOTICE:  QUERY PLAN:

Index Scan using system_foo on system  (cost=0.00..2.01 rows=1 width=36)
EXPLAIN
test=#

Did you not do a VACUUM ANALYZE after your 100 INSERTs or something?
--- Allan.