Re: What determines the cost of an index scan? - Mailing list pgsql-general

From Craig Ringer
Subject Re: What determines the cost of an index scan?
Date
Msg-id 49618B95.4060503@postnewspapers.com.au
Whole thread Raw
In response to Re: What determines the cost of an index scan?  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-general
Gregory Stark wrote:
> Christian Schröder <cs@deriva.de> writes:
>
>> Where does this difference come from? Pure cpu performance? Do the additional
>> fields in the productive database have an impact on the performance? Or do I
>> miss something?
>
> Sure, more data takes more time to process.
>
> Other factors which could affect it is how much dead space is in the table due
> to previous updates and deletes, as well as how fragmented the indexes have
> become over time.

In other words: on the production server you might want to VACUUM FULL,
REINDEX, and test again. Even better, use CLUSTER rather than VACUUM
FULL, since it'll be faster and will order your data on disk according
to your selected index (usually the primary key) as well.

REINDEX, CLUSTER, and VACUUM FULL will completely prevent all access to
the table being operated on while they run, as they all take an ACCESS
EXCLUSIVE lock. See:
http://www.postgresql.org/docs/8.1/static/explicit-locking.html

As a result you'll want to schedule a downtime window or at least do it
while the affected tables aren't needed. You should also check if you
actually need to do it first. You can get an estimate of table bloat
from the catalog data with a bit of massaging. See:

http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html

http://www.postgresql.org/docs/current/static/functions-admin.html#FUNCTIONS-ADMIN-DBSIZE

(It'd be REALLY, REALLY useful to have functions something like
pg_table_bloat(tablename) and pg_index_bloat(indexname) as part of Pg,
actually).

Note that under normal circumstances you should not run VACUUM FULL.
However, it can be useful if your tables have become really bloated due
to insufficient fsm_map space, infrequent VACUUMing, etc. If you do run
it, it's probably wise to also run REINDEX on the table(s) you ran
VACUUM FULL on.

--
Craig Ringer

pgsql-general by date:

Previous
From: Craig Ringer
Date:
Subject: Re: Per-user schemas with inherited skeleton.
Next
From: "Jolles, Peter M (GE Infra, Energy)"
Date:
Subject: Re: Timestamp shift when importing data