Re: Disadvantage to CLUSTER? - Mailing list pgsql-general

From Merlin Moncure
Subject Re: Disadvantage to CLUSTER?
Date
Msg-id CAHyXU0xKo6g_2YMoDf9jTjobE8t-ypCUsWrzDOQtKnHuYiENYA@mail.gmail.com
Whole thread Raw
In response to Disadvantage to CLUSTER?  (Robert James <srobertjames@gmail.com>)
List pgsql-general
On Tue, May 15, 2012 at 4:02 PM, Robert James <srobertjames@gmail.com> wrote:
> Besides the one time spent CLUSTERing, do I loose anything by doing it
> for every table?  Does a CLUSTER slow anything down?
>
> It would seem to me that a) a CLUSTER should never have worse
> performance than a random order b) may have better performance and c)
> has the benefits of a VACUUM and REINDEX.  So, I may as well cluster
> every table, if only by the primary key.

CLUSTER has huge disadvantages over VACUUM in that it's a full table
rewrite and (especially) has to take out a full table lock while it
operates.  CLUSTER by the way is not a persistent operation -- table
will drift way from index order over time.  CLUSTER is however a
general improvement over VACUUM FULL which also packs and reorganizes
the table.

For relatively small tales clustering for performance is pointless.
The reason to cluster is an artifact of the way postgres organizes
data into pages and is an attempt to make it so that you can get
related tuples that are frequently queried together physically
grouped.

If your primary key is a surrogate, it may not be that useful to
CLUSTER on it. For natural primary keys, it almost always makes sense
to cluster on the primary key.

merlin

pgsql-general by date:

Previous
From: Robert James
Date:
Subject: Disadvantage to CLUSTER?
Next
From: Merlin Moncure
Date:
Subject: Re: Is there a benefit to CLUSTER when retrieving individual records?