Re: How frequently to defrag(cluster) - Mailing list pgsql-admin

From Steve Crawford
Subject Re: How frequently to defrag(cluster)
Date
Msg-id 4E2745B0.1010405@pinpointresearch.com
Whole thread Raw
In response to Re: How frequently to defrag(cluster)  (Steve Crawford <scrawford@pinpointresearch.com>)
List pgsql-admin
On 07/20/2011 02:04 PM, Steve Crawford wrote:
On 07/20/2011 12:58 PM, A J wrote:
I understand that 'cluster' performs the role of defrag ...
As with everything the answer is "it depends". For a "typical" workload where the rows updated by a single query are one or a few rowsl, the automatic vacuum process should handle everything for you without intervention (some tuning of the vacuum settings may be in order but it runs by itself).


Also, there are some workloads that have queries that grab chunks of data, say all records for a given date, where keeping the associated data physically close can improve performance.
I should have explained better. Cluster physically orders the data in the table in the same order as the index selected for the cluster. It is this physical reordering of data that can improve performance by keeping the data you are likely to retrieve in the same physical area. (Yes, the data can be scattered by the OS but clustering still helps). Also note that as you perform updates, the data will become more and more unordered. If your workload benefits substantially from clustering, it will have to be done periodically based on your workload and observed performance degradation.

Another place where cluster is useful is following deletion of large chunks of data as when archiving. For many workloads, however, table partitioning using parent/child tables organized so that archiving can be as simple as backing up then dropping a child table can be preferable.

Cheers,
Steve

pgsql-admin by date:

Previous
From: Steve Crawford
Date:
Subject: Re: How frequently to defrag(cluster)
Next
From: Johann Spies
Date:
Subject: Out of memory