Re: questions about CLUSTER - Mailing list pgsql-performance

From Bill Moran
Subject Re: questions about CLUSTER
Date
Msg-id 20080227143449.adaea949.wmoran@collaborativefusion.com
Whole thread Raw
In response to Re: questions about CLUSTER  (Douglas J Hunley <doug@hunley.homeip.net>)
List pgsql-performance
In response to Douglas J Hunley <doug@hunley.homeip.net>:

> On Wednesday 27 February 2008 13:35:16 Douglas J Hunley wrote:
> > > > 2) is there any internal data in the db that would allow me to
> > > > programmatically determine which tables would benefit from being
> > > > clustered? 3) for that matter, is there info to allow me to determine
> > > > which index it should be clustered on in cases where the table has more
> > > > than one index?
> > >
> > > The pg_stat_user_indexes table keeps stats on how often the index is
> > > used. Indexes that are used frequently are candidates for clustering.
> >
> > I had just started looking at this actually.
>
> ok, so for a follow-on, should I be more concerned with idx_scan,
> idx_tup_read, or idx_tup_fetch when determining which indexes are 'good'
> candidates?

Again, not an easy question to answer, as it's probably different for
different people.

idx_scan is the count of how many times the index was used.
idx_tup_read and idx_tup_fetch are counts of how much data has been
accessed by using the index.

This part of the docs has more:
http://www.postgresql.org/docs/8.2/static/monitoring-stats.html

So, you'll probably have to watch all of those if you want to determine
when to automate clustering operations.

Personally, if I were you, I'd set up a test box and make sure
clustering makes enough of a difference to be doing all of this work.

> again, tia. i feel like such a noob around here :)

Bah ... we all start out as noobs.  Just don't go googling for my posts
from years back, it's embarrassing ...

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

pgsql-performance by date:

Previous
From: Jeff Davis
Date:
Subject: Re: disabling an index without deleting it?
Next
From: "Dan Kaplan"
Date:
Subject: Optimizing t1.col like '%t2.col%'