Re: questions about CLUSTER - Mailing list pgsql-performance
From | Douglas J Hunley |
---|---|
Subject | Re: questions about CLUSTER |
Date | |
Msg-id | 200802271335.16397.doug@hunley.homeip.net Whole thread Raw |
In response to | Re: questions about CLUSTER (Bill Moran <wmoran@collaborativefusion.com>) |
Responses |
Re: questions about CLUSTER
|
List | pgsql-performance |
On Wednesday 27 February 2008 12:40:57 Bill Moran wrote: > In response to Douglas J Hunley <doug@hunley.homeip.net>: > > After reviewing > > http://www.postgresql.org/docs/8.3/static/sql-cluster.html a couple of > > times, I have some questions: > > 1) it says to run analyze after doing a cluster. i'm assuming autovacuum > > will take care of this? or should i go ahead and do the analyze 'now' > > instead of waiting? > It's always a good idea to analyze after major DB operations. Autovacuum > only runs so often. Also, it won't hurt anything, so why risk not doing > it? being overly-cautious. i was concerned about both autovac and me doing analyzes over each other > > > 2) is there any internal data in the db that would allow me to > > programmatically determine which tables would benefit from being > > clustered? 3) for that matter, is there info to allow me to determine > > which index it should be clustered on in cases where the table has more > > than one index? > > The pg_stat_user_indexes table keeps stats on how often the index is used. > Indexes that are used frequently are candidates for clustering. I had just started looking at this actually. > > > 4) for tables with >1 indexes, does clustering on one index negatively > > impact queries that use the other indexes? > > Not likely. Clustering only really helps performance if you have an index > that is used to gather ranges of data. For example, if you frequently > do things like SELECT * FROM log WHERE logdate > 'somedate" and < > 'somedate, you might benefit from clustering on logdate. > > But it doesn't really do much if you're only ever pulling one record at a > time. It's the kind of thing that you really need to experiment with to > understand whether it will have a worthwhile performance impact on your > data and your workload. I doubt if there's any pat answer. makes sense. > > > 5) is it better to cluster on a compound index (index on > > lastnamefirstname) or on the underlying index (index on lastname)? > > If cluster helps you at all, it's going to help if you have an index that's > frequently used to fetch ranges of data. Whether that index is compound or > not isn't likely to factor in. understood. i didn't really think it would matter, but its easier to ask than to screw up performance for existing customers :) -- Douglas J Hunley (doug at hunley.homeip.net) - Linux User #174778 http://doug.hunley.homeip.net If a turtle doesn't have a shell, is he homeless or naked?
pgsql-performance by date: