Thread: [RFC] CLUSTER VERBOSE
Hi folks,

I figured I should start a brand new thread for this one, so here you go.

I need a verbose CLUSTER, i.e. one that would give me feedback and progress. Because CLUSTER is divided into two major operations (data reordering, index rebuild), I see it this way:

CLUSTER on I: <index name> T: <table name>, data reordering
CLUSTER on I: <index name> T: <table name>, index rebuild

and then:

CLUSTER 10%
CLUSTER 12%, etc.

(yeah, I know how hard it is to write a good progress indicator ...).

I don't have even the slightest doubt that it can be useful, just like "VACUUM VERBOSE" is, so no question about that. I am looking for comments and ideas. The patch would not be very intrusive: at the moment no one is using VERBOSE for CLUSTER, because it is not there, and nothing would change in this area.

I am looking for opinions on what information should be presented. Perhaps there is also a use elsewhere for some of the information it might gather (stats, etc.), but that's not really my point at the moment.

Thanks for all comments.

btw, I would really appreciate not being CCed on this; I have been subscribed here for years now (since 8.0 times).

ta.

--
Grzegorz Jaskiewicz
C/C++ freelance for hire
On 3/15/07, Grzegorz Jaskiewicz <gj@pointblue.com.pl> wrote:
> I figure - I should start brand new thread for this one - so here you
> go.
>
> I am in a need for verbose CLUSTER. Ie. one that would give me
> feedback and progress.
> Because CLUSTER is divided into two major operations, (data
> reordering, index rebuild) - I see it this way:
>
> CLUSTER on I: <index name> T: <table name>, data reordering
> CLUSTER on I: <index name> T: <table name>, index rebuild
>
> and than:
> CLUSTER 10%
> CLUSTER 12% , etc

Well, I'm afraid that would be inconsistent with other VERBOSE commands (VACUUM VERBOSE), which don't give any progress indication other than reporting that a specific stage has finished. I think if you want to add VERBOSE to CLUSTER, it should behave exactly like all other 'VERBOSE' commands.

As for progress indication, there have been proposals for more or less similar features, like:
http://archives.postgresql.org/pgsql-hackers/2006-07/msg00719.php

As I recall, the ideas which gained the most traction were indicating current progress via shared memory (pg_stat_activity) and a GUC variable which instructs the server to send notices indicating the progress status. The latter is harder.

I'm afraid creating such a feature 'just for CLUSTER' is not the greatest idea -- there are lots of other places where having a progress bar would be a great benefit. REINDEX, most ALTER TABLEs, CREATE INDEX, even long-running SELECTs, UPDATEs and DELETEs, not to mention VACUUM, would equally benefit from it.

I think you will have a hard time trying to push a CLUSTER-specific extension when there is a need for a more generic one.

Regards,
Dawid
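To make the "generic rather than CLUSTER-specific" point concrete, here is a minimal sketch of a command-agnostic progress tracker. It is illustrative Python, not PostgreSQL code: the class name `ProgressReporter`, the `notice_step_pct` setting (standing in for the GUC mentioned above), and the `snapshot()` method (standing in for a status row other sessions could read) are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Union


@dataclass
class ProgressReporter:
    """Hypothetical progress tracker usable by any long-running command."""

    command: str                         # e.g. "CLUSTER", "REINDEX", "VACUUM"
    total_units: int                     # e.g. number of heap pages to process
    notice_step_pct: int = 10            # assumed GUC-like setting; 0 disables notices
    emit: Callable[[str], None] = print  # where notices are sent
    done_units: int = 0
    _last_reported_pct: int = field(default=0, init=False)

    def percent(self) -> int:
        """Integer percentage of work completed so far."""
        if self.total_units == 0:
            return 100
        return self.done_units * 100 // self.total_units

    def advance(self, units: int = 1) -> None:
        """Record finished work and send a notice when a step boundary is crossed."""
        self.done_units = min(self.total_units, self.done_units + units)
        pct = self.percent()
        if self.notice_step_pct > 0 and pct - self._last_reported_pct >= self.notice_step_pct:
            # Round down to the step boundary; a large jump yields a single notice.
            self._last_reported_pct = pct - pct % self.notice_step_pct
            self.emit(f"{self.command} {self._last_reported_pct}%")

    def snapshot(self) -> Dict[str, Union[str, int]]:
        """State that another session could poll instead of receiving notices."""
        return {"command": self.command, "percent": self.percent()}


if __name__ == "__main__":
    reporter = ProgressReporter("CLUSTER", total_units=200, notice_step_pct=25)
    for _ in range(200):
        reporter.advance()
```

The same object serves both ideas from the earlier discussion: `emit` models pushing notices to the client, while `snapshot()` models a pollable status. How total work would be estimated for a SELECT or UPDATE is the hard part and is not addressed here.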
Grzegorz Jaskiewicz wrote:
> Because CLUSTER is divided into two major operations, (data reordering,
> index rebuild) - I see it this way:
>
> CLUSTER on I: <index name> T: <table name>, data reordering
> CLUSTER on I: <index name> T: <table name>, index rebuild

Something like that would be nice, to see how long each step takes, like VACUUM VERBOSE.

> and than:
> CLUSTER 10%
> CLUSTER 12% , etc

We don't have progress indicators for any other commands, and I don't see why we should add one for CLUSTER in particular. Sure, progress indicators are nice, but we should rather try to add some kind of general progress indicator support that would also cover SELECTs, for example. I know it's much harder, but also much more useful.

> I am looking for opinions, on what information should be presented.

What would be useful is some kind of metric of how (de)clustered the table was before CLUSTER, and the same dead vs. live row counts that VACUUM VERBOSE prints.

We don't really have a good metric for clusteredness, as has been discussed before, so if you can come up with a good one that would be useful in the planner as well, that would be great.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
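As a starting point for discussing a clusteredness metric, here is a minimal sketch of one simple candidate: the correlation between each row's physical position and the rank of its index key. It is similar in spirit to the correlation statistic exposed in pg_stats, and it is not claimed to be the "good" metric asked for above. The function name `clusteredness` and the input representation (index keys listed in heap order) are assumptions made for this example.

```python
from typing import Sequence


def clusteredness(keys_in_physical_order: Sequence[float]) -> float:
    """Correlation between physical row position and key rank.

    Returns 1.0 for a table stored exactly in key order, -1.0 for exactly
    reversed order, and values near 0.0 for randomly scattered rows.
    """
    n = len(keys_in_physical_order)
    if n < 2:
        return 1.0  # zero or one row is trivially in order

    # Rank of each row's key; sorted() is stable, so equal keys keep heap order.
    order = sorted(range(n), key=lambda i: keys_in_physical_order[i])
    ranks = [0] * n
    for rank, pos in enumerate(order):
        ranks[pos] = rank

    # Positions and ranks are both permutations of 0..n-1, so they share the
    # same mean and variance; the correlation reduces to covariance / variance.
    mean = (n - 1) / 2.0
    cov = sum((pos - mean) * (ranks[pos] - mean) for pos in range(n))
    var = sum((pos - mean) ** 2 for pos in range(n))
    return cov / var


if __name__ == "__main__":
    print(clusteredness([1, 2, 3, 4, 5]))   # 1.0
    print(clusteredness([5, 4, 3, 2, 1]))   # -1.0
```

A known weakness of a single correlation number is that it cannot distinguish a table made of a few large, individually sorted runs from one that is mildly shuffled throughout, although the two behave very differently under an index scan. That is one reason a better metric would still be worth looking for.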
On Mar 16, 2007, at 9:53 AM, Heikki Linnakangas wrote:
> Grzegorz Jaskiewicz wrote:
>> Because CLUSTER is divided into two major operations, (data
>> reordering, index rebuild) - I see it this way:
>> CLUSTER on I: <index name> T: <table name>, data reordering
>> CLUSTER on I: <index name> T: <table name>, index rebuild
>
> Something like that would be nice to see how long each step takes,
> like vacuum verbose.

yup.

>> I am looking for opinions, on what information should be presented.
>
> What would be useful is some kind of a metric of how (de)clustered
> the table was before CLUSTER, and the same # of dead vs. live row
> counts that vacuum verbose prints.

Is that information available in cluster.c at the moment? I am looking for some hints here. One of the reasons I decided to go with this patch is to learn something, and CLUSTER seems to touch the very 'bone' of postgres: the tuple system (just like vacuum) and indices. I would appreciate any hints.

> We don't really have a good metric for clusteredness, as have been
> discussed before, so if you can come up with a good one that would
> be useful in the planner as well, that would be great.

I really don't know where and how I should calculate such a parameter. Any hints?

thanks.

--
Grzegorz Jaskiewicz
C/C++ freelance for hire
Added to TODO for CLUSTER:

  o %Add VERBOSE option to report tables as they are processed, like VACUUM VERBOSE

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +