Thread: [RFC] CLUSTER VERBOSE

[RFC] CLUSTER VERBOSE

From
Grzegorz Jaskiewicz
Date:
Hi folks,

I figured I should start a brand new thread for this one, so here you
go.


I need a verbose CLUSTER, i.e. one that would give me
feedback and progress.
Because CLUSTER is divided into two major operations (data
reordering, index rebuild), I see it this way:

CLUSTER on I: <index name> T: <table name>, data reordering
CLUSTER on I: <index name> T: <table name>, index rebuild

and then:
CLUSTER 10%
CLUSTER 12% , etc

(yeah, I know how hard it is to write a good progress indicator...).

I don't have even a slight doubt that it would be useful, just like
"VACUUM VERBOSE" is, so no question about that.
I am looking for comments and ideas.
The patch would not be very intrusive: atm no one is using VERBOSE
for CLUSTER, because it is not there, so existing behaviour would
not change.
I am looking for opinions on what information should be presented.
Perhaps there's also a use for some information it might gather
elsewhere (stats, etc.), but that's not really my point atm.

Thanks for all comments.
btw, I would really appreciate not being CCed on this; I have been
subscribed here for yeaaars now (since 8.0 times).

ta.


-- 
Grzegorz Jaskiewicz

C/C++ freelance for hire







Re: [RFC] CLUSTER VERBOSE

From
"Dawid Kuroczko"
Date:
On 3/15/07, Grzegorz Jaskiewicz <gj@pointblue.com.pl> wrote:
> I figured I should start a brand new thread for this one, so here you
> go.
>
>
> I need a verbose CLUSTER, i.e. one that would give me
> feedback and progress.
> Because CLUSTER is divided into two major operations (data
> reordering, index rebuild), I see it this way:
>
> CLUSTER on I: <index name> T: <table name>, data reordering
> CLUSTER on I: <index name> T: <table name>, index rebuild
>
> and then:
> CLUSTER 10%
> CLUSTER 12% , etc

Well, I'm afraid that would be inconsistent with other VERBOSE
commands (VACUUM VERBOSE), which don't give any progress
indication other than reporting that a specific stage has finished.

I think if you want to add VERBOSE to cluster, it should behave
exactly like all other 'VERBOSE' commands.

And as for progress indication, there have been proposals for a more
or less similar feature, like:
http://archives.postgresql.org/pgsql-hackers/2006-07/msg00719.php

As I recall, the ideas which gained the most traction were
indicating current progress via shared memory (pg_stat_activity)
and a GUC variable which instructs the server to send notices
indicating the progress status. The latter is harder.

I'm afraid creating such a feature 'just for CLUSTER' is not the greatest
idea: there are lots of other places where having a progress bar would
be a great benefit.  REINDEX, most ALTER TABLEs, CREATE INDEX, even
long running SELECTs, UPDATEs and DELETEs, not to mention VACUUM,
would equally benefit from it.  I think you will have a hard time trying
to push a CLUSTER-specific extension when there is a need for a more
generic one.
  Regards,
     Dawid


Re: [RFC] CLUSTER VERBOSE

From
Heikki Linnakangas
Date:
Grzegorz Jaskiewicz wrote:
> Because CLUSTER is divided into two major operations (data reordering,
> index rebuild), I see it this way:
> 
> CLUSTER on I: <index name> T: <table name>, data reordering
> CLUSTER on I: <index name> T: <table name>, index rebuild

Something like that would be nice, to see how long each step takes, like
VACUUM VERBOSE.
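
As a rough illustration of per-step reporting with timings, here is a
minimal Python sketch. It is not how cluster.c works: the function name
run_verbose, its parameters, and the "finished in" line are hypothetical,
and only the "CLUSTER on I: ... T: ..." message format is taken from the
proposal above.

```python
import time


def run_verbose(index_name, table_name, phases, report=print):
    """Run each (phase_name, work) pair in order and report its duration.

    Hypothetical helper for illustration only; `work` is any callable
    that performs the phase (e.g. data reordering or index rebuild).
    """
    timings = []
    for phase_name, work in phases:
        # Announce the phase using the message format from the proposal.
        report(f"CLUSTER on I: {index_name} T: {table_name}, {phase_name}")
        start = time.perf_counter()
        work()
        elapsed = time.perf_counter() - start
        # Report completion of the stage, as VACUUM VERBOSE does per stage.
        report(f"  {phase_name} finished in {elapsed:.2f} s")
        timings.append((phase_name, elapsed))
    return timings
```

Passing a list's append method as `report` collects the messages instead
of printing them, which makes the ordering of the output easy to inspect.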

> and then:
> CLUSTER 10%
> CLUSTER 12% , etc

We don't have progress indicators for any other commands, and I don't 
see why we should add one for cluster in particular. Sure, progress 
indicators are nice, but we should rather try to add some kind of a 
general progress indicator support that would support SELECTs for 
example. I know it's much harder, but also much more useful.

> I am looking for opinions, on what information should be presented.

What would be useful is some kind of metric of how (de)clustered the
table was before CLUSTER, and the same dead vs. live row counts
that VACUUM VERBOSE prints.

We don't really have a good metric for clusteredness, as has been
discussed before, so if you can come up with a good one that would be
useful in the planner as well, that would be great.
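
To make the idea concrete, one possible (and certainly not definitive)
candidate is the correlation between the physical position of each row and
the rank of its index key, similar in spirit to the correlation column in
pg_stats. The sketch below is an assumption for illustration only: the
function name clusteredness and its list-based input are hypothetical, and
it says nothing about whether such a metric is good enough for the planner.

```python
def clusteredness(keys_in_physical_order):
    """Correlation between physical position and key rank, in [-1, 1].

    1.0 means the rows are stored in exactly ascending key order,
    -1.0 means exactly descending order, values near 0 mean the
    physical order tells us little about the key order.
    """
    n = len(keys_in_physical_order)
    if n < 2:
        return 1.0  # zero or one row is trivially "in order"
    # Rank of each row's key; sorted() is stable, so equal keys are
    # ranked by their physical position.
    order = sorted(range(n), key=lambda i: keys_in_physical_order[i])
    ranks = [0] * n
    for rank, pos in enumerate(order):
        ranks[pos] = rank
    # Positions and ranks are both permutations of 0..n-1, so they share
    # the same mean and variance; Pearson correlation reduces to cov/var.
    mean = (n - 1) / 2.0
    cov = sum((pos - mean) * (ranks[pos] - mean) for pos in range(n))
    var = sum((pos - mean) ** 2 for pos in range(n))
    return cov / var
```

For example, clusteredness([1, 2, 3, 4]) gives 1.0 and
clusteredness([4, 3, 2, 1]) gives -1.0.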

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: [RFC] CLUSTER VERBOSE

From
Grzegorz Jaskiewicz
Date:
On Mar 16, 2007, at 9:53 AM, Heikki Linnakangas wrote:

> Grzegorz Jaskiewicz wrote:
>> Because CLUSTER is divided into two major operations (data
>> reordering, index rebuild), I see it this way:
>> CLUSTER on I: <index name> T: <table name>, data reordering
>> CLUSTER on I: <index name> T: <table name>, index rebuild
>
> Something like that would be nice, to see how long each step takes,
> like VACUUM VERBOSE.
yup.


>> I am looking for opinions, on what information should be presented.
>
> What would be useful is some kind of metric of how (de)clustered
> the table was before CLUSTER, and the same dead vs. live row
> counts that VACUUM VERBOSE prints.
Is that information available in cluster.c atm? I am looking for
some hints here. One of the reasons I decided to go with this patch
is to learn something, and CLUSTER seems to touch the very 'bone'
of postgres: the tuple system (just like VACUUM) and indices.
I would appreciate any hints.

> We don't really have a good metric for clusteredness, as has been
> discussed before, so if you can come up with a good one that would
> be useful in the planner as well, that would be great.


I really don't know where and how I should calculate such a parameter.
Any hints?

thanks.

-- 
Grzegorz Jaskiewicz

C/C++ freelance for hire







Re: [RFC] CLUSTER VERBOSE

From
Bruce Momjian
Date:
Added to TODO for CLUSTER:
       o %Add VERBOSE option to report tables as they are processed,
         like VACUUM VERBOSE



-- 
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +