Thread: [PATCH] Statistics collection for CLUSTER command

[PATCH] Statistics collection for CLUSTER command

From

Vik Fearing

Date:

08 August 2013, 11:52:56

As part of routine maintenance monitoring, it is interesting for us to
have statistics on the CLUSTER command (timestamp of last run, and
number of runs since stat reset) like we have for (auto)ANALYZE and
(auto)VACUUM.  Patch against today's HEAD attached.

I would add this to the next commitfest but I seem to be unable to log
in with my community account (I can log in to the wiki).  Help appreciated.

Attachment

clusterstats.patch

Re: [PATCH] Statistics collection for CLUSTER command

From

Fabien COELHO

Date:

08 August 2013, 12:26:05

> As part of routine maintenance monitoring, it is interesting for us to
> have statistics on the CLUSTER command (timestamp of last run, and
> number of runs since stat reset) like we have for (auto)ANALYZE and
> (auto)VACUUM.  Patch against today's HEAD attached.
>
> I would add this to the next commitfest but I seem to be unable to log
> in with my community account (I can log in to the wiki).  Help appreciated.

Done.

-- 
Fabien.

Re: [PATCH] Statistics collection for CLUSTER command

From

Stefan Kaltenbrunner

Date:

08 August 2013, 17:57:44

On 08/08/2013 01:52 PM, Vik Fearing wrote:
> As part of routine maintenance monitoring, it is interesting for us to
> have statistics on the CLUSTER command (timestamp of last run, and
> number of runs since stat reset) like we have for (auto)ANALYZE and
> (auto)VACUUM.  Patch against today's HEAD attached.
> 
> I would add this to the next commitfest but I seem to be unable to log
> in with my community account (I can log in to the wiki).  Help appreciated.

would be a bit easier to diagnose if we knew your community account name ;)


Stefan

Re: [PATCH] Statistics collection for CLUSTER command

From

Vik Fearing

Date:

08 August 2013, 22:02:23

On 08/08/2013 07:57 PM, Stefan Kaltenbrunner wrote:
> On 08/08/2013 01:52 PM, Vik Fearing wrote:
>> I would add this to the next commitfest but I seem to be unable to log
>> in with my community account (I can log in to the wiki).  Help appreciated.
> would be a bit easier to diagnose if we knew your community account name ;)

Sorry, it's "glaucous".

Vik

Re: [PATCH] Statistics collection for CLUSTER command

From

Vik Fearing

Date:

09 August 2013, 07:44:35

On 08/08/2013 02:26 PM, Fabien COELHO wrote:
>
>> As part of routine maintenance monitoring, it is interesting for us to
>> have statistics on the CLUSTER command (timestamp of last run, and
>> number of runs since stat reset) like we have for (auto)ANALYZE and
>> (auto)VACUUM.  Patch against today's HEAD attached.
>>
>> I would add this to the next commitfest but I seem to be unable to log
>> in with my community account (I can log in to the wiki).  Help
>> appreciated.
>
> Done.
>

Thank you, but it seems you've duplicated the title from the other patch
(and thanks for adding that one, too!).

https://commitfest.postgresql.org/action/patch_view?id=1190

Vik

Re: [PATCH] Statistics collection for CLUSTER command

From

Fabien COELHO

Date:

09 August 2013, 08:04:23

> Thank you, but it seems you've duplicated the title from the other patch
> (and thanks for adding that one, too!).

Indeed, possibly a wrong copy paste. Fixed.

-- 
Fabien.

Re: [PATCH] Statistics collection for CLUSTER command

From

Stefan Kaltenbrunner

Date:

09 August 2013, 20:37:32

On 08/09/2013 12:02 AM, Vik Fearing wrote:
> On 08/08/2013 07:57 PM, Stefan Kaltenbrunner wrote:
> 
>> On 08/08/2013 01:52 PM, Vik Fearing wrote:
>>> I would add this to the next commitfest but I seem to be unable to log
>>> in with my community account (I can log in to the wiki).  Help appreciated.
>> would be a bit easier to diagnose if we knew your community account name ;)
> 
> Sorry, it's "glaucous".

hmm, looks like your account may be affected by one of the buglets
introduced (and fixed shortly afterwards) by the upgrade of the main
infrastructure to debian wheezy - please try logging in to the main website
and changing your password at least once. That should make it work again
for the commitfest app...

Stefan

Re: [PATCH] Statistics collection for CLUSTER command

From

Vik Fearing

Date:

09 August 2013, 22:50:42

On 08/09/2013 10:37 PM, Stefan Kaltenbrunner wrote:
>>> On 08/08/2013 01:52 PM, Vik Fearing wrote:
>>>> I would add this to the next commitfest but I seem to be unable to log
>>>> in with my community account (I can log in to the wiki).  Help appreciated.
> hmm, looks like your account may be affected by one of the buglets
> introduced (and fixed shortly afterwards) by the upgrade of the main
> infrastructure to debian wheezy - please try logging in to the main website
> and changing your password at least once. That should make it work again
> for the commitfest app...

That worked.  Thank you.

Vik

Re: [PATCH] Statistics collection for CLUSTER command

From

Satoshi Nagayasu

Date:

16 September 2013, 06:26:14

(2013/08/08 20:52), Vik Fearing wrote:
> As part of routine maintenance monitoring, it is interesting for us to
> have statistics on the CLUSTER command (timestamp of last run, and
> number of runs since stat reset) like we have for (auto)ANALYZE and
> (auto)VACUUM.  Patch against today's HEAD attached.
>
> I would add this to the next commitfest but I seem to be unable to log
> in with my community account (I can log in to the wiki).  Help appreciated.

I have reviewed the patch.

It built successfully against the latest HEAD and passed the regression
tests.

It looks good enough. I'd like to add a test case here that checks not
only the view definition but also that the statistics work correctly.

Please take a look at attached one.

Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp

Attachment

clusterstats_regress.patch

Re: [PATCH] Statistics collection for CLUSTER command

From

Vik Fearing

Date:

15 October 2013, 11:53:36

On 09/16/2013 08:26 AM, Satoshi Nagayasu wrote:
> (2013/08/08 20:52), Vik Fearing wrote:
>> As part of routine maintenance monitoring, it is interesting for us to
>> have statistics on the CLUSTER command (timestamp of last run, and
>> number of runs since stat reset) like we have for (auto)ANALYZE and
>> (auto)VACUUM.  Patch against today's HEAD attached.
>>
>> I would add this to the next commitfest but I seem to be unable to log
>> in with my community account (I can log in to the wiki).  Help
>> appreciated.
>
> I have reviewed the patch.

Thank you for your review.

> It built successfully against the latest HEAD and passed the regression
> tests.
>
> It looks good enough. I'd like to add a test case here that checks not
> only the view definition but also that the statistics work correctly.
>
> Please take a look at attached one.

Looks good to me.  Attached is a rebased patch with those tests added.

--
Vik

Attachment

clusterstats.v2.patch

Re: [PATCH] Statistics collection for CLUSTER command

From

Noah Misch

Date:

20 October 2013, 05:37:11

> > (2013/08/08 20:52), Vik Fearing wrote:
> >> As part of routine maintenance monitoring, it is interesting for us to
> >> have statistics on the CLUSTER command (timestamp of last run, and
> >> number of runs since stat reset) like we have for (auto)ANALYZE and
> >> (auto)VACUUM.  Patch against today's HEAD attached.

Adding new fields to PgStat_StatTabEntry imposes a substantial distributed
cost, because every database stats file write-out grows by the width of those
fields times the number of tables in the database.  Associated costs have been
and continue to be a pain point with large table counts:

http://www.postgresql.org/message-id/flat/1718942738eb65c8407fcd864883f4c8@fuzzy.cz
http://www.postgresql.org/message-id/flat/52268887.9010509@uptime.jp

In that light, I can't justify widening PgStat_StatTabEntry by 9.5% for this.
I recommend satisfying this monitoring need in your application by creating a
cluster_table wrapper function that issues CLUSTER and then updates statistics
you store in an ordinary table.  Issue all routine CLUSTERs by way of that
wrapper function.  A backend change that would help here is to extend event
triggers to cover the CLUSTER command, permitting you to inject monitoring
after plain CLUSTER and dispense with the wrapper.
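
A minimal sketch of the wrapper approach described above, written
client-side in Python rather than as a server-side function. None of the
following comes from the thread and all of it is assumed for illustration:
a DB-API connection whose driver uses `%s` placeholders (as psycopg2 does),
a hypothetical bookkeeping table called cluster_stats, and plain lowercase
table names only.

```python
import re
from datetime import datetime, timezone

# Hypothetical bookkeeping table; its name and columns are assumptions.
# Run this DDL once before using cluster_table().
STATS_DDL = """
CREATE TABLE IF NOT EXISTS cluster_stats (
    table_name    text PRIMARY KEY,
    last_cluster  timestamptz NOT NULL,
    cluster_count bigint NOT NULL
)
"""

# Only plain lowercase identifiers are accepted, so quoting them
# does not change how the server resolves the name.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def quote_ident(name):
    """Quote a plain, optionally schema-qualified name; reject anything else."""
    parts = name.split(".")
    if not 1 <= len(parts) <= 2 or not all(_IDENT.match(p) for p in parts):
        raise ValueError("unsupported table name: %r" % (name,))
    return ".".join('"%s"' % p for p in parts)


def cluster_table(conn, table_name):
    """Issue CLUSTER on one table, then record the run in cluster_stats."""
    target = quote_ident(table_name)  # validate before touching the database
    cur = conn.cursor()
    cur.execute("CLUSTER " + target)
    now = datetime.now(timezone.utc)
    # Bump the counter if the table is already tracked...
    cur.execute(
        "UPDATE cluster_stats SET last_cluster = %s, "
        "cluster_count = cluster_count + 1 WHERE table_name = %s",
        (now, table_name),
    )
    # ...otherwise create its row with a count of one.
    if cur.rowcount == 0:
        cur.execute(
            "INSERT INTO cluster_stats (table_name, last_cluster, cluster_count) "
            "VALUES (%s, %s, 1)",
            (table_name, now),
        )
    conn.commit()
```

Routine maintenance would then call cluster_table(conn, "public.orders")
instead of issuing CLUSTER directly, and monitoring would read
cluster_stats like any ordinary table.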

Thanks,
nm

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com

Re: [PATCH] Statistics collection for CLUSTER command

From

Dimitri Fontaine

Date:

20 October 2013, 20:04:56

Noah Misch <noah@leadboat.com> writes:
> wrapper function.  A backend change that would help here is to extend event
> triggers to cover the CLUSTER command, permitting you to inject monitoring
> after plain CLUSTER and dispense with the wrapper.

I haven't looked at this in any detail, but it might be as simple as
moving the T_ClusterStmt case from standard_ProcessUtility() down to the
Event Trigger friendly part known as ProcessUtilitySlow().

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: [PATCH] Statistics collection for CLUSTER command

From

Robert Haas

Date:

22 October 2013, 16:36:23

On Sun, Oct 20, 2013 at 1:37 AM, Noah Misch <noah@leadboat.com> wrote:
>> > (2013/08/08 20:52), Vik Fearing wrote:
>> >> As part of routine maintenance monitoring, it is interesting for us to
>> >> have statistics on the CLUSTER command (timestamp of last run, and
>> >> number of runs since stat reset) like we have for (auto)ANALYZE and
>> >> (auto)VACUUM.  Patch against today's HEAD attached.
>
> Adding new fields to PgStat_StatTabEntry imposes a substantial distributed
> cost, because every database stats file write-out grows by the width of those
> fields times the number of tables in the database.  Associated costs have been
> and continue to be a pain point with large table counts:
>
> http://www.postgresql.org/message-id/flat/1718942738eb65c8407fcd864883f4c8@fuzzy.cz
> http://www.postgresql.org/message-id/flat/52268887.9010509@uptime.jp
>
> In that light, I can't justify widening PgStat_StatTabEntry by 9.5% for this.
> I recommend satisfying this monitoring need in your application by creating a
> cluster_table wrapper function that issues CLUSTER and then updates statistics
> you store in an ordinary table.  Issue all routine CLUSTERs by way of that
> wrapper function.  A backend change that would help here is to extend event
> triggers to cover the CLUSTER command, permitting you to inject monitoring
> after plain CLUSTER and dispense with the wrapper.

I unfortunately have to agree with this, but I think it points to the
need for further work on the pgstat infrastructure.  We used to have
one file; now we have one per database.  That's better for people with
lots of databases, but many people just have one big database.  We
need a solution here that relieves the pain for those people.

(I can't help thinking that the root of the problem here is that we're
rewriting the whole file, and that any solution that doesn't somehow
facilitate updates of individual records will be only a small
improvement.)
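
To put rough numbers on the cost quoted above (field width times number
of tables, on every write-out), here is a small arithmetic sketch. The
figures are assumptions, not taken from the patch: two added 8-byte fields
(one timestamp, one counter) and an existing entry size of 168 bytes,
chosen because it is consistent with the 9.5% figure.

```python
def stats_growth(num_tables, added_bytes, entry_bytes):
    """Return (extra bytes per stats-file write-out, relative growth per entry)."""
    extra = added_bytes * num_tables
    return extra, added_bytes / entry_bytes


# Assumed figures: 100,000 tables, 16 added bytes, 168-byte existing entry.
extra, ratio = stats_growth(num_tables=100_000, added_bytes=16, entry_bytes=168)
print(extra, round(ratio * 100, 1))  # prints: 1600000 9.5
```

Under those assumptions a database with 100,000 tables writes about
1.6 MB more each time its stats file is rewritten, whether or not any
table was ever clustered.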

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company