Re: [PATCH] Statistics collection for CLUSTER command - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [PATCH] Statistics collection for CLUSTER command
Msg-id CA+TgmoZ0hw=Z5ihJ7Onhmy1trk-+ZM2zgKernHoHmAH=cjmeOQ@mail.gmail.com
In response to Re: [PATCH] Statistics collection for CLUSTER command  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On Sun, Oct 20, 2013 at 1:37 AM, Noah Misch <noah@leadboat.com> wrote:
>> > (2013/08/08 20:52), Vik Fearing wrote:
>> >> As part of routine maintenance monitoring, it is interesting for us to
>> >> have statistics on the CLUSTER command (timestamp of last run, and
>> >> number of runs since stat reset) like we have for (auto)ANALYZE and
>> >> (auto)VACUUM.  Patch against today's HEAD attached.
>
> Adding new fields to PgStat_StatTabEntry imposes a substantial distributed
> cost, because every database stats file write-out grows by the width of those
> fields times the number of tables in the database.  Associated costs have been
> and continue to be a pain point with large table counts:
>
> http://www.postgresql.org/message-id/flat/1718942738eb65c8407fcd864883f4c8@fuzzy.cz
> http://www.postgresql.org/message-id/flat/52268887.9010509@uptime.jp
>
> In that light, I can't justify widening PgStat_StatTabEntry by 9.5% for this.
> I recommend satisfying this monitoring need in your application by creating a
> cluster_table wrapper function that issues CLUSTER and then updates statistics
> you store in an ordinary table.  Issue all routine CLUSTERs by way of that
> wrapper function.  A backend change that would help here is to extend event
> triggers to cover the CLUSTER command, permitting you to inject monitoring
> after plain CLUSTER and dispense with the wrapper.

I unfortunately have to agree with this, but I think it points to the
need for further work on the pgstat infrastructure.  We used to have
one file; now we have one per database.  That's better for people with
lots of databases, but many people just have one big database.  We
need a solution here that relieves the pain for those people.

(I can't help thinking that the root of the problem here is that we're
rewriting the whole file, and that any solution that doesn't somehow
facilitate updates of individual records will be only a small
improvement.)
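
To make the whole-file versus per-record distinction concrete, here is a
minimal sketch in Python. It is not how pgstat works and not a proposed
design: the 20-byte record layout (table id, run count, last-run timestamp)
and the names RECORD, write_all, update_one, read_one and demo are all
hypothetical, chosen only for illustration. It shows that rewriting the file
costs bytes proportional to the number of tables, while an in-place update of
a fixed-width record costs one record no matter how large the file is.

```python
import os
import struct
import tempfile

# Hypothetical fixed-width record: table id, run count, last-run timestamp.
# "<" disables padding, so the size is 4 + 8 + 8 = 20 bytes.
RECORD = struct.Struct("<IQd")


def write_all(path, records):
    """Rewrite the whole file; bytes written grow with the record count."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(RECORD.pack(*rec))
    return len(records) * RECORD.size


def update_one(path, index, rec):
    """Overwrite a single record in place; bytes written are one record."""
    with open(path, "r+b") as f:
        f.seek(index * RECORD.size)
        f.write(RECORD.pack(*rec))
    return RECORD.size


def read_one(path, index):
    """Read back a single record by its position in the file."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))


def demo(n_tables=10000):
    """Return (bytes for full rewrite, bytes for one update, two records)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        records = [(i, 0, 0.0) for i in range(n_tables)]
        full = write_all(path, records)
        single = update_one(path, 42, (42, 1, 1382247420.0))
        return full, single, read_one(path, 42), read_one(path, 43)
    finally:
        os.remove(path)
```

A real solution would also have to deal with variable numbers of tables,
dropped tables, concurrent readers and crash safety, none of which this
sketch attempts.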

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


