Re: [PATCH] Statistics collection for CLUSTER command - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: [PATCH] Statistics collection for CLUSTER command
Date	October 22, 2013 19:36:23
Msg-id	CA+TgmoZ0hw=Z5ihJ7Onhmy1trk-+ZM2zgKernHoHmAH=cjmeOQ@mail.gmail.com
In response to	Re: [PATCH] Statistics collection for CLUSTER command (Noah Misch <noah@leadboat.com>)
List	pgsql-hackers

On Sun, Oct 20, 2013 at 1:37 AM, Noah Misch <noah@leadboat.com> wrote:
>> > (2013/08/08 20:52), Vik Fearing wrote:
>> >> As part of routine maintenance monitoring, it is interesting for us to
>> >> have statistics on the CLUSTER command (timestamp of last run, and
>> >> number of runs since stat reset) like we have for (auto)ANALYZE and
>> >> (auto)VACUUM.  Patch against today's HEAD attached.
>
> Adding new fields to PgStat_StatTabEntry imposes a substantial distributed
> cost, because every database stats file write-out grows by the width of those
> fields times the number of tables in the database.  Associated costs have been
> and continue to be a pain point with large table counts:
>
> http://www.postgresql.org/message-id/flat/1718942738eb65c8407fcd864883f4c8@fuzzy.cz
> http://www.postgresql.org/message-id/flat/52268887.9010509@uptime.jp
>
> In that light, I can't justify widening PgStat_StatTabEntry by 9.5% for this.
> I recommend satisfying this monitoring need in your application by creating a
> cluster_table wrapper function that issues CLUSTER and then updates statistics
> you store in an ordinary table.  Issue all routine CLUSTERs by way of that
> wrapper function.  A backend change that would help here is to extend event
> triggers to cover the CLUSTER command, permitting you to inject monitoring
> after plain CLUSTER and dispense with the wrapper.

I unfortunately have to agree with this, but I think it points to the
need for further work on the pgstat infrastructure.  We used to have
one file; now we have one per database.  That's better for people with
lots of databases, but many people just have one big database.  We
need a solution here that relieves the pain for those people.

(I can't help thinking that the root of the problem here is that we're
rewriting the whole file, and that any solution that doesn't somehow
facilitate updates of individual records will be only a small
improvement.)
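
As a rough illustration of the parenthetical above, here is a minimal Python
sketch (not pgstat code) contrasting a whole-file rewrite with an in-place
update of a single fixed-width record.  The record layout, the function names,
and the one-slot-per-table file format are all assumptions made for the
example; the real stats file is not laid out this way.

```python
import os
import struct
import tempfile

# Hypothetical fixed-width record: table id, CLUSTER count, last-CLUSTER time.
# "<" disables padding, so the record is 4 + 8 + 8 = 20 bytes.
RECORD = struct.Struct("<IQq")


def rewrite_whole_file(path, entries):
    """Write every record to a temp file and swap it in; return bytes written."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        for rec in entries:
            f.write(RECORD.pack(*rec))
    os.replace(tmp, path)
    return len(entries) * RECORD.size


def update_one_record(path, slot, rec):
    """Overwrite only the record stored at index `slot`; return bytes written."""
    with open(path, "r+b") as f:
        f.seek(slot * RECORD.size)
        f.write(RECORD.pack(*rec))
    return RECORD.size


def read_record(path, slot):
    """Read back the record stored at index `slot`."""
    with open(path, "rb") as f:
        f.seek(slot * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))


def demo(n_tables=10000):
    """Return (bytes written by a full rewrite, bytes written by one update)."""
    entries = [(i, 0, 0) for i in range(n_tables)]
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "stats.bin")
        full = rewrite_whole_file(path, entries)
        # Arbitrary example values for table 42: one run, some timestamp.
        one = update_one_record(path, 42, (42, 1, 1382470583))
        assert read_record(path, 42) == (42, 1, 1382470583)
        return full, one
```

With 10,000 tables and the assumed 20-byte record, demo() returns
(200000, 20): the full rewrite writes 200,000 bytes, the in-place update 20.
The rewrite cost scales with the table count; the single-record update does
not.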

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
