Re: tuning autovacuum - Mailing list pgsql-hackers

From Greg Smith
Subject Re: tuning autovacuum
Date
Msg-id 4DF13CBB.3020302@2ndQuadrant.com
Whole thread Raw
In response to Re: tuning autovacuum  (Bernd Helmle <mailings@oopsware.de>)
Responses Re: tuning autovacuum
List pgsql-hackers
On 06/09/2011 04:43 PM, Bernd Helmle wrote:
> I'd go further and expose the info or details issued by VACUUM VERBOSE 
> into the view, too, at least the number of pages visited and cleaned 
> (or dead but not yet cleaned). Customers are heavily interested in 
> these numbers and i've found pgfouine to provide those numbers very 
> useful.

Agreed there.  The fact that VACUUM VERBOSE reports them suggests 
they're not too terribly difficult to track either.

What we'd probably need to do with those is handle them like the other 
stats in the system:  store a total number for visited/cleaned/dead for 
each relation, then increment the total as each vacuum finishes.  That 
way, you could point a standard monitoring system at it and see trends.  
Just saving the last snapshot of data there isn't as useful.

I'm seeing these as being like the counters in pg_stat_bgwriter; while 
it's easy to think of VACUUM "what work happened?" data as info you just 
want the last snapshot of, a continuous incrementing counter can do that 
and a lot of other things too.  Anyone who is extracting useful data 
from pg_stat_bgwriter can use the same logic to track this data, even if 
it only moves forward in big chunks as vacuum completes.  And it may be 
feasible to update it in the middle, too.

Stepping into implementation for a second, the stats that are showing up 
in pg_stat_user_tables are being driven by a PgStat_MsgVacuum message 
coming out of the stats collector when it finishes.  While that's the 
obvious place to put some more stuff, that's not necessarily the right 
way to build a better monitoring infrastructure.  Two things to consider:

-It's not really aimed at being called multiple times for one operation 
("needs vacuum", "started vacuum", "finished vacuum"
-There is a mix of things that make sense as long-term counters and 
things that update as snapshots--the timestamps are the main thing there.

I haven't thought about it enough to have a real opinion on whether you 
can squeeze everything into the existing message by adding more fields, 
or if another type of message is necessary.  Just pointing out that it's 
not trivially obvious which approach is better.

What is unambiguous is that all this new data is really going to need a 
new view for it, pg_stat_vacuum or something like that.  The fields that 
are already in pg_stat_user_tables can stay there as deprecated for a 
while, but this all wants to be in its own new view.

This would really be a nice medium sized feature that DBAs would love, 
and it would help adoption on big sites.  I have some ideas on how to 
get some funding to develop it because I keep running into this, but if 
someone wants to run with the idea I'd be happy to just help instead.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Postmaster holding unlinked files for pg_largeobject table
Next
From: Noah Misch
Date:
Subject: Re: On-the-fly index tuple deletion vs. hot_standby