On Sat, Jan 21, 2017 at 1:57 PM, Stephen Frost <sfrost@snowman.net> wrote:
All, * Simon Riggs (simon@2ndquadrant.com) wrote: > On 12 August 2016 at 01:01, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Michael Paquier <michael.paquier@gmail.com> writes: > >> In short, autovacuum will need to scan by itself the VM of each > >> relation and decide based on that. > > > > That seems like a worthwhile approach to pursue. The VM is supposed to be > > small, and if you're worried it isn't, you could sample a few pages of it. > > I do not think any of the ideas proposed so far for tracking the > > visibility percentage on-the-fly are very tenable. > > Sounds good, but we can't scan the VM for every table, every minute. > We need to record something that will tell us how many VM bits have > been cleared, which will then allow autovac to do a simple SELECT to > decide what needs vacuuming. > > Vik's proposal to keep track of the rows inserted seems like the best > approach to this issue.
I tend to agree with Simon on this. I'm also worried that an approach which was based off of a metric like "% of table not all-visible" might result in VACUUM running over and over on a table because it isn't able to actually make any progress towards improving that percentage. We'd have to have some kind of "cool-off" period or something.
This is why I didn't want a global guc for it but only a relopt. I figured I would set it only for tables which I have good reason to know would benefit, because I know that they are both candidates for beneficial IOS, and because it is possible to keep them mostly all-visible with a reasonable amount of vacuum work. I think that this is a small but important subset of tables (currently, I personally have zero such tables, but that could increase if covering indexes get implemented). It would be nice to have settings that users of all experience level would understand (or no settings at all), but I don't think that it is feasible to have that without compromising the basic functionality.
Tracking INSERTs and then kicking off a VACUUM based on them seems to address that in a natural way and also seems like something that users would generally understand as it's very similar to what we do for UPDATEs and DELETEs.
If we do an INSERT based count with an extra knob to control how that gets weighted when added to the vacuum threshold function, then I could use that knob to micromanage to my satisfaction. If there is no knob added, then I think that I and many other people are going to see their vacuum workload skyrocket for no benefit and with no recourse, other than disabling autovac.