Re: Berserk Autovacuum (let's save next Mandrill) - Mailing list pgsql-hackers

From Laurenz Albe
Subject Re: Berserk Autovacuum (let's save next Mandrill)
Date
Msg-id 91c110aad4b8653d6433d42fd707bd16074c79a5.camel@cybertec.at
Whole thread Raw
In response to Re: Berserk Autovacuum (let's save next Mandrill)  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Berserk Autovacuum (let's save next Mandrill)
Re: Berserk Autovacuum (let's save next Mandrill)
Re: Berserk Autovacuum (let's save next Mandrill)
List pgsql-hackers
Thanks, Justin, for the review.
I have applied the changes where still applicable.

On Fri, 2020-03-06 at 10:52 +1300, David Rowley wrote:
> On Fri, 6 Mar 2020 at 03:27, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > On Thu, 2020-03-05 at 19:40 +1300, David Rowley wrote:
> > > 1. I'd go for 2 new GUCs and reloptions.
> > > autovacuum_vacuum_insert_threshold (you're currently calling this
> > > autovacuum_vacuum_insert_limit. I don't see why the word "limit" is
> > > relevant here). The other GUC I think should be named
> > > autovacuum_vacuum_insert_scale_factor and these should work exactly
> > > the same way as autovacuum_vacuum_threshold and
> > > autovacuum_vacuum_scale_factor, but be applied in a similar way to the
> > > vacuum settings, but only be applied after we've checked to ensure the
> > > table is not otherwise eligible to be vacuumed.
> > 
> > I disagree about the scale_factor (and have not added it to the
> > updated version of the patch).  If we have a scale_factor, then the
> > time between successive autovacuum runs would increase as the table
> > gets bigger, which defeats the purpose of reducing the impact of each
> > autovacuum run.
> 
> My view here is not really to debate what logically makes the most
> sense.  I don't really think for a minute that the current
> auto-vacuums scale_factor and thresholds are perfect for the job. It's
> true that the larger a table becomes, the less often it'll be
> vacuumed, but these are control knobs that people have become
> accustomed to and I don't really think that making an exception for
> this is warranted.  Perhaps we can zero out the scale factor by
> default and set the threshold into the millions of tuples. We can have
> people chime in on what they think about that and why once the code is
> written and even perhaps committed.

Ok, I submit.  My main desire was to keep the number of new GUCs as
low as reasonably possible, but making the feature tunable along the
known and "trusted" lines may be a good thing.

The new parameter is called "autovacuum_vacuum_insert_scale_factor".

> Lack of a scale_factor does leave people who regularly truncate their
> "append-only" tables out in the cold a bit.  Perhaps they'd like
> index-only scans to kick in soon after they truncate without having to
> wait for 10 million tuples, or so.

That point I don't see.
Truncating a table resets the counters to 0.

> > > 10. I'm slightly worried about the case where we don't quite trigger a
> > > normal vacuum but trigger a vacuum due to INSERTs then skip cleaning
> > > up the indexes but proceed to leave dead index entries causing indexes
> > > to become bloated.  It does not seem impossible that given the right
> > > balance of INSERTs and UPDATE/DELETEs that this could happen every
> > > time and the indexes would just become larger and larger.
> > 
> > Perhaps we can take care of the problem by *not* skipping index
> > cleanup if "changes_since_analyze" is substantially greater than 0.
> > 
> > What do you think?
> 
> Well, there is code that skips the index scans when there are 0 dead
> tuples found in the heap. If the table is truly INSERT-only then it
> won't do any harm since we'll skip the index scan anyway.  I think
> it's less risky to clean the indexes. If we skip that then there will
> be a group of people will suffer from index bloat due to this, no
> matter if they realise it or not.

Oh I didn't know that.

In that case it is better to have this vacuum process indexes as well.
I have changed the patch so that it freezes tuples, but does not skip
index cleanup.

Better err on the side of caution.

> > Yes, I think that disabling this by default defeats the purpose.
> 
> Perhaps the solution to that is somewhere else then.  I can picture
> some sort of load average counters for auto-vacuum and spamming the
> logs with WARNINGs if we maintain high enough load for long enough,
> but we'd likely be better just completely overhauling the vacuum cost
> settings to be a percentage of total effort rather than some fixed
> speed.  That would allow more powerful servers to run vacuum more
> quickly and it would also run more quickly during low load periods.
> We'd just need to sample now and again how long vacuuming a series of
> page takes then sleep for a time based on how long that took. That's
> not for this patch though.

Right.


Updated patch attached.

Yours,
Laurenz Albe

Attachment

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Allowing ALTER TYPE to change storage strategy
Next
From: David Steele
Date:
Subject: Re: More tests to stress directly checksum_impl.h