Re: Berserk Autovacuum (let's save next Mandrill) - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Berserk Autovacuum (let's save next Mandrill)
Date
Msg-id CA+fd4k4N5ixDtE-PZ-AwiwUxcNnP4B0A74qqbkcJ2CzMVdkCxg@mail.gmail.com
In response to Re: Berserk Autovacuum (let's save next Mandrill)  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: Berserk Autovacuum (let's save next Mandrill)  (Laurenz Albe <laurenz.albe@cybertec.at>)
List pgsql-hackers
On Wed, 11 Mar 2020 at 04:17, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Tue, 2020-03-10 at 18:14 +0900, Masahiko Sawada wrote:
>
> Thanks for the review and your thoughts!
>
> > FYI actually vacuum could perform the index cleanup phase (i.e. the
> > PROGRESS_VACUUM_PHASE_INDEX_CLEANUP phase) on a table even if it's a
> > truly INSERT-only table, depending on
> > vacuum_cleanup_index_scale_factor. Anyway, I also agree with not
> > disabling index cleanup in the insert-only vacuum case, because it could
> > become not only a cause of index bloat but also a big performance
> > issue. For example, if autovacuum on a table always runs without index
> > cleanup, a GIN index on that table will accumulate inserted tuples in
> > its pending list, and they will be cleaned up by a backend process while
> > inserting a new tuple, not by an autovacuum process. We can disable index
> > vacuum per table with the index_cleanup storage parameter, so it would be
> > better to defer these settings to users.
>
> Thanks for the confirmation.
>
> > I have one question about this patch from an architectural perspective:
> > have you considered using autovacuum_vacuum_threshold and
> > autovacuum_vacuum_scale_factor for this purpose as well? That is, we
> > compare the threshold computed from these values not only to the number
> > of dead tuples but also to the number of inserted tuples. If the number
> > of dead tuples exceeds the threshold, we trigger autovacuum as usual.
> > On the other hand, if the number of inserted tuples exceeds it, we
> > trigger autovacuum with vacuum_freeze_min_age = 0. I'm concerned about
> > how users would reason about the settings of the two newly added
> > parameters. We would have four parameters in total. Amit was also
> > concerned about that[1].
> >
> > I think this idea also works fine. In the insert-only table case, since
> > only the number of inserted tuples increases, a single threshold
> > (that is, the threshold computed from autovacuum_vacuum_threshold and
> > autovacuum_vacuum_scale_factor) is enough to trigger autovacuum. And
> > in the mostly-insert table case, we can already trigger autovacuum
> > in current PostgreSQL, since we have some dead tuples.
> > But if we want to trigger autovacuum more frequently based on the
> > number of newly inserted tuples, we can set that threshold lower while
> > considering only the number of inserted tuples.
>
> I am torn.
>
> On the one hand it would be wonderful not to have to add yet more GUCs
> to the already complicated autovacuum configuration.  It already confuses
> too many users.
>
> On the other hand that will lead to unnecessary vacuums for small
> tables.
> Worse, the progression caused by the comparatively large scale
> factor may make it vacuum large tables too seldom.
>

I might be missing your point, but could you elaborate on the kind of
case in which you think this would lead to unnecessary vacuums?
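
To make the question concrete, here is a rough sketch of the check
proposed above, written in Python rather than as actual autovacuum code.
The function names and the returned labels are made up for illustration
only. The defaults of 50 and 0.2 are the stock values of
autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor, and the
formula is the usual one (base threshold plus scale factor times the
number of tuples).

```python
from typing import Optional


def vacuum_threshold(reltuples: float,
                     base_threshold: int = 50,
                     scale_factor: float = 0.2) -> float:
    """Usual autovacuum threshold: base + scale_factor * number of tuples."""
    return base_threshold + scale_factor * reltuples


def autovacuum_decision(reltuples: float,
                        dead_tuples: int,
                        inserted_tuples: int,
                        base_threshold: int = 50,
                        scale_factor: float = 0.2) -> Optional[str]:
    """Hypothetical decision rule: one threshold, compared to two counters.

    Returns "vacuum" for the ordinary dead-tuple case, "vacuum_freeze" for
    the insert-driven case (which would run with vacuum_freeze_min_age = 0),
    or None if neither counter exceeds the threshold.
    """
    threshold = vacuum_threshold(reltuples, base_threshold, scale_factor)
    if dead_tuples > threshold:
        return "vacuum"
    if inserted_tuples > threshold:
        return "vacuum_freeze"
    return None
```

With the defaults, a 100-row table gets a threshold of 70 tuples, while
a table with 10 million rows gets 2,000,050, so it is not triggered by
inserts until roughly two million new rows have arrived.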

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


