Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: New IndexAM API controlling index vacuum strategies
Date
Msg-id CAD21AoD6UgGCCYVUiTa2GbEeYgJTnSqP686Kvt0gYhagfvP9ew@mail.gmail.com
In response to Re: New IndexAM API controlling index vacuum strategies  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: New IndexAM API controlling index vacuum strategies  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Wed, Mar 24, 2021 at 11:44 AM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Tue, Mar 23, 2021 at 4:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Here are review comments on 0003 patch:
>
> Attached is a new revision, v5. It fixes bit rot caused by recent
> changes (your index autovacuum logging stuff). It has also been
> cleaned up in response to your recent review comments -- both from
> this email, and the other review email that I responded to separately
> today.
>
> > +    * If we skip vacuum, we just ignore the collected dead tuples.  Note that
> > +    * vacrelstats->dead_tuples could have tuples which became dead after
> > +    * HOT-pruning but are not marked dead yet.  We do not process them
> > +    * because it's a very rare condition, and the next vacuum will process
> > +    * them anyway.
> > +    */
> >
> > The second paragraph is no longer true after removing the 'tupgone' case.
>
> Fixed.
>
> > Maybe we can use vacrelstats->num_index_scans instead of
> > calledtwopass? When calling to two_pass_strategy() at the end of
> > lazy_scan_heap(), if vacrelstats->num_index_scans is 0 it means this
> > is the first time call, which is equivalent to calledtwopass = false.
>
> It's true that when "vacrelstats->num_index_scans > 0" it definitely
> can't have been the first call. But how can we distinguish between 1.)
> the case where we're being called for the first time, and 2.) the case
> where it's the second call, but the first call actually skipped index
> vacuuming? When we skip index vacuuming we won't increment
> num_index_scans (which seems appropriate to me).

In case (2), I think the first call skipped index vacuuming because
index_cleanup was disabled. (If index_cleanup had not been disabled, the
first call would not have skipped it, because two_pass_strategy() is
called with onecall = false.) So the second call skips index vacuuming
for the same reason. Even with the 0004 patch (skipping index vacuuming
in emergency cases), the XID wraparound emergency check should be done
before the !onecall check in two_pass_strategy(), since we should skip
index vacuuming in an emergency even when maintenance_work_mem runs
out. Therefore, similarly, the second call will also skip index
vacuuming.
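
To make the ordering of the checks concrete, here is a rough sketch of the
decision I have in mind. It is not the patch code: VacState,
should_vacuum_indexes() and the few_dead_items flag are hypothetical names
used only for illustration, and the real condition for the 0003 skipping
optimization is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the relevant vacuum state. */
typedef struct VacState
{
    bool index_cleanup_enabled; /* false when index cleanup is disabled */
    bool wraparound_emergency;  /* 0004: XID wraparound emergency detected */
    bool few_dead_items;        /* placeholder for the 0003 skip condition */
    int  num_index_scans;       /* index vacuuming passes done so far */
} VacState;

/*
 * Returns true if index vacuuming should run for this call.
 * "onecall" is true only when this is the single call made at the end of
 * the heap scan, i.e. maintenance_work_mem never filled up.
 */
static bool
should_vacuum_indexes(const VacState *s, bool onecall)
{
    /* Disabled index cleanup skips on every call, first or second. */
    if (!s->index_cleanup_enabled)
        return false;

    /* The emergency check comes before the !onecall check. */
    if (s->wraparound_emergency)
        return false;

    /* Called because memory ran out: never apply the optimization. */
    if (!onecall)
        return true;

    /* The skipping optimization only applies before any index scan. */
    assert(s->num_index_scans == 0);
    return !s->few_dead_items;
}
```

With this ordering, whatever made the first call skip index vacuuming
(disabled cleanup or an emergency) also makes the second call skip it.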

That being said, I agree that using 'calledtwopass' is much more
readable. So I'll keep it as is.

>
> For now I have added an assertion that "vacrelstats->num_index_scans ==
> 0" at the point where we apply skipping indexes as an optimization
> (i.e. the point where the patch 0003- mechanism is applied).
>
> > Perhaps we can make INDEX_CLEANUP option a four-value option: on, off,
> > auto, and default? A problem with the above change would be that if
> > the user wants to do "auto" mode, they might need to reset
> > vacuum_index_cleanup reloption before executing VACUUM command. In
> > other words, there is no way in VACUUM command to force "auto" mode.
> > So I think we can add "auto" value to INDEX_CLEANUP option and ignore
> > the vacuum_index_cleanup reloption if that value is specified.
>
> I agree that this aspect definitely needs more work. I'll leave it to you to
> do this in a separate revision of this new 0003 patch (so no changes here
> from me for v5).
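
A minimal sketch of the resolution rule proposed above, assuming the
reloption is modeled as unset/on/off and that an unset reloption falls
back to "auto". All type and function names here are hypothetical and
do not come from the patch.

```c
/* Value given to the INDEX_CLEANUP option of the VACUUM command. */
typedef enum { CMD_DEFAULT, CMD_ON, CMD_OFF, CMD_AUTO } CmdIndexCleanup;

/* State of the vacuum_index_cleanup reloption (assumed tri-state). */
typedef enum { RELOPT_UNSET, RELOPT_ON, RELOPT_OFF } RelOptIndexCleanup;

/* Mode that VACUUM actually runs with. */
typedef enum { CLEANUP_ON, CLEANUP_OFF, CLEANUP_AUTO } IndexCleanupMode;

static IndexCleanupMode
resolve_index_cleanup(CmdIndexCleanup cmd, RelOptIndexCleanup relopt)
{
    /* An explicit on/off/auto in the command ignores the reloption. */
    switch (cmd)
    {
        case CMD_ON:
            return CLEANUP_ON;
        case CMD_OFF:
            return CLEANUP_OFF;
        case CMD_AUTO:
            return CLEANUP_AUTO;
        case CMD_DEFAULT:
            break;
    }

    /* "default" (or option omitted): defer to the reloption. */
    switch (relopt)
    {
        case RELOPT_ON:
            return CLEANUP_ON;
        case RELOPT_OFF:
            return CLEANUP_OFF;
        case RELOPT_UNSET:
            break;
    }

    /* Assumption: with nothing set anywhere, use "auto". */
    return CLEANUP_AUTO;
}
```

The point is that "auto" given in the command wins even when the
reloption is set, so the user does not need to reset the reloption first.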
>
> > Are you also updating the 0003 patch? If you're focusing on 0001 and
> > 0002 patch, I'll update the 0003 patch along with the fourth patch
> > (skipping index vacuum in emergency cases).
>
> I suggest that you start integrating it with the wraparound emergency
> mechanism, which can become patch 0004- of the patch series. You can
> manage 0003- and 0004- now. You can post revisions of each of those
> two independently of my revisions. What do you think? I have included
> 0003- for now because you had review comments on it that I worked
> through, but you should own that, I think.
>
> I suppose that you should include the versions of 0001- and 0002- you
> worked off of, just for the convenience of others/to keep the CF
> tester happy. I don't think that I'm going to make many changes that
> will break your patch, except for obvious bit rot that can be fixed
> through fairly mechanical rebasing.

Agreed.

I was just about to post my 0004 patch based on the v4 patch series.
I'll update the 0003 and 0004 patches based on the v5 patch series you
just posted, and post them together with the 0001 and 0002 patches.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


