Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: New IndexAM API controlling index vacuum strategies |
Date | |
Msg-id | CAD21AoD=bXFAi76cHWxDE5QgNXTV_--83UOPvO4xccMiNaOTjg@mail.gmail.com Whole thread Raw |
In response to | Re: New IndexAM API controlling index vacuum strategies (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: New IndexAM API controlling index vacuum strategies
|
List | pgsql-hackers |
On Wed, Mar 24, 2021 at 12:11 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Mar 24, 2021 at 11:44 AM Peter Geoghegan <pg@bowt.ie> wrote: > > > > On Tue, Mar 23, 2021 at 4:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > Here are review comments on 0003 patch: > > > > Attached is a new revision, v5. It fixes bit rot caused by recent > > changes (your index autovacuum logging stuff). It has also been > > cleaned up in response to your recent review comments -- both from > > this email, and the other review email that I responded to separately > > today. > > > > > + * If we skip vacuum, we just ignore the collected dead tuples. Note that > > > + * vacrelstats->dead_tuples could have tuples which became dead after > > > + * HOT-pruning but are not marked dead yet. We do not process them > > > + * because it's a very rare condition, and the next vacuum will process > > > + * them anyway. > > > + */ > > > > > > The second paragraph is no longer true after removing the 'tupegone' case. > > > > Fixed. > > > > > Maybe we can use vacrelstats->num_index_scans instead of > > > calledtwopass? When calling to two_pass_strategy() at the end of > > > lazy_scan_heap(), if vacrelstats->num_index_scans is 0 it means this > > > is the first time call, which is equivalent to calledtwopass = false. > > > > It's true that when "vacrelstats->num_index_scans > 0" it definitely > > can't have been the first call. But how can we distinguish between 1.) > > the case where we're being called for the first time, and 2.) the case > > where it's the second call, but the first call actually skipped index > > vacuuming? When we skip index vacuuming we won't increment > > num_index_scans (which seems appropriate to me). > > In (2) case, I think we skipped index vacuuming in the first call > because index_cleanup was disabled (if index_cleanup was not disabled, > we didn't skip it because two_pass_strategy() is called with onecall = > false). So in the second call, we skip index vacuuming for the same > reason. Even with the 0004 patch (skipping index vacuuming in > emergency cases), the check of XID wraparound emergency should be done > before the !onecall check in two_pass_strategy() since we should skip > index vacuuming in an emergency case even in the case where > maintenance_work_mem runs out. Therefore, similarly, we will skip > index vacuuming also in the second call. > > That being said, I agree that using ‘calledtwopass’ is much readable. > So I’ll keep it as is. > > > > > For now I have added an assertion that "vacrelstats->num_index_scan == > > 0" at the point where we apply skipping indexes as an optimization > > (i.e. the point where the patch 0003- mechanism is applied). > > > > > Perhaps we can make INDEX_CLEANUP option a four-value option: on, off, > > > auto, and default? A problem with the above change would be that if > > > the user wants to do "auto" mode, they might need to reset > > > vacuum_index_cleanup reloption before executing VACUUM command. In > > > other words, there is no way in VACUUM command to force "auto" mode. > > > So I think we can add "auto" value to INDEX_CLEANUP option and ignore > > > the vacuum_index_cleanup reloption if that value is specified. > > > > I agree that this aspect definitely needs more work. I'll leave it to you to > > do this in a separate revision of this new 0003 patch (so no changes here > > from me for v5). > > > > > Are you updating also the 0003 patch? if you're focusing on 0001 and > > > 0002 patch, I'll update the 0003 patch along with the fourth patch > > > (skipping index vacuum in emergency cases). > > > > I suggest that you start integrating it with the wraparound emergency > > mechanism, which can become patch 0004- of the patch series. You can > > manage 0003- and 0004- now. You can post revisions of each of those > > two independently of my revisions. What do you think? I have included > > 0003- for now because you had review comments on it that I worked > > through, but you should own that, I think. > > > > I suppose that you should include the versions of 0001- and 0002- you > > worked off of, just for the convenience of others/to keep the CF > > tester happy. I don't think that I'm going to make many changes that > > will break your patch, except for obvious bit rot that can be fixed > > through fairly mechanical rebasing. > > Agreed. > > I was just about to post my 0004 patch based on v4 patch series. I'll > update 0003 and 0004 patches based on v5 patch series you just posted, > and post them including 0001 and 0002 patches. > I've attached the updated patch set (nothing changed in 0001 and 0002 patch). Regarding "auto" option, I think it would be a good start to enable the index vacuum skipping behavior by default instead of adding “auto” mode. That is, we could skip index vacuuming if INDEX_CLEANUP ON. With 0003 and 0004 patch, there are two cases where we skip index vacuuming: the garbage on heap is very concentrated and the table is at risk of XID wraparound. It seems to make sense to have both behaviors by default. If we want to have a way to force doing index vacuuming, we can add “force” option instead of adding “auto” option and having “on” mode force doing index vacuuming. Also regarding new GUC parameters, vacuum_skip_index_age and vacuum_multixact_skip_index_age, those are not autovacuum-dedicated parameter. VACUUM command also uses those parameters to skip index vacuuming dynamically. In such an emergency case, it seems appropriate to me to skip index vacuuming even in VACUUM command. And I don’t add any reloption for those two parameters. Since those parameters are unlikely to be changed from the default value, I think it don’t necessarily need to provide a way for per-table configuration. In 0001 patch, we have the following chunk: + bool skipping; + + /* Should not end up here with no indexes */ + Assert(nindexes > 0); + Assert(!IsParallelWorker()); + + /* Check whether or not to do index vacuum and heap vacuum */ + if (index_cleanup == VACOPT_TERNARY_DISABLED) + skipping = true; + else + skipping = false; Can we flip the boolean? I mean to use a positive form such as "do_vacuum". It seems to be more readable especially for the changes made in 0003 and 0004 patches. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Attachment
pgsql-hackers by date: