Re: decoupling table and index vacuum - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: decoupling table and index vacuum
Date
Msg-id CAFiTN-uZWTVUeEjB5Nm=pHSvO6hsB3EAY_Q98j5FG96TpG+OXg@mail.gmail.com
In response to Re: decoupling table and index vacuum  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Tue, Feb 8, 2022 at 10:42 PM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Sun, Feb 6, 2022 at 11:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > One thing we could try doing in order to make that easier would be:
> > > tweak things so that when autovacuum vacuums the table, it only
> > > vacuums the indexes if they meet some threshold for bloat. I'm not
> > > sure exactly what happens with the heap vacuuming then - do we do
> > > phases 1 and 2 always, or a combined heap pass, or what? But if we
> > > pick some criteria that vacuums indexes sometimes and not other times,
> > > we can probably start doing some meaningful measurement of whether
> > > this patch is making bloat better or worse, and whether it's using
> > > fewer or more resources to do it.
> >
> > I think we can always trigger phase 1 and 2 and phase 2 will only
> > vacuum conditionally based on if all the indexes are vacuumed for some
> > conveyor belt pages so we don't have risk of scanning without marking
> > anything unused.
>
> Not sure what you mean about a risk of scanning without marking any
> LP_DEAD items as LP_UNUSED.

I mean that, for testing purposes, we could integrate with autovacuum
such that 1) the first pass of the vacuum is always done, 2) index
vacuum is done only for the indexes that have bloated beyond some
threshold, and 3) the second heap vacuum pass is always triggered.  So
my point was that even if autovacuum triggers the second vacuum pass
every time, it will not do anything unless all the indexes have been
vacuumed.
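
To make the three steps concrete, here is a rough sketch in Python. It
is only an illustration, not code from the patch: IndexState,
bloat_fraction, vacuumed_upto and autovacuum_cycle are all made-up
names, and the bloat estimate is assumed to come from somewhere else.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class IndexState:
    """Hypothetical per-index bookkeeping (not from the patch)."""
    name: str
    bloat_fraction: float   # assumed bloat estimate in [0, 1]
    vacuumed_upto: int = 0  # conveyor belt pages this index has processed


def autovacuum_cycle(indexes: List[IndexState], belt_end: int,
                     bloat_threshold: float) -> Tuple[List[str], int]:
    """Run one test cycle of the three steps.

    Returns the names of the indexes vacuumed in this cycle and the
    number of conveyor belt pages the second heap pass may process.
    """
    # Step 1: the first heap pass always runs. It is assumed to have
    # already appended dead TIDs, so the belt now ends at belt_end.

    # Step 2: vacuum only the indexes bloated beyond the threshold.
    vacuumed = []
    for idx in indexes:
        if idx.bloat_fraction > bloat_threshold:
            idx.vacuumed_upto = belt_end
            vacuumed.append(idx.name)

    # Step 3: the second heap pass is always triggered, but it can only
    # go as far as the index that is furthest behind.
    heap_limit = min((idx.vacuumed_upto for idx in indexes),
                     default=belt_end)
    return vacuumed, heap_limit
```

With two indexes where only one is above the threshold, the second
pass is triggered but has nothing to do, because the other index has
not processed any conveyor belt page yet.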

> If VACUUM always does some amount of this,
> then it follows that the new mechanism added by the patch just can't
> safely avoid any work at all, making it all pointless. We have to
> expect heap vacuuming to take place much less often with the patch.
> Simply because that's what the invariant described in comments above
> lazy_scan_heap() requires.

In the second pass we make sure that we don't change any LP_DEAD item
to LP_UNUSED unless index vacuuming has been done for it.  Basically,
we store dead items in the conveyor belt, and whenever we do an index
pass we remember up to which conveyor belt page that index has been
vacuumed.  Before starting the second heap pass, we find the minimum
conveyor belt page up to which all the indexes have been vacuumed.
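
As a rough sketch of that bookkeeping (again only an illustration with
made-up names, not the patch's data structures): the conveyor belt is
modelled as a list of pages, each holding dead TIDs, and index_progress
maps each index name to the number of belt pages it has already
vacuumed.

```python
from typing import Dict, List, Tuple

TID = Tuple[int, int]  # (heap block number, line pointer offset)


def second_heap_pass(conveyor_belt: List[List[TID]],
                     index_progress: Dict[str, int]
                     ) -> Tuple[int, List[TID]]:
    """Return the safe horizon and the dead TIDs that may become LP_UNUSED.

    conveyor_belt and index_progress are hypothetical stand-ins for the
    real structures; a table without indexes can reclaim everything.
    """
    # The minimum over all indexes is the last belt page that every
    # index has already processed.
    horizon = min(index_progress.values(), default=len(conveyor_belt))

    # Only dead items on belt pages before the horizon are safe to mark
    # LP_UNUSED; later items may still be referenced by some index.
    reclaimable = [tid for page in conveyor_belt[:horizon] for tid in page]
    return horizon, reclaimable


belt = [[(1, 1), (1, 2)], [(2, 5)], [(3, 7)]]
horizon, tids = second_heap_pass(belt, {"idx_a": 3, "idx_b": 1})
print(horizon, tids)
```

In the usage above idx_b has only processed the first belt page, so
the horizon is 1 and only the two TIDs on that page are reclaimable,
even though idx_a has processed the whole belt.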

> Note that this is not the same thing as saying that we do less
> *absolute* heap vacuuming with the conveyor belt -- my statement about
> less heap vacuuming taking place is *only* true relative to the amount
> of other work that happens in any individual "shortened" VACUUM
> operation. We could do exactly the same total amount of heap vacuuming
> as before (in a version of Postgres without the conveyor belt but with
> the same settings), but much *more* index vacuuming (at least for one
> or two problematic indexes).
>
> > And we can try to measure with other approaches as
> > well where we completely avoid phase 2 and it will be done only along
> > with phase 1 whenever applicable.
>
> I believe that the main benefit of the dead TID conveyor belt (outside
> of global index use cases) will be to enable us to do more (much more)
> index vacuuming for one index in particular. So it's not really about
> doing less index vacuuming or less heap vacuuming -- it's about doing
> a *greater* amount of *useful* index vacuuming, in less time. There is
> often some way in which failing to vacuum one index for a long time
> does lasting damage to the index structure.

I agree with the point.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


