Re: xid wraparound danger due to INDEX_CLEANUP false - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: xid wraparound danger due to INDEX_CLEANUP false
Date
Msg-id CA+fd4k53uVjLoO6LpatEx9GGKu=HTqk7-TkSE2TRZVX3+MJw=w@mail.gmail.com
Whole thread Raw
In response to Re: xid wraparound danger due to INDEX_CLEANUP false  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
List pgsql-hackers
On Tue, 19 May 2020 at 11:32, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Thu, 7 May 2020 at 16:26, Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Thu, 7 May 2020 at 15:40, Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Thu, 7 May 2020 at 03:28, Peter Geoghegan <pg@bowt.ie> wrote:
> > > >
> > > > On Wed, May 6, 2020 at 2:28 AM Masahiko Sawada
> > > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > > > I've attached the patch fixes this issue.
> > > > >
> > > > > With this patch, we don't skip only index cleanup phase when
> > > > > performing an aggressive vacuum. The reason why I don't skip only
> > > > > index cleanup phase is that index vacuum phase can be called multiple
> > > > > times, which takes a very long time. Since the purpose of this index
> > > > > cleanup is to process recyclable pages it's enough to do only index
> > > > > cleanup phase.
> > > >
> > > > That's only true in nbtree, though. The way that I imagined we'd want
> > > > to fix this is by putting control in each index access method. So,
> > > > we'd revise the way that amvacuumcleanup() worked -- the
> > > > amvacuumcleanup routine for each index AM would sometimes be called in
> > > > a mode that means "I don't really want you to do any cleanup, but if
> > > > you absolutely have to for your own reasons then you can". This could
> > > > be represented using something similar to
> > > > IndexVacuumInfo.analyze_only.
> > > >
> > > > This approach has an obvious disadvantage: the patch really has to
> > > > teach *every* index AM to do something with that state (most will
> > > > simply do no work). It seems logical to have the index AM control what
> > > > happens, though. This allows the logic to live inside
> > > > _bt_vacuum_needs_cleanup() in the case of nbtree, so there is only one
> > > > place where we make decisions like this.
> > > >
> > > > Most other AMs don't have this problem. GiST has a similar issue with
> > > > recyclable pages, except that it doesn't use 32-bit XIDs so it doesn't
> > > > need to care about this stuff at all. Besides, it seems like it might
> > > > be a good idea to do other basic maintenance of the index when we're
> > > > "skipping" index cleanup. We probably should always do things that
> > > > require only a small, fixed amount of work. Things like maintaining
> > > > metadata in the metapage.
> > > >
> > > > There may be practical reasons why this approach isn't suitable for
> > > > backpatch even if it is a superior approach. What do you think?
> > >
> > > I agree this idea is better. I was thinking the same approach but I
> > > was concerned about backpatching. Especially since I was thinking to
> > > add one or two fields to IndexVacuumInfo existing index AM might not
> > > work with the new VacuumInfo structure.
> >
> > It would be ok if we added these fields at the end of VacuumInfo structure?
> >
>
> I've attached WIP patch for HEAD. With this patch, the core pass
> index_cleanup to bulkdelete and vacuumcleanup callbacks so that they
> can make decision whether run vacuum or not.
>
> If the direction of this patch seems good, I'll change the patch so
> that we pass something information to these callbacks indicating
> whether this vacuum is anti-wraparound vacuum. This is necessary
> because it's enough to invoke index cleanup before XID wraparound as
> per discussion so far.
>

I've updated the patch so that vacuum passes is_wraparound flag to
bulkdelete and vacuumcleanup. Therefore I've added two new variables
in total: index_cleanup and is_wraparound. Index AMs can make the
decision of whether to skip bulkdelete and indexcleanup or not.

Also, I've added this item to Older Bugs so as not to forget.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: min_safe_lsn column in pg_replication_slots view
Next
From: Amit Kapila
Date:
Subject: Re: min_safe_lsn column in pg_replication_slots view