On Thu, 7 May 2020 at 03:28, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Wed, May 6, 2020 at 2:28 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> > I've attached the patch fixes this issue.
> >
> > With this patch, we don't skip only index cleanup phase when
> > performing an aggressive vacuum. The reason why I don't skip only
> > index cleanup phase is that index vacuum phase can be called multiple
> > times, which takes a very long time. Since the purpose of this index
> > cleanup is to process recyclable pages it's enough to do only index
> > cleanup phase.
>
> That's only true in nbtree, though. The way that I imagined we'd want
> to fix this is by putting control in each index access method. So,
> we'd revise the way that amvacuumcleanup() worked -- the
> amvacuumcleanup routine for each index AM would sometimes be called in
> a mode that means "I don't really want you to do any cleanup, but if
> you absolutely have to for your own reasons then you can". This could
> be represented using something similar to
> IndexVacuumInfo.analyze_only.
>
> This approach has an obvious disadvantage: the patch really has to
> teach *every* index AM to do something with that state (most will
> simply do no work). It seems logical to have the index AM control what
> happens, though. This allows the logic to live inside
> _bt_vacuum_needs_cleanup() in the case of nbtree, so there is only one
> place where we make decisions like this.
>
> Most other AMs don't have this problem. GiST has a similar issue with
> recyclable pages, except that it doesn't use 32-bit XIDs so it doesn't
> need to care about this stuff at all. Besides, it seems like it might
> be a good idea to do other basic maintenance of the index when we're
> "skipping" index cleanup. We probably should always do things that
> require only a small, fixed amount of work. Things like maintaining
> metadata in the metapage.
>
> There may be practical reasons why this approach isn't suitable for
> backpatch even if it is a superior approach. What do you think?
I agree this idea is better. I was thinking the same approach but I
was concerned about backpatching. Especially since I was thinking to
add one or two fields to IndexVacuumInfo existing index AM might not
work with the new VacuumInfo structure.
If we go with this idea, we need to change lazy vacuum so that it uses
two-pass strategy vacuum even if INDEX_CLEANUP is false. Also in
parallel vacuum, we need to launch workers. But I think these changes
are no big problem.
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services