Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [HACKERS] Block level parallel vacuum
Msg-id CA+fd4k6BbkVc2sJuk1QS8L0uynDHeCnJhfQ9icTObpz+3LhWYg@mail.gmail.com
In response to Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] Block level parallel vacuum
List pgsql-hackers
On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Nov 13, 2019 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Yeah, 0,2,3 and 4 sounds reasonable to me.  Earlier, Dilip also got
> > confused with option 1.
> >
>
> Let me try to summarize the discussion on this point and see if others
> have any opinion on this matter.

Thank you for summarizing.

>
> We need a way to allow IndexAm to specify whether it can participate
> in a parallel vacuum.  As we know there are two phases of
> index-vacuum, bulkdelete and vacuumcleanup and in many cases, the
> bulkdelete performs the main deletion work and then vacuumcleanup just
> returns index statistics. So, for such cases, we don't want the second
> phase to be performed by a parallel vacuum worker.  Now, if the
> bulkdelete phase is not performed, then vacuumcleanup can process the
> entire index in which case it is better to do that phase via parallel
> worker.
>
> OTOH, in some cases vacuumcleanup takes another pass over the index to
> reclaim empty pages and record them in the FSM even if bulkdelete is
> performed.  This happens in gin and bloom indexes.  Then, there are
> indexes that do all the work in the cleanup phase, as in the case of
> brin indexes.  For this category of indexes, we want the vacuumcleanup
> phase to be performed by a parallel worker as well.
>
> In short, different indexes have different requirements for which phase
> of index vacuum can be performed in parallel.  Just to be clear, we
> can't perform both phases (bulkdelete and cleanup) in one go, as
> bulkdelete can happen multiple times on a large index whereas
> vacuumcleanup is done once at the end.
>
> Based on these needs, we came up with a way to specify this
> information for index AMs.  Basically, the IndexAm will expose a
> variable amparallelvacuumoptions which can have the options below:
>
> VACUUM_OPTION_NO_PARALLEL   1 << 0 # neither bulkdelete nor
> vacuumcleanup can be performed in parallel

I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs that don't
want to support parallel vacuum don't have to set anything.
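
To illustrate the point, here is a minimal sketch of why a value of 0 works as a default. It assumes the remaining flags keep the bit positions from the quoted proposal. The struct DemoIndexAm and the function demo_am_supports_parallel_vacuum() are hypothetical names used only for this example and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values from the quoted proposal, with NO_PARALLEL changed to 0. */
#define VACUUM_OPTION_NO_PARALLEL            0
#define VACUUM_OPTION_PARALLEL_BULKDEL       (1 << 1)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP  (1 << 2)
#define VACUUM_OPTION_PARALLEL_CLEANUP       (1 << 3)

/* Hypothetical stand-in for the index AM routine struct. */
typedef struct DemoIndexAm
{
    const char *amname;
    uint8_t     amparallelvacuumoptions;
} DemoIndexAm;

/*
 * An AM that never sets the field is zero-initialized, which is the same
 * as VACUUM_OPTION_NO_PARALLEL, so it is excluded without any extra code.
 */
static bool
demo_am_supports_parallel_vacuum(const DemoIndexAm *am)
{
    return am->amparallelvacuumoptions != VACUUM_OPTION_NO_PARALLEL;
}
```

With this layout, an AM declared as `DemoIndexAm am = { .amname = "legacy" };` is treated as not supporting parallel vacuum, while one that sets any of the three flags is treated as a participant.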

> VACUUM_OPTION_PARALLEL_BULKDEL   1 << 1 # bulkdelete can be done in
> parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
> flag)
> VACUUM_OPTION_PARALLEL_COND_CLEANUP  1 << 2 # vacuumcleanup can be
> done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
> gin, gist, spgist, bloom will set this flag)
> VACUUM_OPTION_PARALLEL_CLEANUP  1 << 3 # vacuumcleanup can be done in
> parallel even if bulkdelete is already performed (Indexes gin, brin,
> and bloom will set this flag)

I think gin and bloom don't need to set both but should set only
VACUUM_OPTION_PARALLEL_CLEANUP.

And I'm going to disallow index AMs from setting both
VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
by means of assertions. Is that okay?
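
As an illustration only, the mutual-exclusion check and the resulting decision for the cleanup phase could look something like the sketch below. The flag values follow the quoted proposal with VACUUM_OPTION_NO_PARALLEL as 0. The function names demo_options_are_valid() and demo_cleanup_in_parallel() are hypothetical, and plain assert() stands in for the backend's assertion macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VACUUM_OPTION_NO_PARALLEL            0
#define VACUUM_OPTION_PARALLEL_BULKDEL       (1 << 1)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP  (1 << 2)
#define VACUUM_OPTION_PARALLEL_CLEANUP       (1 << 3)

/* Returns false when an AM sets both cleanup flags at once. */
static bool
demo_options_are_valid(uint8_t opts)
{
    const uint8_t both = VACUUM_OPTION_PARALLEL_COND_CLEANUP |
                         VACUUM_OPTION_PARALLEL_CLEANUP;

    return (opts & both) != both;
}

/*
 * Decide whether vacuumcleanup should go to a parallel worker.
 * bulkdel_done says whether bulkdelete already ran for this index.
 */
static bool
demo_cleanup_in_parallel(uint8_t opts, bool bulkdel_done)
{
    assert(demo_options_are_valid(opts));

    if (opts & VACUUM_OPTION_PARALLEL_CLEANUP)
        return true;            /* parallel regardless of bulkdelete */
    if (opts & VACUUM_OPTION_PARALLEL_COND_CLEANUP)
        return !bulkdel_done;   /* parallel only if bulkdelete was skipped */
    return false;               /* AM did not opt in for cleanup */
}
```

Under this sketch, an AM such as gin or bloom that sets only VACUUM_OPTION_PARALLEL_CLEANUP always gets parallel cleanup, so it has no need for the conditional flag as well.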

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


