Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CAA4eK1KfAXWNWtJi+gj9Yehr4tMptKk1dM+eOE=cBQqj3yeR4Q@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses Re: [HACKERS] Block level parallel vacuum  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Tue, Nov 12, 2019 at 7:43 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Mon, 11 Nov 2019 at 19:29, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > After more thoughts, I think we can have a ternary value: never,
> > > always, once. If it's 'never' the index never participates in parallel
> > > cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
> > > index always participates regardless of vacrelstats->num_index_scan. I
> > > guess gin, brin and bloom use 'always'. Finally if it's 'once' the
> > > index participates in parallel cleanup only when it's the first time
> > > (that is, vacrelstats->num_index_scan == 0), I guess btree, gist and
> > > spgist use 'once'.
> > >
> >
> > I think this 'once' option is confusing especially because it also
> > depends on 'num_index_scans' which the IndexAM has no control over.
> > It might be that the option name is not good, but I am not sure.
> > Another thing is that for brin indexes, we don't want bulkdelete to
> > participate in parallelism.
>
> I thought brin should set amcanparallelvacuum is false and
> amcanparallelcleanup is 'always'.
>

In that case, it is better to name the variable as amcanparallelbulkdelete.

> > Do we want to have separate variables for
> > ambulkdelete and amvacuumcleanup which decides whether the particular
> > phase can be done in parallel?
>
> You mean adding variables to ambulkdelete and amvacuumcleanup as
> function arguments?
>

No, I mean separate variables amcanparallelbulkdelete (bool) and
amcanparallelvacuumcleanup (unit16) variables.

>
> > Another possibility could be to just
> > have one variable (say uint16 amparallelvacuum) which will tell us all
> > the options but I don't think that will be a popular approach
> > considering all the other methods and variables exposed.  What do you
> > think?
>
> Adding only one variable that can have flags would also be a good
> idea, instead of having multiple variables for each option. For
> instance FDW API uses such interface (see eflags of BeginForeignScan).
>

Yeah, maybe something like amparallelvacuumoptions.  The options can be:

VACUUM_OPTION_NO_PARALLEL   0 # vacuum (neither bulkdelete nor
vacuumcleanup) can't be performed in parallel
VACUUM_OPTION_NO_PARALLEL_CLEANUP  1 # vacuumcleanup cannot be
performed in parallel (hash index will set this flag)
VACUUM_OPTION_PARALLEL_BULKDEL   2 # bulkdelete can be done in
parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP  3 # vacuumcleanup can be done in
parallel if bulkdelete is not performed (Indexes nbtree, brin, hash,
gin, gist, spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP  4 # vacuumcleanup can be done in
parallel even if bulkdelete is already performed (Indexes gin, brin,
and bloom will set this flag)

Does something like this make sense?   If we all agree on this, then I
think we can summarize the part of the discussion related to this API
and get feedback from a broader audience.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: pg_waldump and PREPARE
Next
From: Michael Paquier
Date:
Subject: Re: Monitoring disk space from within the server