Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CAA4eK1Kpp-048=SJgM=1m3gU2LLfUm-qU3utnc42sJnJJEAh_g@mail.gmail.com
In response to Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
List pgsql-hackers
On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Mon, 11 Nov 2019 at 15:06, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Nov 11, 2019 at 9:57 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > Good point. gin and bloom do some heavy work during cleanup and
> > > during bulkdelete, as you mentioned. Brin does it during cleanup, and
> > > hash and gist do it during bulkdelete. So there are three types of index
> > > AM just inside the postgres code. An idea I came up with is that we can
> > > control parallel vacuum and parallel cleanup separately.  That is,
> > > we add a variable amcanparallelcleanup and do parallel cleanup
> > > only on indexes for which amcanparallelcleanup is true.
> > >

This is what I mentioned in my email as the second option (whether to
expose this via the IndexAM).  I am not sure if we can have a new
variable just for this.

> > > IndexBulkDelete
> > > can be stored locally if both amcanparallelvacuum and
> > > amcanparallelcleanup of an index are false because only the leader
> > > process deals with such indexes. Otherwise we need to store it in DSM
> > > as always.
> > >
> > IIUC, amcanparallelcleanup will be true for those indexes that do
> > heavy work during cleanup irrespective of whether bulkdelete is called
> > or not, e.g. gin?
>
> Yes, I guess that gin and brin set amcanparallelcleanup to true (gin
> might set amcanparallelvacuum to true as well).
>
> >  If so, along with an amcanparallelcleanup flag, we
> > need to consider vacrelstats->num_index_scans, right? So if
> > vacrelstats->num_index_scans == 0 then we need to launch parallel
> > workers for all the indexes that support amcanparallelvacuum, and if
> > vacrelstats->num_index_scans > 0 then only for those that have
> > amcanparallelcleanup as true.
>
> Yes, you're right. But this won't work well for brin indexes, which don't
> want to participate in parallel vacuum but always want to participate
> in parallel cleanup.
>
> After more thought, I think we can have a ternary value: never,
> always, once. If it's 'never' the index never participates in parallel
> cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
> index always participates regardless of vacrelstats->num_index_scans. I
> guess gin, brin and bloom use 'always'. Finally, if it's 'once' the
> index participates in parallel cleanup only when it's the first time
> (that is, vacrelstats->num_index_scans == 0). I guess btree, gist and
> spgist use 'once'.
>
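
For reference, the ternary proposal quoted above could be sketched
roughly as follows.  This is only an illustration: the enum and
function names are hypothetical and do not come from any patch, and
the num_index_scans argument stands in for
vacrelstats->num_index_scans.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; not part of any patch. */
typedef enum ParallelCleanupMode
{
    PARALLEL_CLEANUP_NEVER,     /* never participates in parallel cleanup */
    PARALLEL_CLEANUP_ALWAYS,    /* participates regardless of num_index_scans */
    PARALLEL_CLEANUP_ONCE       /* participates only when num_index_scans == 0 */
} ParallelCleanupMode;

/*
 * Decide whether an index's cleanup phase would be handed to a parallel
 * worker under the quoted never/always/once scheme.
 */
static bool
ternary_cleanup_in_parallel(ParallelCleanupMode mode, int num_index_scans)
{
    switch (mode)
    {
        case PARALLEL_CLEANUP_ALWAYS:
            return true;
        case PARALLEL_CLEANUP_ONCE:
            return num_index_scans == 0;
        case PARALLEL_CLEANUP_NEVER:
        default:
            return false;
    }
}
```

Note that the 'once' case is the only one whose answer depends on a
value the index AM does not set itself.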

I think this 'once' option is confusing, especially because it also
depends on 'num_index_scans', which the IndexAM has no control over.
It might be that the option name is not good, but I am not sure.
Another thing is that for brin indexes, we don't want bulkdelete to
participate in parallelism.  Do we want to have separate variables for
ambulkdelete and amvacuumcleanup that decide whether each particular
phase can be done in parallel?  Another possibility could be to have
just one variable (say, uint16 amparallelvacuum) that tells us all
the options, but I don't think that will be a popular approach
considering all the other methods and variables exposed.  What do you
think?
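
To make the single-variable idea concrete, here is a minimal sketch,
assuming the uint16 is used as a set of flag bits.  The flag names, bit
values and helper functions are all hypothetical, invented just for
this illustration; only the name amparallelvacuum comes from the text
above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; names and values are illustrative only. */
#define PARVAC_BULKDELETE          (1 << 0)  /* ambulkdelete may run in a worker */
#define PARVAC_CLEANUP_ALWAYS      (1 << 1)  /* amvacuumcleanup may always run in a worker */
#define PARVAC_CLEANUP_FIRST_ONLY  (1 << 2)  /* ... only if num_index_scans == 0 */

/* Can the bulkdelete phase of this index be done by a parallel worker? */
static bool
flags_bulkdelete_in_parallel(uint16_t amparallelvacuum)
{
    return (amparallelvacuum & PARVAC_BULKDELETE) != 0;
}

/* Can the cleanup phase of this index be done by a parallel worker? */
static bool
flags_cleanup_in_parallel(uint16_t amparallelvacuum, int num_index_scans)
{
    if (amparallelvacuum & PARVAC_CLEANUP_ALWAYS)
        return true;
    if (amparallelvacuum & PARVAC_CLEANUP_FIRST_ONLY)
        return num_index_scans == 0;
    return false;
}
```

With such an encoding, the brin case would set only
PARVAC_CLEANUP_ALWAYS, so bulkdelete stays with the leader while
cleanup can still go to a worker.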

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


