From: Amit Kapila
Subject: Re: [HACKERS] Block level parallel vacuum
Msg-id: CAA4eK1JPgB9wFWsytRPWZ8-RT+WZc3Xs6Un7yQJ0+oAz6NP2sw@mail.gmail.com
In response to: Re: [HACKERS] Block level parallel vacuum (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses: Re: [HACKERS] Block level parallel vacuum (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
List: pgsql-hackers
On Wed, Nov 20, 2019 at 4:04 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Wed, 20 Nov 2019 at 17:02, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Nov 20, 2019 at 11:01 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > I've attached the latest version of the patch set. It includes all
> > > the points we discussed regarding index AM options as well as the
> > > shared cost balance. I also added test cases that exercise all
> > > types of index AM.
> > >
> > > During development I had one concern about the number of parallel
> > > workers to launch. In the current design, each index AM can choose
> > > whether to participate in parallel bulk-deletion and in parallel
> > > cleanup. That also means the number of parallel workers to launch
> > > can differ between a parallel bulk-deletion pass and a parallel
> > > cleanup pass. In the current patch the leader always launches as
> > > many workers as there are indexes supporting either operation, but
> > > that is not efficient in some cases. For example, if we have 3
> > > indexes supporting only parallel bulk-deletion and 2 indexes
> > > supporting only parallel index cleanup, we would launch 5 workers
> > > for each phase, but some of those workers would do nothing at all.
> > > To deal with this problem, I wonder if we can improve the parallel
> > > infrastructure so that the leader process creates a parallel
> > > context sized for the maximum number of indexes but can launch
> > > only a subset of the workers instead of all of them.
> > >
> >
> > Can't we choose the number of workers as the maximum of
> > "num_of_indexes_that_support_bulk_del" and
> > "num_of_indexes_that_support_cleanup"?  If we can do that, then we can
> > always launch the required number of workers for each phase (bulk_del,
> > cleanup).  In your example above, that would choose 3 workers while
> > creating the parallel context.  Do you see any problem with that?
>
> I might be missing something, but if we create the parallel context
> with 3 workers, the leader process always launches 3 workers. So in
> the above case it launches 3 workers even for cleanup, although 2
> workers would be enough.
>

Right, so we can either extend the parallel API to launch fewer workers
than the parallel context was created with, as you suggest, or use a
separate parallel context for each phase.  The former has the benefit
that we don't need to recreate the parallel context, while the latter
has the advantage that we won't keep additional shared memory
allocated.  BTW, what kind of API change do you have in mind for the
approach you are suggesting?
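
For instance, something along these lines might work (a rough sketch
only; the function name ReinitializeParallelWorkers and the
nworkers_to_launch field are placeholders, not settled API):

#include "postgres.h"
#include "access/parallel.h"

/*
 * Sketch: let the caller cap how many of the context's workers are
 * actually launched by the next LaunchParallelWorkers() call.  Assumes
 * a new nworkers_to_launch field in ParallelContext, which
 * LaunchParallelWorkers() would consult instead of pcxt->nworkers.
 */
void
ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
{
	Assert(pcxt->nworkers >= nworkers_to_launch);
	pcxt->nworkers_to_launch = nworkers_to_launch;
}

/*
 * Leader side (sketch): the context is created once with the maximum of
 * the bulk-delete and cleanup worker counts; each phase then launches
 * only as many workers as it needs.
 */
static void
parallel_vacuum_run_phase(ParallelContext *pcxt, int nworkers_for_phase)
{
	ReinitializeParallelDSM(pcxt);	/* reset per-phase shared state */
	ReinitializeParallelWorkers(pcxt, nworkers_for_phase);
	LaunchParallelWorkers(pcxt);
	/* ... workers vacuum their assigned indexes ... */
	WaitForParallelWorkersToFinish(pcxt);
}

With something like that, in your example the context would be created
with max(3, 2) = 3 workers, and the cleanup phase would launch only 2
of them.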


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


