Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CAA4eK1KbwHHj5bv1Ua=qL6komVjvXQ21CVo7OZSroRHp3FTs0A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Block level parallel vacuum  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: [HACKERS] Block level parallel vacuum  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Fri, Jan 17, 2020 at 11:00 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Jan 17, 2020 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jan 17, 2020 at 9:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > I have few small comments.
> > >
> > > 1.
> > > logical streaming for large in-progress transactions+
> > > + /* Can't perform vacuum in parallel */
> > > + if (parallel_workers <= 0)
> > > + {
> > > + pfree(can_parallel_vacuum);
> > > + return lps;
> > > + }
> > >
> > > why are we checking parallel_workers <= 0, Function
> > > compute_parallel_vacuum_workers only returns 0 or greater than 0
> > > so isn't it better to just check if (parallel_workers == 0) ?
> > >
> >
> > Why to have such an assumption about
> > compute_parallel_vacuum_workers()?  The function
> > compute_parallel_vacuum_workers() returns int, so such a check
> > (<= 0) seems reasonable to me.
>
> Okay so I should probably change my statement to why
> compute_parallel_vacuum_workers is returning "int" instead of uint?
>

Hmm, I think the number of workers at most places is int, so it is
better to return int here which will keep it consistent with how we do
at other places.  See, the similar usage in compute_parallel_worker.

  I
> mean when this function is designed to return 0 or more worker why to
> make it return int and then handle extra values on caller.  Am I
> missing something, can it really return negative in some cases?
>
> I find the below code in "compute_parallel_vacuum_workers" a bit confusing
>
> +static int
> +compute_parallel_vacuum_workers(Relation *Irel, int nindexes, int nrequested,
> + bool *can_parallel_vacuum)
> +{
> ......
> + /* The leader process takes one index */
> + nindexes_parallel--;        --> nindexes_parallel can become -1
> +
> + /* No index supports parallel vacuum */
> + if (nindexes_parallel == 0) .  -> Now if it is 0 then return 0 but
> if its -1 then continue. seems strange no?  I think here itself we can
> handle if (nindexes_parallel <= 0), that will make code cleaner.
> + return 0;
> +

I think this got recently introduce by one of my changes based on the
comment by Mahendra, we can adjust this check.

> > >
> > > I don't like the idea of first initializing the
> > > VacuumSharedCostBalance with lps->lvshared->cost_balance and then
> > > uninitialize if nworkers_launched is 0.
> > > I am not sure why do we need to initialize VacuumSharedCostBalance
> > > here? just to perform pg_atomic_write_u32(VacuumSharedCostBalance,
> > > VacuumCostBalance);?
> > > I think we can initialize it only if nworkers_launched > 0 then we can
> > > get rid of the else branch completely.
> > >
> >
> > No, we can't initialize after nworkers_launched > 0 because by that
> > time some workers would have already tried to access the shared cost
> > balance.  So, it needs to be done before launching the workers as is
> > done in code.  We can probably add a comment.
> I don't think so, VacuumSharedCostBalance is a process local which is
> just pointing to the shared memory variable right?
>
> and each process has to point it to the shared memory and that we are
> already doing in parallel_vacuum_main.  So we can initialize it after
> worker is launched.
> Basically code will look like below
>
> pg_atomic_write_u32(&(lps->lvshared->cost_balance), VacuumCostBalance);
> pg_atomic_write_u32(&(lps->lvshared->active_nworkers), 0);
>

oh, I thought you were telling to initialize the shared memory itself
after launching the workers.  However, you are asking to change the
usage of the local variable, I think we can do that.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Unnecessary delay in streaming replication due to replay lag
Next
From: Dilip Kumar
Date:
Subject: Re: [HACKERS] Block level parallel vacuum