Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CAA4eK1LotP=oge4GjSjZtxC0EWknOAkuSDmpn_3pw=KNUDn4mA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses Re: [HACKERS] Block level parallel vacuum  (Dilip Kumar <dilipbalaut@gmail.com>)
Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
List pgsql-hackers
On Sat, Jan 11, 2020 at 7:48 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Sat, 11 Jan 2020 at 13:18, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Jan 11, 2020 at 9:23 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
> > > >
> > > > On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote:
> > > > >
> > > > > Hi
> > > > > Thank you for update! I looked again
> > > > >
> > > > > (vacuum_indexes_leader)
> > > > > +               /* Skip the indexes that can be processed by parallel workers */
> > > > > +               if (!skip_index)
> > > > > +                       continue;
> > > > >
> > > > > Does the variable name skip_index not confuse here? Maybe rename to something like can_parallel?
> > > >
> > > > I also agree with your point.
> > >
> > > I don't think the change is a good idea.
> > >
> > > -               bool            skip_index = (get_indstats(lps->lvshared, i) == NULL ||
> > > -                                                                 skip_parallel_vacuum_index(Irel[i],
lps->lvshared));
> > > +               bool            can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
> > > +                                                                       skip_parallel_vacuum_index(Irel[i],
> > > +
        lps->lvshared)); 
> > >
> > > The above condition is true when the index can *not* do parallel index vacuum. How about changing it to
skipped_indexand change the comment to something like “We are interested in only index skipped parallel vacuum”? 
> > >
> >
> > Hmm, I find the current code and comment better than what you or
> > Sergei are proposing.  I am not sure what is the point of confusion in
> > the current code?
>
> Yeah the current code is also good. I just thought they were concerned
> that the variable name skip_index might be confusing because we skip
> if skip_index is NOT true.
>

Okay, would it better if we get rid of this variable and have code like below?

/* Skip the indexes that can be processed by parallel workers */
if ( !(get_indstats(lps->lvshared, i) == NULL ||
skip_parallel_vacuum_index(Irel[i], lps->lvshared)))
    continue;
...

> >
> > > >
> > > > >
> > > > > Another question about behavior on temporary tables. Use case: the user commands just "vacuum;" to vacuum
entiredatabase (and has enough maintenance workers). Vacuum starts fine in parallel, but on first temporary table we
hit:
> > > > >
> > > > > +       if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
> > > > > +       {
> > > > > +               ereport(WARNING,
> > > > > +                               (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum
temporarytables in parallel", 
> > > > > +                                               RelationGetRelationName(onerel))));
> > > > > +               params->nworkers = -1;
> > > > > +       }
> > > > >
> > > > > And therefore we turn off the parallel vacuum for the remaining tables... Can we improve this case?
> > > >
> > > > Good point.
> > > > Yes, we should improve this. I tried to fix this.
> > >
> > > +1
> > >
> >
> > Yeah, we can improve the situation here.  I think we don't need to
> > change the value of params->nworkers at first place if allow
> > lazy_scan_heap to take care of this.  Also, I think we shouldn't
> > display warning unless the user has explicitly asked for parallel
> > option.  See the fix in the attached patch.
>
> Agreed. But with the updated patch the PARALLEL option without the
> parallel degree doesn't display warning because params->nworkers = 0
> in that case. So how about restoring params->nworkers at the end of
> vacuum_rel()?
>

I had also thought on those lines, but I was not entirely sure about
this resetting of workers.  Today, again thinking about it, it seems
the idea Mahendra is suggesting that is giving an error if the
parallel degree is not specified seems reasonable to me.  This means
Vacuum (parallel), Vacuum (parallel) <tbl_name>, etc. will give an
error "parallel degree must be specified".  This idea has merit as now
we are supporting a parallel vacuum by default, so a 'parallel' option
without a parallel degree doesn't have any meaning.  If we do that,
then we don't need to do anything additional about the handling of
temp tables (other than what patch is already doing) as well.  What do
you think?



--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Questions/Observations related to Gist vacuum
Next
From: Dilip Kumar
Date:
Subject: Re: [HACKERS] Block level parallel vacuum