Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [HACKERS] Block level parallel vacuum |
Date | |
Msg-id | CA+fd4k6zkR8sOWf8u33jhZP_dQW9fgnUsGdU4BZEPZW4=7=r6w@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Block level parallel vacuum (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: [HACKERS] Block level parallel vacuum
|
List | pgsql-hackers |
On Mon, 13 Jan 2020 at 12:50, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sat, Jan 11, 2020 at 7:48 PM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > On Sat, 11 Jan 2020 at 13:18, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Sat, Jan 11, 2020 at 9:23 AM Masahiko Sawada > > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > > > On Fri, 10 Jan 2020 at 20:54, Mahendra Singh Thalor <mahi6run@gmail.com> wrote: > > > > > > > > > > On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk@zsrv.org> wrote: > > > > > > > > > > > > Hi > > > > > > Thank you for update! I looked again > > > > > > > > > > > > (vacuum_indexes_leader) > > > > > > + /* Skip the indexes that can be processed by parallel workers */ > > > > > > + if (!skip_index) > > > > > > + continue; > > > > > > > > > > > > Does the variable name skip_index not confuse here? Maybe rename to something like can_parallel? > > > > > > > > > > I also agree with your point. > > > > > > > > I don't think the change is a good idea. > > > > > > > > - bool skip_index = (get_indstats(lps->lvshared, i) == NULL || > > > > - skip_parallel_vacuum_index(Irel[i], lps->lvshared)); > > > > + bool can_parallel = (get_indstats(lps->lvshared, i) == NULL || > > > > + skip_parallel_vacuum_index(Irel[i], > > > > + lps->lvshared)); > > > > > > > > The above condition is true when the index can *not* do parallel index vacuum. How about changing it to skipped_indexand change the comment to something like “We are interested in only index skipped parallel vacuum”? > > > > > > > > > > Hmm, I find the current code and comment better than what you or > > > Sergei are proposing. I am not sure what is the point of confusion in > > > the current code? > > > > Yeah the current code is also good. I just thought they were concerned > > that the variable name skip_index might be confusing because we skip > > if skip_index is NOT true. > > > > Okay, would it better if we get rid of this variable and have code like below? > > /* Skip the indexes that can be processed by parallel workers */ > if ( !(get_indstats(lps->lvshared, i) == NULL || > skip_parallel_vacuum_index(Irel[i], lps->lvshared))) > continue; Make sense to me. > ... > > > > > > > > > > > > > > > > > > > > > Another question about behavior on temporary tables. Use case: the user commands just "vacuum;" to vacuum entiredatabase (and has enough maintenance workers). Vacuum starts fine in parallel, but on first temporary table we hit: > > > > > > > > > > > > + if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0) > > > > > > + { > > > > > > + ereport(WARNING, > > > > > > + (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporarytables in parallel", > > > > > > + RelationGetRelationName(onerel)))); > > > > > > + params->nworkers = -1; > > > > > > + } > > > > > > > > > > > > And therefore we turn off the parallel vacuum for the remaining tables... Can we improve this case? > > > > > > > > > > Good point. > > > > > Yes, we should improve this. I tried to fix this. > > > > > > > > +1 > > > > > > > > > > Yeah, we can improve the situation here. I think we don't need to > > > change the value of params->nworkers at first place if allow > > > lazy_scan_heap to take care of this. Also, I think we shouldn't > > > display warning unless the user has explicitly asked for parallel > > > option. See the fix in the attached patch. > > > > Agreed. But with the updated patch the PARALLEL option without the > > parallel degree doesn't display warning because params->nworkers = 0 > > in that case. So how about restoring params->nworkers at the end of > > vacuum_rel()? > > > > I had also thought on those lines, but I was not entirely sure about > this resetting of workers. Today, again thinking about it, it seems > the idea Mahendra is suggesting that is giving an error if the > parallel degree is not specified seems reasonable to me. This means > Vacuum (parallel), Vacuum (parallel) <tbl_name>, etc. will give an > error "parallel degree must be specified". This idea has merit as now > we are supporting a parallel vacuum by default, so a 'parallel' option > without a parallel degree doesn't have any meaning. If we do that, > then we don't need to do anything additional about the handling of > temp tables (other than what patch is already doing) as well. What do > you think? > Good point! Agreed. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: