Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers
Date
Msg-id CAA4eK1+gD7jAP4wqx8+wNhqpc8cM_7o2WvVBa0OVXLsgoDFHqA@mail.gmail.com
In response to Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers
List pgsql-hackers
On Sun, Sep 20, 2020 at 7:15 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
>
> On 2020-09-19 13:24, Amit Kapila wrote:
> >> I think the implemented behavior is wrong.
> >
> > It is the same as what we do for other parallel operations, for
> > example, we limit the number of parallel workers for parallel create
> > index by 'max_parallel_maintenance_workers' and parallel scan
> > operations are limited by 'max_parallel_workers_per_gather'.
>
> But in those cases we don't provide user-visible options to specify a
> per-command setting, so it's not the same thing, is it?
>

Not exactly, but there too the user has a way to set the value (the
'parallel_workers' storage parameter in CREATE TABLE or ALTER TABLE),
which guides the parallel scans.

> >>   The VACUUM PARALLEL option
> >> should override the max_parallel_maintenance_workers setting.
> >>
> >> Otherwise, what's the point of the command option?
> >
> > It is for the cases where the user has a better idea of workload. We
> > can launch only a limited number of parallel workers
> > 'max_parallel_workers' in the system, so sometimes users would like to
> > use it as per their requirement.
>
> Right, but my point is, it doesn't actually do that correctly.  I can't
> just say, oh, I have a maintenance window, I'd like to run a really fast
> VACUUM.  The PARALLEL option is capped by the setting you'd normally use
> anyway, so specifying it is useless.
>

Yeah, because by default we choose the maximum possible number of
workers for VACUUM.
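
To make the capping concrete, here is a rough sketch of how I understand the
worker count to be chosen. It is Python for brevity, not the actual C code;
the function name and its parameters are made up for illustration, and the
"leader takes one index" detail is my reading of the behavior rather than
something stated in this thread.

```python
def vacuum_worker_count(requested, parallel_indexes,
                        max_parallel_maintenance_workers):
    """Illustrative only: workers a parallel VACUUM would ask for.

    requested: value given to the PARALLEL option, or None if omitted.
    parallel_indexes: number of indexes eligible for parallel processing.
    """
    # Assumption: the leader processes one index itself, so at most
    # (eligible indexes - 1) workers can be useful.
    candidates = max(parallel_indexes - 1, 0)

    # Without PARALLEL, take as many workers as could be useful;
    # with PARALLEL n, never take more than n.
    wanted = candidates if requested is None else min(requested, candidates)

    # Either way, max_parallel_maintenance_workers is the upper bound.
    return min(wanted, max_parallel_maintenance_workers)
```

With max_parallel_maintenance_workers = 2 and five eligible indexes, omitting
PARALLEL and asking for PARALLEL 8 both come out as 2 workers in this sketch;
only a smaller value such as PARALLEL 1 changes the result.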

> The only thing it can do right now is if you want to run a manual VACUUM
> less parallel than by default.  But I don't see how that is often useful.
>

Say the indexes that support parallel scan are not very big; then we
don't need the default behavior, because it would use more resources
without providing much additional benefit.

What, according to you, should the behavior be here, and how would it
be better than the current one?

-- 
With Regards,
Amit Kapila.


