Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers
Date
Msg-id 20201005.120005.1956910020227020729.horikyota.ntt@gmail.com
In response to Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
List pgsql-hackers
At Sat, 3 Oct 2020 22:25:14 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in 
> On Sat, 3 Oct 2020 at 20:03, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Sep 30, 2020 at 9:23 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > >
> > > On Tue, Sep 22, 2020 at 3:20 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > > > It would be good if we were consistent with these parallel options.
> > > > Right now max_parallel_workers_per_gather will restrict the
> > > > parallel_workers reloption.  I'd say this
> > > > max_parallel_workers_per_gather is similar to
> > > > max_parallel_maintenance_workers here and the PARALLEL vacuum option
> > > > is like the parallel_workers reloption.
> > > >
> > > > If we want VACUUM's parallel option to work the same way as that then
> > > > max_parallel_maintenance_workers should restrict whatever is mentioned
> > > > in VACUUM PARALLEL.
> > > >
> > > > Or perhaps this is slightly different as the user is explicitly asking
> > > > for this in the command, but you could likely say the same about ALTER
> > > > TABLE <table> SET (parallel_workers = N); too.
> > >
> > > There is a subtle difference between these two cases. In the case of a
> > > query, there may be multiple table scans involved, all under the same
> > > Gather node. So a limit on the Gather node is to some degree a
> > > separate constraint on the overall query plan from the reloption
> > > applied to a particular table. So there is at least some kind of an
> > > argument that it's sensible to combine those limits somehow. I'm not
> > > sure I believe it, though. The user probably wants exactly the number
> > > of workers they specify, not the GUC value.
> > >
> > > However, in the VACUUM case, there's no possibility of distinguishing
> > > between the parallel operation as a whole and the expectations for a
> > > particular table. It's a single operation.
> > >
> >
> >
> > But the same is true for the 'Create Index' operation as well where we
> > follow the same thing. We will use the number of workers as specified
> > in reloption (parallel_workers) which is then limited by
> > max_parallel_maintenance_workers.
> 
> Both opinions have a valid point.

I think the purpose of the variable is to cap the number of workers
that the system *automatically determines*. It seems reasonable to
ignore the limit as long as the number is explicitly requested by a
superuser, but I don't think a non-superuser should have such a
privilege. On the other hand, I'm not sure whether it's the right
thing to allow superusers to exhaust the reserved capacity, or
whether it's worth that complexity.
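
To make the idea concrete, here is a minimal sketch in Python, just for
illustration. The function name and its parameters are hypothetical and
do not correspond to anything in the tree; it only encodes the rule
described above (cap what the system picks, let a superuser's explicit
request through, cap everyone else).

```python
from typing import Optional


def capped_workers(
    auto_determined: int,        # worker count the system would pick by itself
    requested: Optional[int],    # explicit count from the command, or None
    is_superuser: bool,
    max_parallel_maintenance_workers: int,
) -> int:
    """Hypothetical rule: the GUC caps automatic choices only."""
    if requested is None:
        # Automatically determined: always capped by the GUC.
        return min(auto_determined, max_parallel_maintenance_workers)
    if is_superuser:
        # Explicit request by a superuser: the cap is ignored.
        return requested
    # Explicit request by a non-superuser: still capped.
    return min(requested, max_parallel_maintenance_workers)
```

Note that the superuser branch is exactly where the reserved capacity
could be exhausted, which is the concern raised above.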

> To make the behavior of parallel vacuum more consistent with other
> parallel maintenance commands (i.g., only parallel INDEX CREATE for
> now), as a second idea, can we make use of parallel_workers reloption
> in parallel vacuum case as well? That is, when PARALLEL option without

That variable is thought of as the number of workers for the parallel
scan of CREATE INDEX. Its characteristics are totally different from
those of parallel vacuum. If we had a parallel_maintenance_workers
reloption, it would be usable for vacuum, but I'm not keen on adding
such a reloption.

> an integer is specified or VACUUM command without PARALLEL option, the
> parallel degree is the number of indexes that support parallel vacuum
> and are bigger than min_parallel_index_scan_size. If the
> parallel_workers reloption of the table is set we use it instead. In
> both cases, the parallel degree is capped by
> max_parallel_maintenance_workers. OTOH when PARALLEL option with an
> integer is specified, the parallel degree is the specified integer
> value and it's capped by max_parallel_workers and the number of
> indexes that support parallel vacuum and are bigger than
> min_parallel_index_scan_size.
> 
> That way the default behavior and the behavior of PARALLEL option
> without an integer is similar to parallel CREATE INDEX. In addition to
> it, VACUUM command has an additional way to control the parallel
> degree beyond max_parallel_maintenance_workers limit by using the
> command option.
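
For reference, the rule proposed in the quoted text can be sketched as
follows. This is Python for illustration only, not a patch; the function
and parameter names are made up, and "eligible_indexes" stands for the
number of indexes that support parallel vacuum and are bigger than
min_parallel_index_scan_size.

```python
from typing import Optional


def vacuum_parallel_degree(
    eligible_indexes: int,                      # see the note above
    requested: Optional[int],                   # N from PARALLEL N, or None
    reloption_parallel_workers: Optional[int],  # table reloption, or None
    max_parallel_maintenance_workers: int,
    max_parallel_workers: int,
) -> int:
    """Hypothetical sketch of the proposed parallel degree calculation."""
    if requested is not None:
        # PARALLEL with an integer: capped by max_parallel_workers and by
        # the number of eligible indexes, not by the maintenance GUC.
        return max(0, min(requested, max_parallel_workers, eligible_indexes))
    # PARALLEL without an integer, or no PARALLEL option at all:
    # use the reloption if set, otherwise the number of eligible indexes.
    if reloption_parallel_workers is not None:
        degree = reloption_parallel_workers
    else:
        degree = eligible_indexes
    # In both cases the result is capped by max_parallel_maintenance_workers.
    return max(0, min(degree, max_parallel_maintenance_workers))
```

With 4 eligible indexes, max_parallel_maintenance_workers = 2 and
max_parallel_workers = 8, a plain VACUUM gives 2, while PARALLEL 6 gives
4 under this reading of the proposal.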

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


