Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers |
Date | |
Msg-id | 20201005.120005.1956910020227020729.horikyota.ntt@gmail.com |
In response to | Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>) |
List | pgsql-hackers |
At Sat, 3 Oct 2020 22:25:14 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in
> On Sat, 3 Oct 2020 at 20:03, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Sep 30, 2020 at 9:23 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > >
> > > On Tue, Sep 22, 2020 at 3:20 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > > > It would be good if we were consistent with these parallel options.
> > > > Right now max_parallel_workers_per_gather will restrict the
> > > > parallel_workers reloption. I'd say this
> > > > max_parallel_workers_per_gather is similar to
> > > > max_parallel_maintenance_workers here and the PARALLEL vacuum option
> > > > is like the parallel_workers reloption.
> > > >
> > > > If we want VACUUM's parallel option to work the same way as that then
> > > > max_parallel_maintenance_workers should restrict whatever is mentioned
> > > > in VACUUM PARALLEL.
> > > >
> > > > Or perhaps this is slightly different as the user is explicitly asking
> > > > for this in the command, but you could likely say the same about ALTER
> > > > TABLE <table> SET (parallel_workers = N); too.
> > >
> > > There is a subtle difference between these two cases. In the case of a
> > > query, there may be multiple table scans involved, all under the same
> > > Gather node. So a limit on the Gather node is to some degree a
> > > separate constraint on the overall query plan from the reloption
> > > applied to a particular table. So there is at least some kind of an
> > > argument that it's sensible to combine those limits somehow. I'm not
> > > sure I believe it, though. The user probably wants exactly the number
> > > of workers they specify, not the GUC value.
> > >
> > > However, in the VACUUM case, there's no possibility of distinguishing
> > > between the parallel operation as a whole and the expectations for a
> > > particular table. It's a single operation.
> > >
> >
> > But the same is true for the 'Create Index' operation as well where we
> > follow the same thing. We will use the number of workers as specified
> > in reloption (parallel_workers) which is then limited by
> > max_parallel_maintenance_workers.
>
> Both opinions have a valid point.

I think the purpose of the variable is to cap the number of workers
that the system *automatically determines*. It seems reasonable to
ignore the limit as long as the number is explicitly commanded by a
superuser. But I don't think a non-superuser should have such a
privilege. On the other hand, I'm not sure whether it's the right
thing to allow superusers to exhaust the reserved capacity, or whether
it's worth that complexity.

> To make the behavior of parallel vacuum more consistent with other
> parallel maintenance commands (i.g., only parallel INDEX CREATE for
> now), as a second idea, can we make use of parallel_workers reloption
> in parallel vacuum case as well? That is, when PARALLEL option without

That variable is thought of as the number of workers for the parallel
scan of CREATE INDEX. Its characteristics are totally different from
those of parallel vacuum. If we had parallel_maintenance_workers, it
would be usable for vacuum, but I'd rather not add yet another
reloption like that.

> an integer is specified or VACUUM command without PARALLEL option, the
> parallel degree is the number of indexes that support parallel vacuum
> and are bigger than min_parallel_index_scan_size. If the
> parallel_workers reloption of the table is set we use it instead. In
> both cases, the parallel degree is capped by
> max_parallel_maintenance_workers. OTOH when PARALLEL option with an
> integer is specified, the parallel degree is the specified integer
> value and it's capped by max_parallel_workers and the number of
> indexes that support parallel vacuum and are bigger than
> min_parallel_index_scan_size.
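
For reference, the degree computation proposed in the quoted paragraph can be sketched roughly as below. This is only an illustration of the rule as worded in the proposal, not code from PostgreSQL: the function name and parameter names are hypothetical, and "eligible indexes" stands for the indexes that support parallel vacuum and are bigger than min_parallel_index_scan_size. The proposal does not say whether a parallel_workers reloption would also be capped by the number of eligible indexes, so the sketch assumes it would not.

```python
from typing import Optional


def proposed_parallel_degree(
    eligible_indexes: int,
    requested: Optional[int],
    reloption_parallel_workers: Optional[int],
    max_parallel_maintenance_workers: int,
    max_parallel_workers: int,
) -> int:
    """Hypothetical sketch of the proposed VACUUM parallel degree rule."""
    if requested is not None:
        # VACUUM (PARALLEL n): use the explicit value, capped by
        # max_parallel_workers and by the number of eligible indexes.
        return max(0, min(requested, max_parallel_workers, eligible_indexes))

    # PARALLEL without an integer, or no PARALLEL option at all:
    # start from the number of eligible indexes, unless the table's
    # parallel_workers reloption is set, in which case use that instead.
    if reloption_parallel_workers is not None:
        degree = reloption_parallel_workers  # assumption: no index-count cap
    else:
        degree = eligible_indexes

    # In both of these cases the GUC caps the result.
    return max(0, min(degree, max_parallel_maintenance_workers))
```

Under this reading, only the explicit integer form can exceed max_parallel_maintenance_workers, which is the point under discussion.
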
>
> That way the default behavior and the behavior of PARALLEL option
> without an integer is similar to parallel CREATE INDEX. In addition to
> it, VACUUM command has an additional way to control the parallel
> degree beyond max_parallel_maintenance_workers limit by using the
> command option.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center