Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CAD21AoBnMWohYAV0AK7aF+Z_yGqMH2bpKDbnrPDAU8tYxPuQgQ@mail.gmail.com
In response to Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: [HACKERS] Block level parallel vacuum
List pgsql-hackers
On Mon, Mar 4, 2019 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Mar 2, 2019 at 3:54 AM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Fri, Mar 1, 2019 at 12:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > I wonder if we really want this behavior.  Should a setting that
> > > > controls the degree of parallelism when scanning the table also affect
> > > > VACUUM?  I tend to think that we probably don't ever want VACUUM of a
> > > > table to be parallel by default, but rather something that the user
> > > > must explicitly request.  Happy to hear other opinions.  If we do want
> > > > this behavior, I think this should be written differently, something
> > > > like this: The PARALLEL N option to VACUUM takes precedence over this
> > > > option.
> > >
> > > For example, I can imagine a use case where a batch job runs parallel
> > > vacuum on some tables in a maintenance window. The batch operation
> > > would need to compute and specify the degree of parallelism every time
> > > according to, for instance, the number of indexes, which would be
> > > troublesome. But if we can set the degree of parallelism for each
> > > table, it can just do 'VACUUM (PARALLEL)'.
> >
> > True, but the setting in question would also affect the behavior of
> > sequential scans and index scans.  TBH, I'm not sure that the
> > parallel_workers reloption is really a great design as it is: is
> > hard-coding the number of workers really what people want?  Do they
> > really want the same degree of parallelism for sequential scans and
> > index scans?  Why should they want the same degree of parallelism also
> > for VACUUM?  Maybe they do, and maybe somebody can explain why they do,
> > but as of now, it's not obvious to me why that should be true.
>
> I think that there are users who want to specify the degree of
> parallelism. I think that hard-coding the number of workers would be
> a good design for something like VACUUM, which is a simple operation on
> a single object; since there are no joins or aggregations, it'd be
> relatively easy to compute it. That's why the patch introduces the
> PARALLEL N option as well. I think that a reloption for parallel
> vacuum would be just a way to save the degree of parallelism. And I
> agree that users don't want to use the same degree of parallelism for
> VACUUM as for scans, so maybe it'd be better to add a new reloption like
> parallel_vacuum_workers. On the other hand, that can be a separate
> patch, so I can remove the reloption part from this patch and will
> propose it when there are requests.
>

Okay, attached is the latest version of the patch set. I've incorporated
all the comments I got and separated out the patch that makes vacuum
options a Node (the 0001 patch). The patch doesn't use parallel_workers.
The reloption might be proposed again in another form in the future if
requested.
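
To make the batch-job case discussed above concrete, here is a minimal sketch in C of how a caller might pick a worker count per table when no reloption is available. It is not taken from the patch: the function name, its parameters, and the rule of capping workers at the number of indexes are all assumptions for illustration only.

```c
/*
 * Hypothetical helper (not from the patch): choose a worker count for
 * vacuuming one table.
 *
 * requested   - N from an explicit PARALLEL N, or 0 if none was given
 * nindexes    - number of indexes on the table
 * max_workers - upper bound imposed by the batch job or the server
 */
static int
choose_vacuum_workers(int requested, int nindexes, int max_workers)
{
    int workers;

    /* Assumption: with fewer than two indexes, run without workers. */
    if (nindexes < 2 || max_workers <= 0)
        return 0;

    /* An explicit request wins; otherwise default to one per index. */
    workers = (requested > 0) ? requested : nindexes;

    /* Assumption: more workers than indexes would be of no use. */
    if (workers > nindexes)
        workers = nindexes;

    /* Never exceed the caller-supplied limit. */
    if (workers > max_workers)
        workers = max_workers;

    return workers;
}
```

A batch job would have to redo this calculation for every table and pass the result as PARALLEL N, which is the inconvenience a per-table setting was meant to avoid.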


Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment
