Re: Parallel heap vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Parallel heap vacuum
Date
Msg-id CAD21AoCsyk61yqD0KMm05S2EbdEh1JL3Mcsx8NkWiiJM0_uBVw@mail.gmail.com
Whole thread Raw
In response to Re: Parallel heap vacuum  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Sun, Apr 6, 2025 at 10:27 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Sawada-san.
>
> I was revisiting this thread after a long time. I found most of my
> previous review comments from v11-0001 were not yet addressed. I can't
> tell if they are deliberately left out, or if they are accidentally
> overlooked. Please see the details below.

Sorry I missed to address or reply these comments.

>
> On Mon, Mar 10, 2025 at 3:05 PM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > ======
> > src/backend/access/table/tableamapi.c
> >
> > 2.
> >   Assert(routine->relation_vacuum != NULL);
> > + Assert(routine->parallel_vacuum_compute_workers != NULL);
> >   Assert(routine->scan_analyze_next_block != NULL);
> >
> > Is it better to keep these Asserts in the same order that the
> > TableAmRoutine fields are assigned (in heapam_handler.c)?
> >
>
> Not yet addressed?

Will fix.

>
> > ~~~
> >
> > 3.
> > + /*
> > + * Callbacks for parallel vacuum are also optional (except for
> > + * parallel_vacuum_compute_workers). But one callback implies presence of
> > + * the others.
> > + */
> > + Assert(((((routine->parallel_vacuum_estimate == NULL) ==
> > +   (routine->parallel_vacuum_initialize == NULL)) ==
> > + (routine->parallel_vacuum_initialize_worker == NULL)) ==
> > + (routine->parallel_vacuum_collect_dead_items == NULL)));
> >
> > /also optional/optional/
> >
>
> Not yet addressed?

Will fix.

>
> > ======
> > src/include/access/heapam.h
> >
> > +extern int heap_parallel_vacuum_compute_workers(Relation rel, int
> > nworkers_requested);
> >
> > 4.
> > wrong tab/space after 'int'.
>
> Not yet addressed?

Will fix.

>
> >
> > ======
> > src/include/access/tableam.h
> >
> > 5.
> > + /*
> > + * Compute the number of parallel workers for parallel table vacuum. The
> > + * parallel degree for parallel vacuum is further limited by
> > + * max_parallel_maintenance_workers. The function must return 0 to disable
> > + * parallel table vacuum.
> > + *
> > + * 'nworkers_requested' is a >=0 number and the requested number of
> > + * workers. This comes from the PARALLEL option. 0 means to choose the
> > + * parallel degree based on the table AM specific factors such as table
> > + * size.
> > + */
> > + int (*parallel_vacuum_compute_workers) (Relation rel,
> > + int nworkers_requested);
> >
> > The comment here says "This comes from the PARALLEL option." and "0
> > means to choose the parallel degree...". But, the PG docs [1] says "To
> > disable this feature, one can use PARALLEL option and specify parallel
> > workers as zero.".
> >
> > These two descriptions "disable this feature" (PG docs) and letting
> > the system "choose the parallel degree" (code comment) don't sound the
> > same. Should this 0001 patch update the PG documentation for the
> > effect of setting PARALLEL value zero?
>
> Not yet addressed?

It often happens that we treat a value differently when the user
inputs the value and when we use it internally. I think the comment
should follow how to use VacuumParams.nworkers internally, which is
described in the comment as follow:

    /*
     * The number of parallel vacuum workers.  0 by default which means choose
     * based on the number of indexes.  -1 indicates parallel vacuum is
     * disabled.
     */
    int         nworkers;

So it seems no problem to me.

>
> >
> > ~~~
> >
> > 6.
> > + /*
> > + * Initialize DSM space for parallel table vacuum.
> > + *
> > + * Not called if parallel table vacuum is disabled.
> > + *
> > + * Optional callback, but either all other parallel vacuum callbacks need
> > + * to exist, or neither.
> > + */
> >
> > "or neither"?
> >
> > Also, saying "all other" seems incorrect because
> > parallel_vacuum_compute_workers callback must exist event if
> > parallel_vacuum_initialize does not exist.
> >
> > IMO you meant to say "all optional", and "or none".
> >
> > SUGGESTION:
> > Optional callback. Either all optional parallel vacuum callbacks need
> > to exist, or none.
> >
> > (this same issue is repeated in multiple places).
>
> Not yet addressed?

Will fix.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Accounting for metapages in genericcostestimate()
Next
From: Masahiko Sawada
Date:
Subject: Re: Parallel heap vacuum