
From Masahiko Sawada
Subject Re: POC: Parallel processing of indexes in autovacuum
Msg-id CAD21AoD6HhraqhOgkQJOrr0ixZkAZuqJRpzGv-B+_-ad6d5aPw@mail.gmail.com
In response to Re: POC: Parallel processing of indexes in autovacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Thu, Aug 7, 2025 at 4:38 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jul 21, 2025 at 11:45 PM Daniil Davydov <3danissimo@gmail.com> wrote:
> >
> > Hi,
> >
> > On Mon, Jul 21, 2025 at 11:40 PM Sami Imseih <samimseih@gmail.com> wrote:
> > >
> > > I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and
> > > have a few comments from my initial pass.
> > >
> > > 1/ Please run pgindent.
> >
> > OK, I'll do it.
> >
> > > 2/ Documentation is missing. There may be more, but here are the places I
> > > found that likely need updates for the new behavior, reloptions, GUC, etc.
> > > Including docs in the patch early would help clarify expected behavior.
> > >
> > > https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS
> > > https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
> > > https://www.postgresql.org/docs/current/runtime-config-autovacuum.html
> > > https://www.postgresql.org/docs/current/sql-createtable.html
> > > https://www.postgresql.org/docs/current/sql-altertable.html
> > > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES
> > > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS
> > >
> >
> > Thanks for gathering it all together. I'll update the documentation so
> > that it reflects the changes to the autovacuum daemon, reloptions, and
> > GUC parameters. So far, I don't see what we could add to the
> > vacuum-basics and alter-table sections.
> >
> > I'll create a separate .patch file for the documentation changes.
> >
> > > One thing I am unclear on is the interaction between max_worker_processes,
> > > max_parallel_workers, and max_parallel_maintenance_workers. For example, does
> > > the following change mean that manual VACUUM PARALLEL is no longer capped by
> > > max_parallel_maintenance_workers?
> > >
> > > @@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
> > >         parallel_workers = (nrequested > 0) ?
> > >                 Min(nrequested, nindexes_parallel) : nindexes_parallel;
> > >
> > > -       /* Cap by max_parallel_maintenance_workers */
> > > -       parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
> > > +       /* Cap by GUC variable */
> > > +       parallel_workers = Min(parallel_workers, max_parallel_workers);
> > >
> >
> > Oh, that is my poor choice of name for a local variable (I'll rename it).
> > This variable takes different values depending on the operation being
> > performed: autovacuum_max_parallel_workers for parallel autovacuum and
> > max_parallel_maintenance_workers for a manual VACUUM.
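> >
> > For example, the cap selection could look roughly like this (just an
> > illustrative sketch, not the final patch code):
> >
> >     /* Pick the cap that applies to the current kind of operation. */
> >     int     max_workers_cap = AmAutoVacuumWorkerProcess() ?
> >         autovacuum_max_parallel_workers :
> >         max_parallel_maintenance_workers;
> >
> >     parallel_workers = Min(parallel_workers, max_workers_cap);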
> >
> > >
> > > 3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool" ?
> > >
> > > +                       "autovacuum_parallel_workers",
> > > +                       "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
> > >
> >
> > I don't think that we should refer to max_parallel_workers here.
> > Actually, this reloption doesn't depend on max_parallel_workers at all.
> > I wrote about the bgworkers pool (both here and in the description of
> > the autovacuum_max_parallel_workers parameter) in order to clarify that
> > parallel autovacuum will use dynamic background workers instead of
> > launching more a/v workers.
> >
> > BTW, I don't really like that the comment on this option turns out to
> > be very long. I'll keep only a short description in reloptions.c and
> > move the clarification about the zero value into rel.h. Mentions of the
> > bgworkers pool will remain only in the description of
> > autovacuum_max_parallel_workers.
> >
> > > 4/ The comment "When parallel autovacuum worker die" suggests an abnormal
> > > exit. "Terminates" seems clearer, since this applies to both normal and
> > > abnormal exits.
> > >
> > > instead of:
> > > + * When parallel autovacuum worker die,
> > >
> > > how about this:
> > > * When parallel autovacuum worker terminates,
> > >
> >
> > Sounds reasonable, I'll fix it.
> >
> > >
> > > 5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before
> > > DestroyParallelContext?
> > >
> > > +       nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
> > >         DestroyParallelContext(pvs->pcxt);
> > > +
> > > +       /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> > > +       if (AmAutoVacuumWorkerProcess())
> > > +               AutoVacuumReleaseParallelWorkers(nlaunched_workers);
> > >
> >
> > I wrote about this above [1], but I'll duplicate my thoughts here:
> > """
> > Destroying the parallel context includes waiting for all workers to exit
> > (after which, other operations can use them).
> > If we called ParallelAutoVacuumReleaseWorkers first, some other operation
> > could reasonably request all of the released workers. But that request
> > could fail, because there is no guarantee that the workers have managed
> > to finish yet.
> >
> > Actually, there's nothing wrong with that, but I think releasing workers
> > only after finishing work is a more logical approach.
> > """
> >
> > >
> > > 6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside
> > > AutoVacuumReleaseParallelWorkers()?
> > >
> > > if (!AmAutoVacuumWorkerProcess())
> > >         return;
> > >
> >
> > It seems to me that the opposite is true. If the check is hidden and
> > there is no alternative to calling AutoVacuumReleaseParallelWorkers
> > unconditionally, it might confuse somebody. All doubts would disappear
> > after viewing the AutoVacuumReleaseParallelWorkers code, but IMO the
> > code in vacuumparallel.c would become less intuitive.
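> >
> > That is, moving the check inside would make every call site look
> > unconditional (a sketch of the alternative):
> >
> >     void
> >     AutoVacuumReleaseParallelWorkers(int nworkers)
> >     {
> >         /* Silently do nothing when not in an autovacuum worker. */
> >         if (!AmAutoVacuumWorkerProcess())
> >             return;
> >
> >         /* ... give the nworkers reservations back to the pool ... */
> >     }
> >
> > whereas the explicit test in vacuumparallel.c shows the reader exactly
> > why the release applies only to autovacuum.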
> >
> > > 7/ It looks like the psql tab completion for autovacuum_parallel_workers is
> > > missing:
> > >
> > > test=# alter table t set (autovacuum_
> > > autovacuum_analyze_scale_factor
> > > autovacuum_analyze_threshold
> > > autovacuum_enabled
> > > autovacuum_freeze_max_age
> > > autovacuum_freeze_min_age
> > > autovacuum_freeze_table_age
> > > autovacuum_multixact_freeze_max_age
> > > autovacuum_multixact_freeze_min_age
> > > autovacuum_multixact_freeze_table_age
> > > autovacuum_vacuum_cost_delay
> > > autovacuum_vacuum_cost_limit
> > > autovacuum_vacuum_insert_scale_factor
> > > autovacuum_vacuum_insert_threshold
> > > autovacuum_vacuum_max_threshold
> > > autovacuum_vacuum_scale_factor
> > > autovacuum_vacuum_threshold
> > >
> >
> > Good catch, I'll fix it.
> >
> > Thank you for the review! Please see the v9 patches:
> > 1) Ran pgindent + rebased the patches onto the newest commit on master.
> > 2) Introduced documentation changes.
> > 3) Renamed the local variable in parallel_vacuum_compute_workers.
> > 4) Shortened the description of autovacuum_parallel_workers in
> >     reloptions.c (moved the clarifications for it into rel.h).
> > 5) Reworded the "When parallel autovacuum worker die" comment.
> > 6) Added tab completion for the autovacuum_parallel_workers table option.
>
> Thank you for updating the patch. Here are some review comments.
>
> +        /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> +        if (AmAutoVacuumWorkerProcess())
> +                AutoVacuumReleaseParallelWorkers(nlaunched_workers);
> +
>
> We release the reserved workers in parallel_vacuum_end(). However,
> parallel_vacuum_end() is called only once, at the end of vacuum. I
> think we need to release the reserved workers after each index
> vacuuming or cleanup pass; otherwise we would end up holding the
> reserved workers until the end of vacuum even if we invoke index
> vacuuming multiple times.
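>
> As a sketch of what I mean (assuming the release can happen as soon as
> the workers of one pass have finished):
>
>     /* At the end of parallel_vacuum_process_all_indexes(): */
>     WaitForParallelWorkersToFinish(pvs->pcxt);
>     if (AmAutoVacuumWorkerProcess())
>         AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
>
> The workers would then be reserved again before the next pass.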
>
> ---
> +void
> +assign_autovacuum_max_parallel_workers(int newval, void *extra)
> +{
> +        autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
> +}
>
> I don't think we need the assign hook for this GUC parameter. We can
> internally cap the effective value at max_worker_processes, as we do
> for other GUC parameters such as max_parallel_maintenance_workers and
> max_parallel_workers.
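>
> That is, instead of the assign hook, the cap can be applied where the
> value is used, e.g. (a sketch):
>
>     /* Cap by max_worker_processes at the point of use. */
>     parallel_workers = Min(parallel_workers,
>                            Min(autovacuum_max_parallel_workers,
>                                max_worker_processes));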
>
> ---
> +        /* Refresh autovacuum_max_parallel_workers parameter */
> +        CHECK_FOR_INTERRUPTS();
> +        if (ConfigReloadPending)
> +        {
> +                ConfigReloadPending = false;
> +                ProcessConfigFile(PGC_SIGHUP);
> +        }
> +
> +        LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
> +
> +        /*
> +         * If autovacuum_max_parallel_workers parameter was reduced during
> +         * parallel autovacuum execution, we must cap the available
> +         * workers number by its new value.
> +         */
> +        AutoVacuumShmem->av_freeParallelWorkers =
> +                Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
> +                        autovacuum_max_parallel_workers);
> +
> +        LWLockRelease(AutovacuumLock);
>
> I think another race condition could occur; suppose
> autovacuum_max_parallel_workers is set to '5' and one autovacuum
> worker reserved 5 workers, meaning that
> AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes
> autovacuum_max_parallel_workers to 3 and reloads the conf file right
> after the autovacuum worker checks for interrupts. The launcher
> process calls adjust_free_parallel_workers(), but
> av_freeParallelWorkers remains 0, and the autovacuum worker then
> increments it by 5, since its autovacuum_max_parallel_workers value is
> still 5.
>
> I think we can keep the autovacuum_max_parallel_workers value in
> shmem, and only the launcher process can modify that value when the
> GUC is changed. Autovacuum workers simply increase or decrease
> av_freeParallelWorkers within the range of 0 to the
> autovacuum_max_parallel_workers value in shmem. When changing the
> autovacuum_max_parallel_workers and av_freeParallelWorkers values in
> shmem, the launcher process calculates the number of workers reserved
> at that time and derives the new av_freeParallelWorkers value by
> subtracting the number of reserved workers from the new
> autovacuum_max_parallel_workers.
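>
> In code form, the launcher-side adjustment could look roughly like
> this (just a sketch; av_maxParallelWorkers is a hypothetical new shmem
> field holding the launcher's copy of the GUC):
>
>     int     nreserved;
>
>     LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
>
>     /* Workers currently reserved under the old limit. */
>     nreserved = AutoVacuumShmem->av_maxParallelWorkers -
>                 AutoVacuumShmem->av_freeParallelWorkers;
>
>     /* Only the launcher updates the shmem copy of the GUC. */
>     AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
>     AutoVacuumShmem->av_freeParallelWorkers =
>         Max(0, AutoVacuumShmem->av_maxParallelWorkers - nreserved);
>
>     LWLockRelease(AutovacuumLock);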
>
> ---
> +AutoVacuumReserveParallelWorkers(int nworkers)
> +{
> +   int         can_launch;
>
> How about renaming it to 'nreserved' or something similar? can_launch
> looks like a boolean variable indicating whether the process can
> launch workers.

While testing the patch, I found two other problems:

1. When an autovacuum worker that has reserved workers fails with an
error, the reserved workers are not released. I think we need to
ensure that all reserved workers are reliably released at the end of
vacuum, even on error.
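
One way to guarantee that would be an on-error cleanup hook around the
vacuum work, e.g. (a sketch; release_reserved_av_workers is a
hypothetical callback that gives back the 'nreserved' reservations):

    PG_ENSURE_ERROR_CLEANUP(release_reserved_av_workers,
                            Int32GetDatum(nreserved));
    {
        /* ... do the (parallel) vacuum work ... */
    }
    PG_END_ENSURE_ERROR_CLEANUP(release_reserved_av_workers,
                                Int32GetDatum(nreserved));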

2. When an autovacuum worker (not a parallel vacuum worker) that uses
parallel vacuum gets SIGHUP, it errors out with the message "parameter
"max_stack_depth" cannot be set during a parallel operation".
Autovacuum checks for a configuration file reload in
vacuum_delay_point(), and while reloading the configuration file, it
attempts to set max_stack_depth in
InitializeGUCOptionsFromEnvironment() (which is called by
ProcessConfigFileInternal()). However, it cannot change
max_stack_depth since the worker is in parallel mode and
max_stack_depth doesn't have the GUC_ALLOW_IN_PARALLEL flag. This
doesn't happen in regular backends using parallel queries because
they check for a configuration file reload at the end of each SQL
command.
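
A possible mitigation (just a sketch, mirroring the fact that regular
backends effectively reload only outside parallel mode) would be to
defer the reload while in parallel mode:

    /* Hypothetical placement in autovacuum's config-reload check: */
    if (ConfigReloadPending && !IsInParallelMode())
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);
    }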

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


