Re: POC: Parallel processing of indexes in autovacuum - Mailing list pgsql-hackers
From | Daniil Davydov |
---|---|
Subject | Re: POC: Parallel processing of indexes in autovacuum |
Date | |
Msg-id | CAJDiXgin1TXniVGJKzOTA=F9K342uVfm6O0EmubTVB=F+XSrbA@mail.gmail.com Whole thread Raw |
In response to | Re: POC: Parallel processing of indexes in autovacuum (Sami Imseih <samimseih@gmail.com>) |
List | pgsql-hackers |
Hi, On Mon, Jul 21, 2025 at 11:40 PM Sami Imseih <samimseih@gmail.com> wrote: > > I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and > have a few comments from my initial pass. > > 1/ Please run pgindent. OK, I'll do it. > 2/ Documentation is missing. There may be more, but here are the places I > found that likely need updates for the new behavior, reloptions, GUC, etc. > Including docs in the patch early would help clarify expected behavior. > > https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS > https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM > https://www.postgresql.org/docs/current/runtime-config-autovacuum.html > https://www.postgresql.org/docs/current/sql-createtable.html > https://www.postgresql.org/docs/current/sql-altertable.html > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS > Thanks for gathering it all together. I'll update the documentation so it will reflect changes in autovacuum daemon, reloptions and GUC parameters. So far, I don't see what we can add to vacuum-basics and alter-table paragraphs. I'll create separate .patch file for changes in documentation. > One thing I am unclear on is the interaction between max_worker_processes, > max_parallel_workers, and max_parallel_maintenance_workers. For example, does > the following change mean that manual VACUUM PARALLEL is no longer capped by > max_parallel_maintenance_workers? > > @@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, > int nindexes, int nrequested, > parallel_workers = (nrequested > 0) ? > Min(nrequested, nindexes_parallel) : nindexes_parallel; > > - /* Cap by max_parallel_maintenance_workers */ > - parallel_workers = Min(parallel_workers, > max_parallel_maintenance_workers); > + /* Cap by GUC variable */ > + parallel_workers = Min(parallel_workers, max_parallel_workers); > Oh, it is my poor choice of a name for a local variable (I'll rename it). This variable can get different values depending on performed operation : autovacuum_max_parallel_workers for parallel autovacuum and max_parallel_maintenance_workers for maintenance VACUUM. > > 3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool" ? > > + "autovacuum_parallel_workers", > + "Maximum number of parallel autovacuum workers > that can be taken from bgworkers pool for processing this table. " > I don't think that we should refer to max_parallel_workers here. Actually, this reloption doesn't depend on max_parallel_workers at all. I wrote about bgworkers pool (both here and in description of autovacuum_max_parallel_workers parameter) in order to clarify that parallel autovacuum will use dynamic workers instead of launching more a/v workers. BTW, I don't really like that the comment on this option turns out to be very large. I'll leave only short description in reloptions.c and move clarification about zero value in rel.h Mentions of bgworkers pool will remain only in description of autovacuum_max_parallel_workers. > 4/ The comment "When parallel autovacuum worker die" suggests an abnormal > exit. "Terminates" seems clearer, since this applies to both normal and > abnormal exits. > > instead of: > + * When parallel autovacuum worker die, > > how about this: > * When parallel autovacuum worker terminates, > Sounds reasonable, I'll fix it. > > 5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before > DestroyParallelContext? > > + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember > this value */ > DestroyParallelContext(pvs->pcxt); > + > + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ > + if (AmAutoVacuumWorkerProcess()) > + AutoVacuumReleaseParallelWorkers(nlaunched_workers); > I wrote about it above [1], but I think I can duplicate my thoughts here : """ Destroying parallel context includes waiting for all workers to exit (after which, other operations can use them). If we first call ParallelAutoVacuumReleaseWorkers, some operation can reasonably request all released workers. But this request can fail, because there is no guarantee that workers managed to finish. Actually, there's nothing wrong with that, but I think releasing workers only after finishing work is a more logical approach. """ > > 6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside > AutoVacuumReleaseParallelWorkers()? > > if (!AmAutoVacuumWorkerProcess()) > return; > It seems to me that the opposite is true. If there is no alternative to calling AmAutoVacuumWorkerProcess, it might confuse somebody. All doubts will disappear after viewing the AmAutoVacuumWorkerProcess code, but IMO code in vacuumparallel.c will become less intuitive. > 7/ It looks like the psql tab completion for autovacuum_parallel_workers is > missing: > > test=# alter table t set (autovacuum_ > autovacuum_analyze_scale_factor > autovacuum_analyze_threshold > autovacuum_enabled > autovacuum_freeze_max_age > autovacuum_freeze_min_age > autovacuum_freeze_table_age > autovacuum_multixact_freeze_max_age > autovacuum_multixact_freeze_min_age > autovacuum_multixact_freeze_table_age > autovacuum_vacuum_cost_delay > autovacuum_vacuum_cost_limit > autovacuum_vacuum_insert_scale_factor > autovacuum_vacuum_insert_threshold > autovacuum_vacuum_max_threshold > autovacuum_vacuum_scale_factor > autovacuum_vacuum_threshold > Good catch, I'll fix it. Thank you for the review! Please, see v9 patches : 1) Run pgindent + rebase patches on newest commit in master. 2) Introduce changes for documentation. 3) Rename local variable in parallel_vacuum_compute_workers. 4) Shorten the description of autovacuum_parallel_workers in reloptions.c (move clarifications for it into rel.h). 5) Reword "When parallel autovacuum worker die" comment. 6) Add tab completion for autovacuum_parallel_workers table option. [1] https://www.postgresql.org/message-id/CAJDiXgi7KB7wSQ%3DUx%3DngdaCvJnJ5x-ehvTyiuZez%2B5uKHtV6iQ%40mail.gmail.com -- Best regards, Daniil Davydov
Attachment
pgsql-hackers by date: