Re: POC: Parallel processing of indexes in autovacuum - Mailing list pgsql-hackers
| From | Masahiko Sawada |
|---|---|
| Subject | Re: POC: Parallel processing of indexes in autovacuum |
| Date | |
| Msg-id | CAD21AoAvZc6Rwi1hZ7x+U3vz7AMMSpcbQ2JBn6+WmQp-3yfKMg@mail.gmail.com |
| In response to | Re: POC: Parallel processing of indexes in autovacuum (Daniil Davydov <3danissimo@gmail.com>) |
| Responses | Re: POC: Parallel processing of indexes in autovacuum |
| List | pgsql-hackers |
On Tue, Mar 31, 2026 at 7:18 AM Daniil Davydov <3danissimo@gmail.com> wrote:
>
> Hi,
>
> On Tue, Mar 31, 2026 at 2:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've made some changes to the documentation part, merged two patches
> > into one, and updated the commit message. Please review the attached
> > patch.
> >
>
> Great, thank you very much!
>
> Again, I don't know how to write the documentation well, so you can ignore
> my comments :
>
> > + <command>VACUUM</command> can perform index vacuuming and index cleanup
> Don't we need to mention autovacuum here too? I thought that VACUUM in the
> context means "manual VACUUM command".

I think that the documentation explains that the autovacuum daemon is
a worker automatically executing VACUUM and ANALYZE commands.

>
> > + ...applies specifically to the index vacuuming and index cleanup phases...
> Maybe we can refer to "vacuum-phases" here?

Agreed.

>
> All other changes look good to me.
>
> !!!
> > Searching for arguments in
> > favor of opt-in style, I asked for help from another person who has been
> > managing the setup of highload systems for decades. He promised to share his
> > opinion next week.
>
> I talked to Anton Doroshkevich today.

Thank you for sharing!

> He confirmed that as a rule there are *hundreds of thousands* of tables in the
> system, the vast majority of which do not need to be vacuumed in parallel mode.

I'm still struggling to see the technical justification; why would a
user want to avoid parallel vacuuming on eligible tables if they have
already explicitly allowed the system to use more resources by setting
autovacuum_max_parallel_workers to >0? If resource contention occurs,
it is typically a sign that the global parameters need re-tuning. As I
mentioned, the same contention can occur even with an opt-in style if
multiple tables are manually configured.

Also, I'm concerned that opt-in style could confuse users since
parallel vacuum is enabled by default in VACUUM command.
> He also suggested the following : let the reloption overlap the value of the
> GUC parameter. I.e. even if av_max_parallel_workers parameters is 0 the user
> still can set the av_parallel_workers to 10 for some table, and autovacuum
> will process this table in parallel.
>
> I remember that you want to use the GUC parameter as a global switch, and this
> approach will break this logic. But according to Anton's words, it is okay if
> the GUC parameter cannot disable parallel a/v for all tables instantly. It will
> become an administrator's responsibility to manually turn off parallel a/v for
> several tables (again, it is completely OK). Thus, this feature can be handy
> for all use cases.

While some autovacuum parameters do override GUCs, those are typically
local to the process (like cost delay). Parallel workers, however, are
a shared system-wide resource. In a multi-tenant environment, allowing
a single table's reloption to bypass the
autovacuum_max_parallel_workers = 0 limit could lead to unexpected
exhaustion of the worker pool. I think that this GUC should act as a
reliable global switch for resource management.

> I hope it doesn't look like as an adapting to the needs of a specific user.
> A lot of super-large productions are migrating to postgres now, and I believe
> that we should ensure their comfort too.

I'm not prioritizing one specific use case over another. I believe
that there are also users who want to use parallel vacuum on hundreds
of thousands of tables. We should consider a better solution while
checking it from multiple perspectives such as the usability, the
robustness and consistency with the existing features and behaviors
etc.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
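
The two behaviors debated in this message can be compared with a small model. The following is only an illustrative sketch in Python, not code from the patch or from PostgreSQL: the `Table` class, both function names, and the `reloption_workers` field are hypothetical, and the assumption that an unset table falls back to the global value is mine. It contrasts a GUC that acts as a hard ceiling with a per-table value that is allowed to bypass it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Table:
    name: str
    # Hypothetical per-table setting; None means the table did not set it.
    reloption_workers: Optional[int] = None


def workers_with_global_cap(table: Table, guc_max_workers: int) -> int:
    """Model of the 'global switch' view: the GUC is a hard ceiling,
    so a value of 0 disables parallel autovacuum for every table."""
    if guc_max_workers <= 0:
        return 0
    requested = table.reloption_workers
    if requested is None:
        # Assumption: unset tables are eligible and use the global value.
        requested = guc_max_workers
    return max(0, min(requested, guc_max_workers))


def workers_reloption_override(table: Table, guc_max_workers: int) -> int:
    """Model of the proposed alternative: a per-table value wins even
    when the GUC is 0; unset tables still follow the GUC (assumption)."""
    if table.reloption_workers is not None:
        return max(0, table.reloption_workers)
    return max(0, guc_max_workers)


if __name__ == "__main__":
    tables = [Table("t1", 10), Table("t2", 10), Table("t3", 10), Table("t4")]
    guc = 0  # administrator tries to switch parallel autovacuum off globally
    capped = sum(workers_with_global_cap(t, guc) for t in tables)
    overridden = sum(workers_reloption_override(t, guc) for t in tables)
    print("total workers requested with global cap:", capped)
    print("total workers requested with override:", overridden)
```

In this model, setting the global value to 0 stops all parallel work only under the first function. Under the second, every table that carries its own setting still requests workers, which is the pool-exhaustion concern raised in the reply.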