Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: [HACKERS] Block level parallel vacuum |
Date | |
Msg-id | CAFiTN-v2eG89rp7zmUD+=gaRWqV9Jn48KNR1tkAUzJE=7RWx+Q@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Block level parallel vacuum (Dilip Kumar <dilipbalaut@gmail.com>) |
List | pgsql-hackers |
On Fri, Jan 17, 2020 at 9:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Jan 16, 2020 at 5:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Jan 16, 2020 at 4:46 PM Masahiko Sawada > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > Right. Most indexes (all?) of tables that are used in the regression > > > tests are smaller than min_parallel_index_scan_size. And we set > > > min_parallel_index_scan_size to 0 in vacuum.sql but VACUUM would not > > > be speeded-up much because of the relation size. Since we instead > > > populate new table for parallel vacuum testing the regression test for > > > vacuum would take a longer time. > > > > > > > Fair enough and I think it is good in a way that it won't change the > > coverage of existing vacuum code. I have fixed all the issues > > reported by Mahendra and have fixed a few other cosmetic things in the > > attached patch. > > > I have few small comments. > > 1. > logical streaming for large in-progress transactions+ > + /* Can't perform vacuum in parallel */ > + if (parallel_workers <= 0) > + { > + pfree(can_parallel_vacuum); > + return lps; > + } > > why are we checking parallel_workers <= 0, Function > compute_parallel_vacuum_workers only returns 0 or greater than 0 > so isn't it better to just check if (parallel_workers == 0) ? > > 2. > +/* > + * Macro to check if we are in a parallel vacuum. If true, we are in the > + * parallel mode and the DSM segment is initialized. > + */ > +#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL) > > (LVParallelState *) (lps) -> this typecast is not required, just (lps) > != NULL should be enough. > > 3. > > + shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes))); > + prepare_index_statistics(shared, can_parallel_vacuum, nindexes); > + pg_atomic_init_u32(&(shared->idx), 0); > + pg_atomic_init_u32(&(shared->cost_balance), 0); > + pg_atomic_init_u32(&(shared->active_nworkers), 0); > > I think it will look cleaner if we can initialize in the order they > are declared in structure. > > 4. > + VacuumSharedCostBalance = &(lps->lvshared->cost_balance); > + VacuumActiveNWorkers = &(lps->lvshared->active_nworkers); > + > + /* > + * Set up shared cost balance and the number of active workers for > + * vacuum delay. > + */ > + pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance); > + pg_atomic_write_u32(VacuumActiveNWorkers, 0); > + > + /* > + * The number of workers can vary between bulkdelete and cleanup > + * phase. > + */ > + ReinitializeParallelWorkers(lps->pcxt, nworkers); > + > + LaunchParallelWorkers(lps->pcxt); > + > + if (lps->pcxt->nworkers_launched > 0) > + { > + /* > + * Reset the local cost values for leader backend as we have > + * already accumulated the remaining balance of heap. > + */ > + VacuumCostBalance = 0; > + VacuumCostBalanceLocal = 0; > + } > + else > + { > + /* > + * Disable shared cost balance if we are not able to launch > + * workers. > + */ > + VacuumSharedCostBalance = NULL; > + VacuumActiveNWorkers = NULL; > + } > + > > I don't like the idea of first initializing the > VacuumSharedCostBalance with lps->lvshared->cost_balance and then > uninitialize if nworkers_launched is 0. > I am not sure why do we need to initialize VacuumSharedCostBalance > here? just to perform pg_atomic_write_u32(VacuumSharedCostBalance, > VacuumCostBalance);? > I think we can initialize it only if nworkers_launched > 0 then we can > get rid of the else branch completely. I missed one of my comment + /* Carry the shared balance value to heap scan */ + if (VacuumSharedCostBalance) + VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance); + + if (nworkers > 0) + { + /* Disable shared cost balance */ + VacuumSharedCostBalance = NULL; + VacuumActiveNWorkers = NULL; + } Doesn't make sense to keep them as two conditions, we can combine them as below /* If shared costing is enable, carry the shared balance value to heap scan and disable the shared costing */ if (VacuumSharedCostBalance) { VacuumCostBalance = pg_atomic_read_u32(VacuumSharedCostBalance); VacuumSharedCostBalance = NULL; VacuumActiveNWorkers = NULL; } -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: