Re: Parallel heap vacuum - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Parallel heap vacuum |
Date | |
Msg-id | CAHut+Pt-QwmrWvoZfWpGyBYDam0ouAkRn9m1SSpJGHj-154H0w@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel heap vacuum (Masahiko Sawada <sawada.mshk@gmail.com>) |
List | pgsql-hackers |
Hi Sawada-san, Here are some review comments for the patch v16-0002 ====== Commit message 1. Missing period. /cleanup, ParallelVacuumState/cleanup. ParallelVacuumState/ ~~~ 2. Typo /paralel/parallel/ ~~~ 3. Heap table AM disables the parallel heap vacuuming for now, but an upcoming patch uses it. IIUC, this "disabling" was implemented in patch 0001, so maybe this comment also belongs in patch 0001. ====== src/backend/commands/vacuumparallel.c 4. * This file contains routines that are intended to support setting up, using, - * and tearing down a ParallelVacuumState. + * and tearing down a ParallelVacuumState. ParallelVacuumState contains shared + * information as well as the memory space for storing dead items allocated in + * the DSA area. We launch * What's the "We launch" incomplete sentence meant to say? ~~~ 5. +typedef enum +{ + PV_WORK_PHASE_PROCESS_INDEXES, /* index vacuuming or cleanup */ + PV_WORK_PHASE_COLLECT_DEAD_ITEMS, /* collect dead tuples */ +} PVWorkPhase; + AFAIK, these different "phases" are the ones already described in the vacuumlazy.c header comment -- there they are named as I, II, III. IMO it might be better to try to keep these phase enums together with the phase descriptions, as well as giving the values all names similar to those existing descriptions. Maybe something like: PHASE_I_VACUUM_RELATION_COLLECT_DEAD PHASE_II_VACUUM_INDEX_COLLECT_DEAD PHASE_III_VACUUM_REAP_DEAD ~~~ 6. -parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested, - bool *will_parallel_vacuum) +parallel_vacuum_compute_workers(Relation rel, Relation *indrels, int nindexes, + int nrequested, int *nworkers_table_p, + bool *idx_will_parallel_vacuum, void *state) I had asked this once before ([1] comment #5) IIUC the returns for this function seem inconsistent. AFAIK, it previously would return the number of workers for parallel index vacuuming. But now (after this patch) the return value is returned Max(nworkers_table, nworkers_index). Meanwhile, the number of workers for parallel table vacuuming is returned as a by-reference parameter 'nworkers_table_p'. In other words, it is returning the number of workers in 2 different ways. It would seem tidier to make this a void function, and introduce another parameter 'nworkers_index_p', similar to 'nworkers_table_p'. The caller can do the parallel_workers = Max(nworkers_table_p, nworkers_index_p); ~~~ parallel_vacuum_process_all_indexes 7. * to finish, or we might get incomplete data.) */ if (nworkers > 0) - { - /* Wait for all vacuum workers to finish */ - WaitForParallelWorkersToFinish(pvs->pcxt); - - for (int i = 0; i < pvs->pcxt->nworkers_launched; i++) - InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]); - } + parallel_vacuum_end_worke_phase(pvs); 7a. The code comment that precedes this was describing code that is now all removed (moved into the parallel_vacuum_end_worke_phase), so probably that comment needs to be changed. ~ 7b. Typo: parallel_vacuum_end_worke_phase (worke?) ~~~ parallel_vacuum_collect_dead_items_end: 8. + /* Wait for parallel workers to finish */ + parallel_vacuum_end_worke_phase(pvs); typo (worke) ~~~ 9. + /* Decrement the worker count for the leader itself */ + if (VacuumActiveNWorkers) + pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1); Is this OK? Or did the prior parallel_vacuum_end_worke_phase call already set VacuumActiveNWorkers to NULL. ~~~ parallel_vacuum_process_table: 10. + /* + * We have completed the table vacuum so decrement the active worker + * count. + */ + pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1); Is this OK? (Similar question to the previous review comment #9) Can the table_parallel_vacuum_collect_dead_items end up calling parallel_vacuum_end_worke_phase, so by time we get here the VacuumActiveNWorkers might be NULL? ~~~ parallel_vacuum_begin_work_phase: 11. +/* + * Launch parallel vacuum workers for the given phase. If at least one + * worker launched, enable the shared vacuum delay costing. + */ Maybe this comment should also say a call to this needs to be balanced by a call to the other function: parallel_vacuum_end_work_phase. ~~~ 12. +static void +parallel_vacuum_end_worke_phase(ParallelVacuumState *pvs) Typo (worke) ====== [1] https://www.postgresql.org/message-id/CAHut%2BPs9yTGk2TWdjgLQEzGfPTWafKT0H-Q6WvYh5UKNW0pCvQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: