Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: [HACKERS] Block level parallel vacuum |
Date | |
Msg-id | 20191227022401.ikepifssdes6apaw@development Whole thread Raw |
In response to | Re: [HACKERS] Block level parallel vacuum (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>) |
Responses |
Re: [HACKERS] Block level parallel vacuum
|
List | pgsql-hackers |
Hi, On Wed, Dec 25, 2019 at 09:17:16PM +0900, Masahiko Sawada wrote: >On Tue, 24 Dec 2019 at 15:46, Masahiko Sawada ><masahiko.sawada@2ndquadrant.com> wrote: >> >> On Tue, 24 Dec 2019 at 15:44, Amit Kapila <amit.kapila16@gmail.com> wrote: >> > >> > On Tue, Dec 24, 2019 at 12:08 PM Masahiko Sawada >> > <masahiko.sawada@2ndquadrant.com> wrote: >> > > >> > > >> > > The first patches look good to me. I'm reviewing other patches and >> > > will post comments if there is. >> > > >> >> Oops I meant first "two" patches look good to me. >> >> > >> > Okay, feel free to address few comments raised by Mahendra along with >> > whatever you find. >> >> Thanks! >> > >I've attached updated patch set as the previous version patch set >conflicts to the current HEAD. This patch set incorporated the review >comments, a few fix and the patch for >PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION. 0001 patch is the same >as previous version. > I've been reviewing the updated patches over the past couple of days, so let me share some initial review comments. I initially started to read the thread, but then I realized it's futile - the thread is massive, and the patch changed so much re-reading the whole thread is a waste of time. It might be useful write a summary of the current design, but AFAICS the original plan to parallelize the heap scan is abandoned and we now do just the steps that vacuum indexes in parallel. Which is fine, but it means the subject "block level parallel vacuum" is a bit misleading. Anyway, most of the logic is implemented in part 0002, which actually does all the parallel worker stuff. The remaining parts 0001, 0003 and 0004 are either preparing infrastructure or not directlyrelated to the primary feature. v40-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch ----------------------------------------------------------- I wonder if 'amparallelvacuumoptions' is unnecessarily specific. Maybe it should be called just 'amvacuumoptions' or something like that? The 'parallel' part is actually encoded in names of the options. Also, why do we need a separate amusemaintenanceworkmem option? Why don't we simply track it using a separate flag in 'amvacuumoptions' (or whatever we end up calling it)? Would it make sense to track m_w_m usage separately for the two index cleanup phases? Or is that unnecessary / pointless? v40-0002-Add-a-parallel-option-to-the-VACUUM-command.patch ---------------------------------------------------------- I haven't found any issues yet, but I've only started with the code review. I'll continue with the review. It seems in a fairly good shape though, I think, I only have two minor comments at the moment: - The SizeOfLVDeadTuples macro seems rather weird. It does include space for one ItemPointerData, but we really need an array of them. But then all the places where the macro is used explicitly add space for the pointers, so the sizeof(ItemPointerData) seems unnecessary. So it should be either #define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs)) or #define SizeOfLVDeadTuples(cnt) \ (offsetof(LVDeadTuples, itemptrs) + (cnt) * sizeof(ItemPointerData)) in which case the callers can be simplified. - It's not quite clear to me why we need the new nworkers_to_launch field in ParallelContext. v40-0003-Add-FAST-option-to-vacuum-command.patch ------------------------------------------------ I do have a bit of an issue with this part - I'm not quite convinved we actually need a FAST option, and I actually suspect we'll come to regret it sooner than later. AFAIK it pretty much does exactly the same thing as setting vacuum_cost_delay to 0, and IMO it's confusing to provide multiple ways to do the same thing - I do expect reports from confused users on pgsql-bugs etc. Why is setting vacuum_cost_delay directly not a sufficient solution? The same thing applies to the PARALLEL flag, added in 0002, BTW. Why do we need a separate VACUUM option, instead of just using the existing max_parallel_maintenance_workers GUC? It's good enough for CREATE INDEX so why not here? Maybe it's explained somewhere deep in the thread, of course ... v40-0004-Add-ability-to-disable-leader-participation-in-p.patch --------------------------------------------------------------- IMHO this should be simply merged into 0002. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: