Re: Parallel heap vacuum - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Parallel heap vacuum |
Date | |
Msg-id | CAD21AoAtw4DUrxz=E1WWZ7gbc=8rMY_nxZ7XxV90xmZDaK8Fig@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel heap vacuum (Tomas Vondra <tomas@vondra.me>) |
List | pgsql-hackers |
On Mon, Dec 9, 2024 at 2:11 PM Tomas Vondra <tomas@vondra.me> wrote: > > Hi, > > Thanks for working on this. I took a quick look at this today, to do > some basic review. I plan to do a bunch of testing, but that's mostly to > get a better idea of what kind of improvements to expect - the initial > results look quite nice and sensible. Thank you for reviewing the patch! > A couple basic comments: > > 1) I really like the idea of introducing "compute_workers" callback to > the heap AM interface. I faced a similar issue with calculating workers > for index builds, because right now plan_create_index_workers is doing > that the logic works for btree, but really not brin etc. It didn't occur > to me we might make this part of the index AM ... Thanks. > > 2) I find it a bit weird vacuumlazy.c needs to include optimizer/paths.h > because it really has nothing to do with planning / paths. I realize it > needs the min_parallel_table_scan_size, but it doesn't seem right. I > guess it's a sign this bit of code (calculating parallel workers based > on log of relation size) should in some "shared" location. True. The same is actually true also for vacuumparallel.c. It includes optimizer/paths.h to use min_parallel_index_scan_size. > > 3) The difference in naming ParallelVacuumState vs. PHVState is a bit > weird. I suggest ParallelIndexVacuumState and ParallelHeapVacuumState to > make it consistent and clear. With the patch, since ParallelVacuumState is no longer dedicated for parallel index vacuuming we cannot rename them in this way. Both parallel table scanning/vacuuming and parallel index vacuuming can use the same ParallelVacuumState instance. The heap-specific necessary data for parallel heap scanning and vacuuming are stored in PHVState. > > 4) I think it would be good to have some sort of README explaining how > the parallel heap vacuum works, i.e. how it's driven by FSM. Took me a > while to realize how the workers coordinate which blocks to scan. +1. I will add README in the next version patch. > > 5) Wouldn't it be better to introduce the scan_stats (grouping some of > the fields in a separate patch)? Seems entirely independent from the > parallel part, so doing it separately would make it easier to review. > Also, maybe reference the fields through scan_stats->x, instead of > through vacrel->scan_stats->x, when there's the pointer. Agreed. > 6) Is it a good idea to move NewRelfrozenXID/... to the scan_stats? > AFAIK it's not a statistic, it's actually a parameter affecting the > decisions, right? Right. It would be better to move them to a separate struct or somewhere. > > 7) I find it a bit strange that heap_vac_scan_next_block() needs to > check if it's a parallel scan, and redirect to the parallel callback. I > mean, shouldn't the caller know which callback to invoke? Why should the > serial callback care about this? do_lazy_scan_heap(), the sole caller of heap_vac_scan_next_block(), is called in serial vacuum and parallel vacuum cases. I wanted to make heap_vac_scan_next_block() workable in both cases. I think it also makes sense to have do_lazy_scan_heap() calls either function depending on parallel scan being enabled. > > 8) It's not clear to me why do_lazy_scan_heap() needs to "advertise" the > current block. Can you explain? The workers' current block numbers are used to calculate the minimum block number where we've scanned so far. In serial scan case, we vacuum FSM of the particular block range for every VACUUM_FSM_EVERY_PAGES pages . On the other hand, in parallel scan case, it doesn't make sense to vacuum FSM in that way because we might not have processed some blocks in the block range. So the idea is to calculate the minimum block number where we've scanned so far and vacuum FSM of the range of consecutive already-scanned blocks. > > 9) I'm a bit confused why the code checks IsParallelWorker() in so many > places. Doesn't that mean the leader can't participate in the vacuum > like a regular worker? I used '!isParallelWorker()' for some jobs that should be done only by the leader process. For example, checking failsafe mode, vacuuming FSM etc. > > 10) I'm not quite sure I understand the comments at the end of > do_lazy_scan_heap - it says "do a cycle of vacuuming" but I guess that > means "index vacuuming", right? It means both index vacuuming and heap vacuuming. > And then it says "pause without invoking > index and heap vacuuming" but isn't the whole point of this block to do > that cleanup so that the TidStore can be discarded? Maybe I just don't > understand how the work is divided between the leader and workers ... The comment needs to be updated. But what the patch does is that when the memory usage of the shared TidStore reaches the limit, worker processes exit after updating the shared statistics, and then the leader invokes (parallel) index vacuuming and parallel heap vacuuming. Since the different number workers could be used for parallel heap scan, parallel index vacuuming, and parallel heap vacuuming, the leader process waits for all workers to finish at end of each phase. > 11) Why does GlobalVisState need to move to snapmgr.h? If I undo this > the patch still builds fine for me. Oh, I might have missed something. I'll check if it's really necessary. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: