Re: parallel vacuum comments - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: parallel vacuum comments |
Date | |
Msg-id | 20211104190003.l5547auul46k7d5v@alap3.anarazel.de Whole thread Raw |
In response to | Re: parallel vacuum comments (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: parallel vacuum comments
|
List | pgsql-hackers |
Hi, On 2021-11-01 10:44:34 +0900, Masahiko Sawada wrote: > On Sun, Oct 31, 2021 at 6:21 AM Andres Freund <andres@anarazel.de> wrote: > > But even though we have this space optimized bitmap thing, we actually need > > more memory allocated for each index, making this whole thing pointless. > > Right. But is better to change to use booleans? It seems very clearly better to me. We shouldn't use complicated stuff like #define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8)) #define GetSharedIndStats(s) \ ((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset)) #define IndStatsIsNull(s, i) \ (!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07)))) when there's reason / benefit. > > - Imo it's pretty confusing to have functions like > > lazy_parallel_vacuum_indexes() (in 13, renamed in 14) that "Perform index > > vacuum or index cleanup with parallel workers.", based on > > lps->lvshared->for_cleanup. > > Okay. We need to set lps->lvshared->for_cleanup to tell worker do > either index vacuum or index cleanup. So it might be better to pass > for_cleanup flag down to the functions in addition to setting > lps->lvshared->for_cleanup. Yup. > > - I don't like some of the new names introduced in 14. E.g. > > "do_parallel_processing" is way too generic. > > I listed the function names that probably needs to be renamed from > that perspecti: > > * do_parallel_processing > * do_serial_processing_for_unsafe_indexes > * parallel_process_one_index > > Is there any other function that should be renamed? parallel_processing_is_safe(). I don't like that there's three different naming patterns for parallel things. There's do_parallel_*, there's parallel_, and there's (begin|end|compute)_parallel_*. > > - On a higher level, a lot of this actually doesn't seem to belong into > > vacuumlazy.c, but should be somewhere more generic. Pretty much none of this > > code is heap specific. And vacuumlazy.c is large enough without the parallel > > code. > > I don't come with an idea to make them more generic. Could you > elaborate on that? To me the the job that the parallel vacuum stuff does isn't really specific to heap. Any table AM supporting indexes is going to need to do something pretty much like it (it's calling indexam stuff). Most of the stuff in vacuumlazy.c is very heap specific - you're not going to be able to reuse lazy_scan_heap() or such. Before the parallel vacuum stuff, the index specific code in vacuumlazy.c was fairly limited - but now it's a nontrivial amount of code. Based on a quick look parallel_vacuum_main(), parallel_processing_is_safe(), parallel_stats_for_idx(), end_parallel_vacuum(), begin_parallel_vacuum(), compute_parallel_vacuum_workers(), parallel_process_one_index(), do_serial_processing_for_unsafe_indexes(), do_parallel_processing(), do_parallel_vacuum_or_cleanup(), do_parallel_lazy_cleanup_all_indexes(), do_parallel_lazy_vacuum_all_indexes(), don't really belong in vacuumlazy.c. but should be in vacuum.c or a new file. Of course that requires a bit of work, because of the heavy reliance on LVRelState, but afaict there's not really an intrinsic need to use that. Greetings, Andres Freund
pgsql-hackers by date: