Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: [HACKERS] Block level parallel vacuum |
Date | |
Msg-id | CAFiTN-v1sDzjWe9UuJd9ELwSzGWo098c5vK9EUXj6CoubK+jRA@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Block level parallel vacuum (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On Thu, Jan 2, 2020 at 9:03 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Jan 2, 2020 at 8:29 AM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > On Tue, 31 Dec 2019 at 12:39, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Dec 30, 2019 at 6:46 PM Tomas Vondra > > > <tomas.vondra@2ndquadrant.com> wrote: > > > > > > > > On Mon, Dec 30, 2019 at 08:25:28AM +0530, Amit Kapila wrote: > > > > >On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra > > > > ><tomas.vondra@2ndquadrant.com> wrote: > > > > >> I think there's another question we need to ask - why to we introduce a > > > > >> bitmask, instead of using regular boolean struct members? Until now, the > > > > >> IndexAmRoutine struct had simple boolean members describing capabilities > > > > >> of the AM implementation. Why shouldn't this patch do the same thing, > > > > >> i.e. add one boolean flag for each AM feature? > > > > >> > > > > > > > > > >This structure member describes mostly one property of index which is > > > > >about a parallel vacuum which I am not sure is true for other members. > > > > >Now, we can use separate bool variables for it which we were initially > > > > >using in the patch but that seems to be taking more space in a > > > > >structure without any advantage. Also, using one variable makes a > > > > >code bit better because otherwise, in many places we need to check and > > > > >set four variables instead of one. This is also the reason we used > > > > >parallel in its name (we also use *parallel* for parallel index scan > > > > >related things). Having said that, we can remove parallel from its > > > > >name if we want to extend/use it for something other than a parallel > > > > >vacuum. I think we might need to add a flag or two for parallelizing > > > > >heap scan of vacuum when we enhance this feature, so keeping it for > > > > >just a parallel vacuum is not completely insane. > > > > > > > > > >I think keeping amusemaintenanceworkmem separate from this variable > > > > >seems to me like a better idea as it doesn't describe whether IndexAM > > > > >can participate in a parallel vacuum or not. You can see more > > > > >discussion about that variable in the thread [1]. > > > > > > > > > > > > > I don't know, but IMHO it's somewhat easier to work with separate flags. > > > > Bitmasks make sense when space usage matters a lot, e.g. for on-disk > > > > representation, but that doesn't seem to be the case here I think (if it > > > > was, we'd probably use bitmasks already). > > > > > > > > It seems like we're mixing two ways to design the struct unnecessarily, > > > > but I'm not going to nag about this any further. > > > > > > > > > > Fair enough. I see your point and as mentioned earlier that we > > > started with the approach of separate booleans, but later found that > > > this is a better way as it was easier to set and check the different > > > parallel options for a parallel vacuum. I think we can go back to > > > the individual booleans if we want but I am not sure if that is a > > > better approach for this usage. Sawada-San, others, do you have any > > > opinion here? > > > > If we go back to the individual booleans we would end up with having > > three booleans: bulkdelete, cleanup and conditional cleanup. I think > > making the bulkdelete option to a boolean makes sense but having two > > booleans for cleanup and conditional cleanup might be slightly odd > > because these options are exclusive. > > > > If we have only three booleans, then we need to check for all three to > conclude that a parallel vacuum is not enabled for any index. > Alternatively, we can have a fourth boolean to indicate that a > parallel vacuum is not enabled. And in the future, when we allow > supporting multiple workers for an index, we might need another > variable unless we can allow it for all types of indexes. This was my > point that having multiple variables for the purpose of a parallel > vacuum (for indexes) doesn't sound like a better approach than having > a single uint8 variable. > > > If we don't stick to have only > > booleans the having a ternary value for cleanup might be > > understandable but I'm not sure it's better to have it for only vacuum > > purpose. > > > > If we want to keep the possibility of extending it for other purposes, > then we can probably rename it to amoptions or something like that. > What do you think? I think it makes more sense to just keep it for the purpose of enabling/disabling parallelism in different phases. I am not sure that adding more options (which are not related to enabling parallelism in vacuum phases) to the same variable will make sense. So I think the current name is good for its purpose. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: