Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CAFiTN-v1sDzjWe9UuJd9ELwSzGWo098c5vK9EUXj6CoubK+jRA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, Jan 2, 2020 at 9:03 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jan 2, 2020 at 8:29 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Tue, 31 Dec 2019 at 12:39, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Dec 30, 2019 at 6:46 PM Tomas Vondra
> > > <tomas.vondra@2ndquadrant.com> wrote:
> > > >
> > > > On Mon, Dec 30, 2019 at 08:25:28AM +0530, Amit Kapila wrote:
> > > > >On Mon, Dec 30, 2019 at 2:53 AM Tomas Vondra
> > > > ><tomas.vondra@2ndquadrant.com> wrote:
> > > > >> I think there's another question we need to ask - why to we introduce a
> > > > >> bitmask, instead of using regular boolean struct members? Until now, the
> > > > >> IndexAmRoutine struct had simple boolean members describing capabilities
> > > > >> of the AM implementation. Why shouldn't this patch do the same thing,
> > > > >> i.e. add one boolean flag for each AM feature?
> > > > >>
> > > > >
> > > > >This structure member describes mostly one property of index which is
> > > > >about a parallel vacuum which I am not sure is true for other members.
> > > > >Now, we can use separate bool variables for it which we were initially
> > > > >using in the patch but that seems to be taking more space in a
> > > > >structure without any advantage.  Also, using one variable makes a
> > > > >code bit better because otherwise, in many places we need to check and
> > > > >set four variables instead of one.  This is also the reason we used
> > > > >parallel in its name (we also use *parallel* for parallel index scan
> > > > >related things).  Having said that, we can remove parallel from its
> > > > >name if we want to extend/use it for something other than a parallel
> > > > >vacuum.  I think we might need to add a flag or two for parallelizing
> > > > >heap scan of vacuum when we enhance this feature, so keeping it for
> > > > >just a parallel vacuum is not completely insane.
> > > > >
> > > > >I think keeping amusemaintenanceworkmem separate from this variable
> > > > >seems to me like a better idea as it doesn't describe whether IndexAM
> > > > >can participate in a parallel vacuum or not.  You can see more
> > > > >discussion about that variable in the thread [1].
> > > > >
> > > >
> > > > I don't know, but IMHO it's somewhat easier to work with separate flags.
> > > > Bitmasks make sense when space usage matters a lot, e.g. for on-disk
> > > > representation, but that doesn't seem to be the case here I think (if it
> > > > was, we'd probably use bitmasks already).
> > > >
> > > > It seems like we're mixing two ways to design the struct unnecessarily,
> > > > but I'm not going to nag about this any further.
> > > >
> > >
> > > Fair enough.  I see your point and as mentioned earlier that we
> > > started with the approach of separate booleans, but later found that
> > > this is a better way as it was easier to set and check the different
> > > parallel options for a parallel vacuum.   I think we can go back to
> > > the individual booleans if we want but I am not sure if that is a
> > > better approach for this usage.  Sawada-San, others, do you have any
> > > opinion here?
> >
> > If we go back to the individual booleans we would end up with having
> > three booleans: bulkdelete, cleanup and conditional cleanup. I think
> > making the bulkdelete option to a boolean makes sense but having two
> > booleans for cleanup and conditional cleanup might be slightly odd
> > because these options are exclusive.
> >
>
> If we have only three booleans, then we need to check for all three to
> conclude that a parallel vacuum is not enabled for any index.
> Alternatively, we can have a fourth boolean to indicate that a
> parallel vacuum is not enabled.  And in the future, when we allow
> supporting multiple workers for an index, we might need another
> variable unless we can allow it for all types of indexes.  This was my
> point that having multiple variables for the purpose of a parallel
> vacuum (for indexes) doesn't sound like a better approach than having
> a single uint8 variable.
>
> > If we don't stick to have only
> > booleans the having a ternary value for cleanup might be
> > understandable but I'm not sure it's better to have it for only vacuum
> > purpose.
> >
>
> If we want to keep the possibility of extending it for other purposes,
> then we can probably rename it to amoptions or something like that.
> What do you think?
I think it makes more sense to just keep it for the purpose of
enabling/disabling parallelism in different phases.  I am not sure
that adding more options (which are not related to enabling
parallelism in vacuum phases) to the same variable will make sense.
So I think the current name is good for its purpose.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [PATCH] Fix parallel query doc typos
Next
From: Alvaro Herrera
Date:
Subject: Re: TRUNCATE on foreign tables