Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CA+fd4k4UQ1cp4G5rhzdvU6=dcMYrSg2VVekkxh5K9Unudp4sXg@mail.gmail.com
In response to Re: [HACKERS] Block level parallel vacuum  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
Re: [HACKERS] Block level parallel vacuum  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Wed, 13 Nov 2019 at 12:43, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Nov 12, 2019 at 5:31 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Tue, 12 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Nov 12, 2019 at 4:04 PM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > I realized that v31-0006 patch doesn't work fine so I've attached the
> > > > > > updated version patch that also incorporated some comments I got so
> > > > > > far. Sorry for the inconvenience. I'll apply your 0001 patch and also
> > > > > > test the total delay time.
> > > > > >
> > > > > While reviewing 0002, I have a doubt about how we are
> > > > > dividing the maintenance_work_mem:
> > > > >
> > > > > +prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
> > > > > +{
> > > > > + /* Compute the new maintenance_work_mem value for index vacuuming */
> > > > > + lvshared->maintenance_work_mem_worker =
> > > > > +     (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
> > > > > +     maintenance_work_mem;
> > > > > +}
> > > > > Is it fair to just consider the number of indexes that use
> > > > > maintenance_work_mem, or do we need to consider the number of
> > > > > workers as well?  My point is: suppose there are 10 indexes
> > > > > that will use the maintenance_work_mem but we are launching
> > > > > just 2 workers; then what is the point in dividing the
> > > > > maintenance_work_mem by 10?
> > > > >
> > > > > IMHO the calculation should be like this:
> > > > > lvshared->maintenance_work_mem_worker =
> > > > >     (nindexes_mwm > 0) ?
> > > > >     maintenance_work_mem / Min(nindexes_mwm, nworkers) :
> > > > >     maintenance_work_mem;
> > > > >
> > > > > Am I missing something?
> > > >
> > > > No, I think you're right. On the other hand, I think that
> > > > dividing it by the number of indexes that will use the
> > > > maintenance_work_mem makes sense when the parallel degree > the
> > > > number of such indexes. Suppose the table has 2 indexes and
> > > > there are 10 workers; then we should divide the
> > > > maintenance_work_mem by 2 rather than 10, because at most 2
> > > > indexes that use the maintenance_work_mem can be processed in
> > > > parallel at a time.
> > > >
> > > Right, that's the reason I suggested dividing by Min(nindexes_mwm, nworkers).
> >
> > Thanks! I'll fix it in the next version patch.
> >
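
Just to make the fix concrete, here is a minimal sketch of the revised
computation (it follows the names in the v31 patch; "nworkers" stands
for the number of workers actually launched, so the exact variable
name may differ in the next version). For example, with
maintenance_work_mem = 1GB, 10 such indexes, and 2 workers, each
worker would get 512MB instead of about 100MB:

    /*
     * Sketch only: divide maintenance_work_mem by the number of
     * indexes that use it, capped by the number of launched workers,
     * so that launching few workers doesn't leave memory unused.
     */
    lvshared->maintenance_work_mem_worker =
        (nindexes_mwm > 0) ?
        maintenance_work_mem / Min(nindexes_mwm, nworkers) :
        maintenance_work_mem;
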
> One more comment.
>
> +lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
> +                    int nindexes, IndexBulkDeleteResult **stats,
> +                    LVParallelState *lps)
> +{
> +    ....
> +
> +    if (ParallelVacuumIsActive(lps))
> +    {
> +        lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel,
> +                                                nindexes, stats, lps);
> +    }
> +
> +    for (idx = 0; idx < nindexes; idx++)
> +    {
> +        /*
> +         * Skip indexes that we have already vacuumed during parallel
> +         * index vacuuming.
> +         */
> +        if (ParallelVacuumIsActive(lps) &&
> +            !IndStatsIsNull(lps->lvshared, idx))
> +            continue;
> +
> +        lazy_vacuum_index(Irel[idx], &stats[idx],
> +                          vacrelstats->dead_tuples,
> +                          vacrelstats->old_live_tuples);
> +    }
> +}
>
> In this function, if ParallelVacuumIsActive, we perform the parallel
> vacuum for all the indexes for which parallel vacuum is supported,
> and once that is over we vacuum the remaining indexes for which
> parallel vacuum is not supported.  But my question is: inside
> lazy_parallel_vacuum_or_cleanup_indexes, we wait for all the workers
> to finish their job and only then start the sequential vacuuming.
> Shouldn't we start that immediately, as soon as the leader's
> participation in the parallel vacuum is over?

If we do that, then while the leader process is sequentially vacuuming
the indexes that don't support parallel vacuum, some workers might
still be vacuuming other indexes. Isn't that a problem? If it's not a
problem, I think we can tie the indexes that don't support parallel
vacuum to the leader and do the parallel index vacuum that way.
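
To make that alternative concrete, here is a rough sketch of what
tying those indexes to the leader could look like (illustrative only:
LaunchParallelWorkers and WaitForParallelWorkersToFinish are the
existing parallel-context APIs, but the pcxt field name is my
assumption about the patch's LVParallelState):

    if (ParallelVacuumIsActive(lps))
    {
        /* start workers on the indexes that support parallel vacuum */
        LaunchParallelWorkers(lps->pcxt);

        /*
         * Meanwhile the leader vacuums the indexes whose shared stats
         * slot is NULL, i.e. the ones not handled by the workers.
         */
        for (idx = 0; idx < nindexes; idx++)
        {
            if (IndStatsIsNull(lps->lvshared, idx))
                lazy_vacuum_index(Irel[idx], &stats[idx],
                                  vacrelstats->dead_tuples,
                                  vacrelstats->old_live_tuples);
        }

        /* finally, wait for all the workers to finish */
        WaitForParallelWorkersToFinish(lps->pcxt);
    }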

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


