Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From: Kyotaro HORIGUCHI
Subject: Re: [HACKERS] Block level parallel vacuum
Msg-id: 20190319.191449.04094806.horiguchi.kyotaro@lab.ntt.co.jp
In response to: Re: [HACKERS] Block level parallel vacuum (Masahiko Sawada <sawada.mshk@gmail.com>)
List: pgsql-hackers
At Tue, 19 Mar 2019 19:01:06 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in
<CAD21AoA3PpkcNNzcQmiNgFL3DudhdLRWoTvQE6=kRagFLjUiBg@mail.gmail.com>
> On Tue, Mar 19, 2019 at 4:59 PM Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> >
> > At Tue, 19 Mar 2019 13:31:04 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in
> > <CAD21AoD4ivrYqg5tau460zEEcgR0t9cV-UagjJ997OfvP3gsNQ@mail.gmail.com>
> > > > For indexes=4,8,16, the cases with parallel_degree=4,8,16 behave
> > > > almost the same. I suspect that the indexes are too small, all
> > > > the index pages were in memory, and the CPU was saturated. Maybe
> > > > you had four cores, and parallel workers beyond that number had
> > > > no effect. Other normal backends would have been able to do
> > > > almost nothing meanwhile. Usually the number of parallel workers
> > > > is determined so that IO capacity is filled up, but this feature
> > > > intermittently saturates CPU capacity under such a situation.
> > > >
> > >
> > > I'm sorry I didn't make it clear enough. If the parallel degree is
> > > higher than 'the number of indexes - 1', redundant workers are not
> > > launched. So for indexes=4, 8, 16 the number of actually launched
> > > parallel workers is at most 3, 7, 15 respectively. That's why the
> > > result shows almost the same execution time in the cases where
> > > nindexes <= parallel_degree.
> >
> > In the 16-indexes case, the performance saturated at 4 workers,
> > which contradicts your explanation.
> 
> Because the machine I used has 4 cores, the performance doesn't
> improve even if more than 4 parallel workers are launched.

That is what I meant in the passage quoted above. Sorry if the
phrasing was hard to read.

> >
> > > I'll share the performance test results for larger tables and indexes.
> > >
> > > > I'm not sure, but what if we did index vacuum in a
> > > > one-tuple-by-one manner? That is, heap vacuum passes dead tuples
> > > > one by one (or buffers a few tuples) to workers, and workers
> > > > process them not by bulkdelete but by tuple_delete (which we
> > > > don't have). That could avoid the sleep time of the heap scan
> > > > while index bulkdelete runs.
> > > >
> > >
> > > Just to be clear, in parallel lazy vacuum all parallel vacuum
> > > processes including the leader process do index vacuuming; no
> > > process sleeps during index vacuuming. The leader process does the
> > > heap scan and launches parallel workers before index vacuuming.
> > > Each process exclusively processes indexes one by one.
> >
> > The leader doesn't continue the heap scan while index vacuuming is
> > running. And the index-page scan seems to eat up CPU easily. If
> > index vacuum could run simultaneously with the next heap scan
> > phase, we could make the index scan finish at almost the same time
> > as the next round of heap scan. That would reduce the (possible)
> > CPU contention. But this requires twice as much shared memory as
> > the current implementation.
> 
> Yeah, I've considered something like a pipelining approach, where
> one process continues to queue the dead tuples and another process
> fetches and processes them during index vacuuming, but the current
> version of the patch employs the simplest approach as a first step.
> Once we have the retail index deletion approach we might be able to
> use it for parallel vacuum.

Ok, I understood the direction.

...
> > > Sorry, I didn't get your comment. You meant to move nprocessed to
> > > LVParallelState?
> >
> > Exactly. I meant letting lvshared point to private memory, but
> > it might introduce confusion.
> 
> Hmm, I'm not sure it would be a good idea. It would introduce
> confusion as you mentioned. And since 'nprocessed' has to be
> pg_atomic_uint32 in parallel mode, we would end up with yet
> another branch.

Ok. Agreed. Thank you for your patience.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


